Case 09 · Fintech & Payments

Lago Self-Hosted Usage-Based Billing Engine

Billing & Metering Infrastructure Engineers · getlago.com · Product Engineering · Data & Integration · Cloud & DevOps

Lago is an open-source, AI-native metering and usage-based billing platform (Ruby on Rails, AGPLv3, SOC 2 Type II). We self-hosted it on the client's cluster and wired event metering through to Stripe collection with a customer-facing usage dashboard.

The challenge

A B2B AI SaaS startup with hybrid pricing — per-seat plus per-token usage — needed accurate billing without sending raw event data through a third-party vendor, for compliance reasons. Their old invoicing couldn't track real-time consumption, so credits were calculated by hand and disputed monthly.

  • Self-hosting billing so no usage event data left the client's own infrastructure.
  • Ingesting high-volume metered events from the client's LLM gateway without losing or double-counting usage.
  • Modelling graduated, hybrid pricing and reconciling it cleanly into Stripe-collected invoices.

Our solution

We deployed Lago on the client's GCP cluster (Docker Compose / Kubernetes), defined a tokens_consumed billable metric, instrumented their LLM gateway to emit events via POST /api/v1/events, assigned a graduated plan, and wired Lago to Stripe with a customer-facing usage dashboard.
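As an illustration only, here is a minimal sketch of how a gateway could build one metered event for POST /api/v1/events using the fields transaction_id, external_subscription_id, code, and properties.value. The host name, API key, the "event" envelope, the Bearer header, and the helper names are assumptions made for this example and should be checked against the Lago API reference; this is not the client's actual instrumentation. The transaction_id is derived from the gateway's own request ID so that a retry re-sends the same ID, on the assumption that the ingestion side treats it as an idempotency key.

```python
import json
import urllib.request
import uuid

# Hypothetical values for illustration; not taken from the engagement.
LAGO_EVENTS_URL = "https://lago.internal.example/api/v1/events"
LAGO_API_KEY = "replace-me"


def build_usage_event(subscription_id: str, tokens: int, request_id: str) -> dict:
    """Build a tokens_consumed event for a single gateway call."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    # Deterministic ID: the same gateway request always maps to the same
    # transaction_id, so a retried POST carries an identical identifier.
    transaction_id = str(uuid.uuid5(uuid.NAMESPACE_URL, f"llm-gateway/{request_id}"))
    return {
        "event": {  # assumed request envelope
            "transaction_id": transaction_id,
            "external_subscription_id": subscription_id,
            "code": "tokens_consumed",
            "properties": {"value": tokens},
        }
    }


def build_request(payload: dict) -> urllib.request.Request:
    """Wrap the payload in a POST request (assumed Bearer-token auth)."""
    return urllib.request.Request(
        LAGO_EVENTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {LAGO_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a single `urllib.request.urlopen(build_request(event))` call wrapped in the gateway's usual retry logic; it is left out here so the sketch runs without a live endpoint.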

  • A self-hosted Lago deployment on the client's GCP cluster so all usage events stay inside their boundary.
  • Event instrumentation posting {transaction_id, external_subscription_id, code:'tokens_consumed', properties:{value}} to Lago's ingestion endpoint in real time.
  • A graduated pricing plan reconciled into Stripe, with an invoice.payment_succeeded webhook and a live customer usage dashboard.
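The invoice.payment_succeeded webhook mentioned above could be received along the following lines. This is a sketch under stated assumptions, not the handler shipped for the client: it assumes the webhook is delivered by Stripe and signed with Stripe's documented scheme (a Stripe-Signature header carrying t= and v1= values, where v1 is an HMAC-SHA256 of "timestamp.payload" keyed with the endpoint secret). The function names and the five-minute tolerance are illustrative choices.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check the signature header against the raw request body."""
    parts = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k.strip() == "t"), None)
    candidates = [v for k, v in parts if k.strip() == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    # Reject stale deliveries to limit replay of captured requests.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance_s:
        return False
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)


def handle_webhook(payload: bytes, header: str, secret: str,
                   now: Optional[float] = None) -> Optional[str]:
    """Return the paid invoice ID, or None if the event is rejected or ignored."""
    if not verify_stripe_signature(payload, header, secret, now=now):
        return None
    event = json.loads(payload)
    if event.get("type") != "invoice.payment_succeeded":
        return None
    return event["data"]["object"]["id"]
```

The returned invoice ID is what a dashboard backend would use to mark the matching invoice as paid. In production, Stripe's official client library provides an equivalent verification helper and would normally be preferred over hand-rolled code.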

A customized view of the system we shipped for this engagement — the components and how requests and data flow between them.

[Architecture diagram: Usage Dashboard · LLM Gateway (events) · Lago Billing (Rails) · Redpanda Stream · Graduated Rating · PostgreSQL (invoices) · Stripe (collection); edge labels: usage, collect]
Ruby on Rails · React · PostgreSQL · Redis · Redpanda / Kafka · Stripe · Docker · Kubernetes
  • Kept all usage event data inside the client's own GCP boundary for compliance.
  • Metered hybrid per-seat + per-token pricing accurately in real time.
  • Replaced hand-calculated monthly credits with automated Stripe-collected invoices, sharply reducing disputes.
Direct value added: Gives the client accurate, real-time usage billing on infrastructure they own, with transparent consumption dashboards that cut invoice disputes to near zero.
Why it matters: Usage billing demands exact, real-time metering. A self-hosted open engine keeps sensitive event data in-house while still feeding clean invoices to Stripe.

Before — manual bottleneck flow

1. Monthly Log Pull (Bottleneck)
Systems Admin · 4 hours

Usage logs are gathered by hand at month-end with no real-time meter.

2. Spreadsheet Rating (Bottleneck)
Billing Manager · 2 days

Hybrid charges are calculated in giant spreadsheets, introducing errors.

3. Disputed Invoices (Bottleneck)
Finance · Recurring

Customers contest hand-built invoices, triggering monthly back-and-forth.

After — automated optimized flow

1. Live Event Metering
LLM Gateway · Instant

Each token-consuming call posts a metered event to self-hosted Lago.

2. Graduated Rating
Lago Engine · < 10 ms

Events are rated against the graduated hybrid plan and roll into the live invoice.

3. Stripe Collection
Stripe Sync · Instant

Invoices are collected via Stripe and reflected on the customer usage dashboard.
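To make the graduated rating step concrete, here is a small worked sketch of how a hybrid per-seat plus per-token charge can be computed. All tier boundaries and prices are hypothetical, chosen only for readable arithmetic; they are not the client's plan. The sketch also ignores any per-tier flat fees a real plan might carry. In a graduated model each tier's price applies only to the units that fall inside that tier, not to the whole volume.

```python
from decimal import Decimal

# Hypothetical plan for illustration only.
# Each entry: (upper bound of the tier in tokens, price per token); None = no upper bound.
TIERS = [
    (1_000_000, Decimal("0.000010")),
    (10_000_000, Decimal("0.000008")),
    (None, Decimal("0.000005")),
]
SEAT_PRICE = Decimal("20.00")  # hypothetical per-seat monthly fee


def graduated_charge(units: int, tiers=TIERS) -> Decimal:
    """Price each slice of usage at the rate of the tier it falls in."""
    if units < 0:
        raise ValueError("units must be non-negative")
    total = Decimal("0")
    lower = 0  # number of units already priced by earlier tiers
    for upper, unit_price in tiers:
        if units <= lower:
            break
        tier_top = units if upper is None else min(units, upper)
        total += (tier_top - lower) * unit_price
        if upper is None:
            break
        lower = upper
    return total


def hybrid_total(seats: int, tokens: int) -> Decimal:
    """Per-seat subscription fee plus graduated per-token usage."""
    return seats * SEAT_PRICE + graduated_charge(tokens)
```

With these example numbers, 2,500,000 tokens rate as 1,000,000 × 0.000010 + 1,500,000 × 0.000008 = 10.00 + 12.00 = 22.00, so a five-seat account would total 100.00 + 22.00 = 122.00.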

"Billing per-token usage used to be a spreadsheet exercise at the end of every month. Now it meters in real time inside our own cloud, and the back-and-forth with customers over their invoices has mostly gone quiet."
Nikhil Rao at OceanBlue-SC
