Usage-Based Billing for AI SaaS with Stripe Meters

Your flat monthly subscription is quietly destroying your margins. A customer who runs 500,000 tokens through your app costs ten times more to serve than one who runs 50,000 — but your revenue doesn’t reflect that. Usage-based billing for AI SaaS with Stripe is the fix, but the path from “charge per token” to “this works in production at 2 AM” is full of outdated tutorials, isolated API docs, and one breaking change most developers still haven’t heard about.

This guide covers the complete flow end-to-end: Meter creation, Price attachment, Subscription setup, async token reporting, bill-shock prevention, and daily reconciliation — all in Node.js, all on the current API. No stitching together six doc pages. No three-infrastructure-component setups. Just the code you need to ship.

Why Usage-Based Billing for AI SaaS Beats Flat Subscriptions

The math is simple. Your LLM API costs scale with usage. Your infrastructure costs scale with usage. But your revenue — if you’re on a flat plan — doesn’t.

Hybrid pricing models that combine a subscription base with usage components deliver approximately 21% median revenue growth versus roughly 13% for pure subscription models (OpenView Partners, cited by Maxio). The market has moved: 67% of SaaS companies now use consumption-based pricing, up from 52% in 2022 (Maxio 2025 SaaS Pricing Trends Report). And yet 41% of companies with AI features are still giving them away for free, not formally monetizing them at all (High Alpha 2025 SaaS Benchmarks Report).

Token billing specifically solves three problems flat subscriptions can’t:

Cost alignment: You charge more when you deliver more. Margins stay consistent regardless of how heavily customers use your LLM features.
Upsell signal: Heavy users who consistently burn tokens are your clearest expansion revenue signal.
Fairness perception: Light users don’t subsidize power users — which reduces churn among your smaller accounts.

The Four Stripe Objects You Need to Understand Before Writing Any Code

Most Stripe billing confusion comes from not understanding how the objects chain together. Before you write a line of code, internalize this map:

Meter: Defines what you’re measuring (`ai_tokens_used`) and how to aggregate it (sum, count, or last). One Meter per billable dimension.
Price: The billing rate linked to a Meter. This is where you set “charge $0.04 per 1,000 tokens.” A metered Price is invalid without a Meter backing it.
Subscription: Attaches a Customer to a Price. At billing period end, Stripe totals the Meter’s events for that customer and generates an invoice automatically.
Meter Event: Individual usage records you push to Stripe. Each carries a customer identifier in its payload (the `stripe_customer_id` key in this guide), a `value` (token count), and a timestamp.

The chain is: Meter → Price → Subscription → Meter Events.

Here’s the most important thing in this entire post: Stripe’s legacy metered billing API (the `usage_type: metered` pattern with `usage_records`) was removed in API version `2025-03-31.basil`. If a tutorial reports usage through usage records and never mentions a Meter object, its code will throw errors on the current API. Every metered Price now requires a backing Meter. Treat any guide built on usage records as broken, whatever its publication date.
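If you already have Prices in your account, one way to spot the legacy pattern is to look for metered Prices with no Meter reference. The sketch below is a minimal illustration: it assumes the Price objects were already fetched (for example with `stripe.prices.list`) and that `recurring.usage_type` and `recurring.meter` are populated as on the current API. The helper name `findLegacyMeteredPrices` and the sample objects are hypothetical.

```javascript
// Hypothetical helper: flags metered Prices that have no backing Meter.
// Input is an array of plain Price objects as returned by the Stripe API.
function findLegacyMeteredPrices(prices) {
  return prices.filter((price) => {
    const recurring = price.recurring;
    if (!recurring || recurring.usage_type !== 'metered') return false;
    return !recurring.meter; // null or missing meter means usage-record billing
  });
}

// Hand-written objects standing in for API responses
const samplePrices = [
  { id: 'price_flat', recurring: { usage_type: 'licensed', meter: null } },
  { id: 'price_legacy', recurring: { usage_type: 'metered', meter: null } },
  { id: 'price_metered', recurring: { usage_type: 'metered', meter: 'mtr_123' } },
  { id: 'price_one_time', recurring: null },
];

console.log(findLegacyMeteredPrices(samplePrices).map((p) => p.id));
// → [ 'price_legacy' ]
```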

Step 1 — Create Your Meter and Attach It to a Price in Node.js

Install the Stripe SDK:

```bash
npm install stripe
```

Create your Meter once and store the returned ID:

```javascript
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY, {
  apiVersion: '2025-03-31.basil',
});

const meter = await stripe.billing.meters.create(
  {
    display_name: 'AI Tokens Used',
    event_name: 'ai_tokens_used',
    default_aggregation: { formula: 'sum' },
    customer_mapping: {
      event_payload_key: 'stripe_customer_id',
      type: 'by_id',
    },
    value_settings: { event_payload_key: 'value' },
  },
  {
    idempotencyKey: 'meter-create-ai-tokens-v1',
  }
);

console.log('Meter ID:', meter.id); // Store this in your env config
```

Note the `idempotencyKey` on the API call — it prevents duplicate Meters if this setup script runs twice. Now create a Price that uses `transform_quantity` to express fractional cent pricing. Stripe prices are in the smallest currency unit (cents), so $0.04 per 1,000 tokens = 4 cents per 1,000 tokens:

```javascript
const price = await stripe.prices.create(
  {
    currency: 'usd',
    recurring: {
      interval: 'month',
      usage_type: 'metered',
      meter: meter.id, // required on current API
    },
    unit_amount: 4, // 4 cents = $0.04
    transform_quantity: {
      divide_by: 1000,
      round: 'up',
    },
    product_data: {
      name: 'AI Token Usage',
    },
  },
  { idempotencyKey: 'price-create-ai-tokens-v1' }
);

console.log('Price ID:', price.id); // Store in STRIPE_TOKEN_PRICE_ID
```

This bills $0.04 per block of 1,000 tokens and rounds any partial block up to a full one. Store both IDs in your environment config, because you’ll reference them throughout the rest of the implementation.
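To make the rounding concrete, here is a small arithmetic sketch of what a customer would owe for a given token total under this Price. It assumes the rounding is applied once to the period's aggregated usage rather than per event; `estimateChargeCents` is a hypothetical helper for your own dashboards, not a Stripe call.

```javascript
const UNIT_AMOUNT_CENTS = 4; // unit_amount from the Price above
const DIVIDE_BY = 1000;      // transform_quantity.divide_by

// Estimates the period charge in cents for a total token count,
// assuming round: 'up' is applied once to the period total.
function estimateChargeCents(totalTokens) {
  const billableBlocks = Math.ceil(totalTokens / DIVIDE_BY);
  return billableBlocks * UNIT_AMOUNT_CENTS;
}

console.log(estimateChargeCents(50000));  // 50 blocks  → 200 cents ($2.00)
console.log(estimateChargeCents(500000)); // 500 blocks → 2000 cents ($20.00)
console.log(estimateChargeCents(1001));   // 2 blocks   → 8 cents (partial block rounds up)
console.log(estimateChargeCents(0));      // 0 blocks   → 0 cents
```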

Step 2 — Subscribe a Customer and Wire Up the Checkout Flow

When a user activates your AI tier, create a Stripe Customer and a Subscription referencing your token Price:

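```javascript
async function createBillingSubscription(userId, email) {
  const customer = await stripe.customers.create(
    { email, metadata: { app_user_id: userId } },
    { idempotencyKey: `customer-create-${userId}` }
  );

  // Persist customer.id → userId mapping in your database

  const subscription = await stripe.subscriptions.create(
    {
      customer: customer.id,
      items: [{ price: process.env.STRIPE_TOKEN_PRICE_ID }],
      payment_behavior: 'default_incomplete',
      payment_settings: { save_default_payment_method: 'on_subscription' },
      // Invoice.payment_intent was removed in 2025-03-31.basil;
      // the client secret now lives on confirmation_secret
      expand: ['latest_invoice.confirmation_secret', 'pending_setup_intent'],
    },
    { idempotencyKey: `subscription-create-${userId}` }
  );

  // A metered-only first invoice is usually $0, in which case there may be
  // nothing to confirm and Stripe may attach a pending SetupIntent instead
  const clientSecret =
    subscription.latest_invoice?.confirmation_secret?.client_secret ??
    subscription.pending_setup_intent?.client_secret ??
    null;

  return { subscriptionId: subscription.id, clientSecret };
}
```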

Return the `clientSecret` to your frontend to collect payment with Stripe Elements. Once the subscription is active, Stripe automatically aggregates Meter Events for that customer and invoices them at period end — no cron job required on your side for the billing calculation itself.

Step 3 — Report Token Usage Asynchronously (The Right Way)

Here’s where most implementations fail. Developers call `stripe.billing.meterEvents.create()` inline inside their LLM response handler. Don’t.

Inline event reporting adds 50–200ms of Stripe API latency to every AI response. If Stripe has a hiccup, your LLM call errors. You lose idempotency control. You’ll hit rate limits on high-traffic endpoints.

The correct pattern: write token counts to your own database first, then flush to Stripe via a background worker.

Write to your own store first

```javascript
import { randomUUID } from 'node:crypto';

async function recordTokenUsage(userId, promptTokens, completionTokens) {
  const totalTokens = promptTokens + completionTokens;
  const eventId = `usage-${userId}-${Date.now()}-${randomUUID()}`;

  // Persist first; the background worker reports to Stripe later
  await db.usageEvents.create({
    id: eventId,
    userId,
    tokens: totalTokens,
    status: 'pending',
    createdAt: new Date(),
  });

  return totalTokens; // Return immediately, no Stripe call yet
}
```

Flush to Stripe with a background worker

Using BullMQ (or any queue of your choice):

```javascript
import { Worker } from 'bullmq';

// Stay one day inside the 35-day timestamp window (see Reconciliation section)
const MAX_AGE_MS = 34 * 24 * 60 * 60 * 1000;

const usageWorker = new Worker(
  'stripe-usage-flush',
  async (job) => {
    const { eventId, stripeCustomerId, tokens, createdAt } = job.data;
    const occurredAtMs = new Date(createdAt).getTime();

    if (Date.now() - occurredAtMs > MAX_AGE_MS) {
      await db.usageEvents.update({ id: eventId, status: 'expired' });
      return;
    }

    await stripe.billing.meterEvents.create(
      {
        event_name: 'ai_tokens_used',
        payload: {
          stripe_customer_id: stripeCustomerId,
          value: String(tokens),
        },
        // Report when the usage happened, not when the worker ran
        timestamp: Math.floor(occurredAtMs / 1000),
      },
      { idempotencyKey: `meter-event-${eventId}` }
    );

    await db.usageEvents.update({ id: eventId, status: 'flushed' });
  },
  // Redis connection for BullMQ; adjust host and port for your environment
  { connection: { host: '127.0.0.1', port: 6379 } }
);
```

Your internal database is the source of truth. Stripe is the billing destination. That separation is what makes your billing infrastructure survive a 2 AM outage.
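The worker above reads `eventId`, `stripeCustomerId`, `tokens`, and `createdAt` from `job.data`, so something has to enqueue those fields after the database write. A minimal sketch of that hand-off follows. It assumes `queue` is a BullMQ `Queue` for `stripe-usage-flush` created elsewhere; the helper name `enqueueUsageFlush`, the job name, and the retry numbers are illustrative assumptions, not values from Stripe or BullMQ.

```javascript
// Hypothetical helper: hands a persisted usage event to the flush queue.
// `queue` is assumed to be a BullMQ Queue for 'stripe-usage-flush';
// any object with a compatible add(name, data, opts) method works.
async function enqueueUsageFlush(queue, { eventId, stripeCustomerId, tokens, createdAt }) {
  return queue.add(
    'flush-usage',
    // Job data is serialized, so send the timestamp as an ISO string
    { eventId, stripeCustomerId, tokens, createdAt: createdAt.toISOString() },
    {
      jobId: eventId, // reusing the event ID keeps a repeated enqueue from adding a second job
      attempts: 5,    // illustrative retry budget
      backoff: { type: 'exponential', delay: 2000 }, // exponential backoff starting around 2 seconds
    }
  );
}
```

Using the internal event ID as the job ID means the queue, the Stripe idempotency key, and your database row all share one identifier, which makes a stuck event easy to trace. Keep colons out of the ID, since BullMQ reserves them in custom job IDs.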

Handling High Volume: When to Use the v2 Meter Event Stream API

For most early-stage products, the standard `meterEvents.create()` approach works fine. But if you’re processing thousands of concurrent requests — a popular coding assistant, a document batch processor — you’ll want Stripe’s v2 High-Throughput Meter Event Stream API.

The v2 stream supports up to 10,000 events per second in live mode (up to 200,000/second via Stripe Sales), far above the standard API’s rate limits. Authentication uses stateless session tokens that expire after 15 minutes, so you’ll need a refresh mechanism:

```javascript
// Cache this token for up to 14 minutes
async function getMeterEventSession() {
  const session = await stripe.v2.billing.meterEventSession.create();
  return session.authentication_token;
}

async function streamMeterEvent(authToken, customerId, tokens) {
  // The stream endpoint authenticates with the session token rather than the
  // secret key, so use a client constructed with that token
  const streamClient = new Stripe(authToken);

  await streamClient.v2.billing.meterEventStream.create({
    events: [
      {
        event_name: 'ai_tokens_used',
        payload: {
          stripe_customer_id: customerId,
          value: String(tokens),
        },
      },
    ],
  });
}
```
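The "cache this token for up to 14 minutes" comment can be sketched as a small time-based cache. This is one possible shape, not Stripe's recommended implementation: `createTokenCache` is a hypothetical helper, `fetchToken` stands for any async function that returns a token string (such as `getMeterEventSession`), and the 14-minute default simply leaves a one-minute margin before the 15-minute expiry.

```javascript
// Hypothetical cache: reuses a session token until it is `ttlMs` old.
// `now` is injectable so the expiry logic can be tested without waiting.
function createTokenCache(fetchToken, ttlMs = 14 * 60 * 1000, now = Date.now) {
  let token = null;
  let fetchedAt = 0;
  let inFlight = null;

  return async function getToken() {
    if (token !== null && now() - fetchedAt < ttlMs) return token;

    if (inFlight === null) {
      // Share a single refresh among concurrent callers
      inFlight = fetchToken()
        .then((fresh) => {
          token = fresh;
          fetchedAt = now();
          return fresh;
        })
        .finally(() => {
          inFlight = null;
        });
    }
    return inFlight;
  };
}
```

Usage would look like `const getToken = createTokenCache(getMeterEventSession);` followed by `await streamMeterEvent(await getToken(), customerId, tokens);`. Sharing the in-flight refresh keeps a burst of requests from creating one session per request when the token expires.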

When should you switch? If your flush worker consistently queues more than 500 events per cycle, or you’re seeing rate limit errors on the standard API, it’s time to move to the stream.

Preventing Bill Shock with Stripe Usage Alerts

78% of IT leaders have experienced unexpected charges on a SaaS bill due to consumption-based pricing models (Zylo 2026 SaaS Management Index). Bill shock kills customer trust — and drives chargebacks. Stripe’s usage-based alerts let you warn customers before they cross a threshold.

Set up an alert for each new subscriber:

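```javascript
// Price from Step 1: 4 cents per 1,000 tokens
const CENTS_PER_BLOCK = 4;
const TOKENS_PER_BLOCK = 1000;

async function createUsageAlert(stripeCustomerId, thresholdInDollars) {
  // Usage alerts compare meter usage (tokens), not money,
  // so convert the dollar threshold into a token count
  const thresholdInTokens = Math.floor(
    ((thresholdInDollars * 100) / CENTS_PER_BLOCK) * TOKENS_PER_BLOCK
  );

  const alert = await stripe.billing.alerts.create({
    title: `$${thresholdInDollars} token usage alert`,
    alert_type: 'usage_threshold',
    usage_threshold: {
      gte: thresholdInTokens,
      meter: process.env.STRIPE_METER_ID,
      recurrence: 'one_time',
      // Scope the alert to a single customer
      filters: [{ type: 'customer', customer: stripeCustomerId }],
    },
  });

  return alert.id;
}
```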

When the threshold triggers, Stripe fires a `billing.alert.triggered` webhook. Handle it to notify your customer:

```javascript
case 'billing.alert.triggered': {
  const { customer } = event.data.object;
  await sendUsageWarningEmail(customer);
  break;
}
```

For power users, consider pausing the subscription or offering a plan upgrade when the alert fires. Don’t let your most active customers become your angriest ones.

Surviving the Unexpected: Dropped Events, Idempotency, and Daily Reconciliation

Production billing fails quietly. Events get dropped. Timestamps expire. Workers crash mid-flush.

Here’s how to catch each failure mode before it becomes a billing dispute.

Idempotency at every layer

Idempotency isn’t just “add a key to the Stripe API call.” You need it in three places:

Stripe API calls — Always pass `idempotencyKey` on `meterEvents.create()`. Use a stable key derived from your internal event ID.
Webhook deduplication — Stripe may deliver the same event more than once. Store processed event IDs: `if (await db.webhookEvents.exists(event.id)) return res.sendStatus(200);`
Internal aggregation — If your worker retries a failed job, it must not double-count. The `status: ‘flushed’` pattern above handles this — check status before flushing.
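The webhook deduplication item can be sketched as a claim-then-process wrapper. This is an illustration under assumptions: the in-memory `Set` stands in for the `db.webhookEvents` table mentioned above, and `createProcessedStore` and `handleOnce` are hypothetical names. In production the claim would be a database insert guarded by a unique constraint so that it holds across processes.

```javascript
// In-memory stand-in for a webhook_events table with a unique event ID column.
function createProcessedStore() {
  const seen = new Set();
  return {
    // Returns true if the ID was newly claimed, false if it was already seen
    claim(eventId) {
      if (seen.has(eventId)) return false;
      seen.add(eventId);
      return true;
    },
    release(eventId) {
      seen.delete(eventId);
    },
  };
}

// Hypothetical wrapper: runs `handler` at most once per Stripe event ID.
async function handleOnce(store, event, handler) {
  if (!store.claim(event.id)) return 'duplicate';
  try {
    await handler(event);
    return 'processed';
  } catch (err) {
    store.release(event.id); // let a redelivery run the handler again
    throw err;
  }
}
```

Claiming before processing, and releasing on failure, means a redelivered event is skipped only when the first attempt actually succeeded.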

The 35-day timestamp window

Stripe rejects Meter Events with timestamps older than 35 calendar days or more than 5 minutes in the future (Stripe Documentation). If your worker has a backlog older than 35 days, those events never reach an invoice, and the loss is easy to miss if the rejection goes unlogged. The expiry guard in the worker above catches this, but you should also alert your ops team when events expire unprocessed.
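A pre-flight check for both ends of that window might look like the sketch below. It treats 35 calendar days as 35 × 24 hours, which is an approximation, and `checkMeterTimestamp` is a hypothetical helper. The worker's 34-day guard remains the safer cut-off in practice because it leaves a margin.

```javascript
const MAX_PAST_SECONDS = 35 * 24 * 60 * 60; // 35 days, per the window described above
const MAX_FUTURE_SECONDS = 5 * 60;          // 5 minutes

// Classifies a Unix timestamp (in seconds) against the accepted window.
// Returns 'ok', 'too_old', or 'in_future'.
function checkMeterTimestamp(timestampSeconds, nowSeconds = Math.floor(Date.now() / 1000)) {
  if (nowSeconds - timestampSeconds > MAX_PAST_SECONDS) return 'too_old';
  if (timestampSeconds - nowSeconds > MAX_FUTURE_SECONDS) return 'in_future';
  return 'ok';
}
```

The `in_future` case usually points at clock skew on the machine that stamped the event, so it is worth logging separately from expired backlog.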

Daily reconciliation

Run a nightly job that compares your internal token totals against Stripe’s meter event summaries:

```javascript
async function reconcileCustomer(stripeCustomerId, startOfDay, endOfDay) {
  const internalTotal = await db.usageEvents.sumTokens({
    stripeCustomerId,
    status: 'flushed',
    createdAt: { gte: startOfDay, lte: endOfDay },
  });

  const summary = await stripe.billing.meters.listEventSummaries(
    process.env.STRIPE_METER_ID,
    {
      customer: stripeCustomerId,
      start_time: Math.floor(startOfDay.getTime() / 1000),
      end_time: Math.floor(endOfDay.getTime() / 1000),
    }
  );

  const stripeTotal = summary.data.reduce(
    (sum, s) => sum + s.aggregated_value,
    0
  );

  // Tolerate a drift of up to 100 tokens before alerting
  if (Math.abs(internalTotal - stripeTotal) > 100) {
    await alertOpsTeam({ stripeCustomerId, internalTotal, stripeTotal });
  }
}
```

The most common causes of discrepancies: wrong `stripe_customer_id` in the event payload, timestamps outside the 35-day window, events rate-limited without retry, or workers that crashed between calling Stripe and updating the DB.
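`reconcileCustomer` takes `startOfDay` and `endOfDay` as `Date` objects, and the nightly job has to compute them. The sketch below derives the previous full UTC day; working in UTC avoids daylight-saving gaps and keeps both values on whole-minute boundaries, which is assumed here to be what the event summaries endpoint expects. The helper names `utcDayBounds` and `previousUtcDay` are hypothetical.

```javascript
// Returns the start (inclusive) and end (exclusive) of the UTC day containing `date`.
function utcDayBounds(date) {
  const start = new Date(
    Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate())
  );
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { start, end };
}

// Previous full UTC day relative to `now`, for a job that runs after midnight UTC
function previousUtcDay(now = new Date()) {
  return utcDayBounds(new Date(now.getTime() - 24 * 60 * 60 * 1000));
}

const { start, end } = previousUtcDay(new Date('2025-06-15T03:20:00Z'));
console.log(start.toISOString()); // 2025-06-14T00:00:00.000Z
console.log(end.toISOString());   // 2025-06-15T00:00:00.000Z
```

A nightly run would then call `reconcileCustomer(stripeCustomerId, start, end)` for each active customer.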

A note on private preview features

Two Stripe features are worth knowing about but are not generally available:

LLM Token Billing / AI Gateway — Auto-syncs pricing for OpenAI, Anthropic, and Google models. Stripe handles token counting and billing automatically from model call data. Requires private preview access.
Token Meter SDK — Client-side instrumentation that feeds the same Meter infrastructure. Also private preview.

Everything in this guide uses the generally available Meter API — no waitlist, no sales call required. The private preview features are worth requesting access to for the long term, but you can ship real usage-based billing for your AI SaaS with Stripe today.

Ship It, Then Harden It

Usage-based billing for AI SaaS with Stripe becomes manageable once you understand how the four objects connect and where production failures happen. The patterns that matter most: always use a Meter object with API version `2025-03-31.basil`, never call Stripe inline in your LLM handler, guard every layer with idempotency, and run daily reconciliation so discrepancies surface before they reach your customers.

59% of software companies expect usage-based revenue to grow as a share of total revenue in 2025 — an 18-point jump from 2023 (Monetization Monitor, cited by Revenera). The infrastructure to support that growth is available today, and you now have the code to wire it up.

Implement one section at a time in your staging environment, verify it end-to-end with test mode Stripe keys, then move to the next. Billing infrastructure is the kind of thing you want battle-tested before it’s handling real customer money — start with Meter creation and a single test subscription, and build outward from there.
