Tactics · Jun 21, 2026

A founder's public framework for modeling LLM inference costs

George Mays published a 'napkin math' spreadsheet for calculating AI COGS. The framework offers a tactical way to compare providers like OpenAI and Groq before committing to a stack.

An AI-powered copywriting tool charging $29 per month can lose money if a single power user generates 50,000 words. George Mays, observing this unit economic trap, published a public framework for modeling large language model (LLM) inference costs before they scale.

The model, contained in a simple Google Sheet, is a direct response to the financial risk of building on top of third-party APIs where costs are variable and tied directly to user engagement. It provides a structure for founders to project expenses and avoid building a popular but unprofitable product.

The 5-variable cost model

Mays’s framework is built on five core inputs. Two variables model user behavior: average tokens per request and average requests per user per month. A third, monthly active users, determines scale. The final two variables are the provider's pricing: cost per million input tokens and cost per million output tokens.

The spreadsheet combines these inputs to project total monthly cost. By adjusting any single variable, a founder can see the direct impact on their COGS. For example, they can model how a 20% increase in user activity affects their bill, or calculate the savings from switching to a provider with cheaper output tokens. The primary artifact is the interactive sheet itself, allowing any founder to clone it and input their own assumptions.
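A minimal Python sketch of how the five inputs could combine, for readers who prefer code to a sheet. This is not Mays's actual spreadsheet formula. The `output_share` parameter is an assumption added here: the five inputs include only one tokens-per-request figure, so some split between input and output tokens has to be chosen. The input values are illustrative placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Inputs:
    tokens_per_request: float        # average tokens per request (input + output)
    requests_per_user: float         # average requests per user per month
    monthly_active_users: float      # scale
    input_price_per_million: float   # USD per 1M input tokens
    output_price_per_million: float  # USD per 1M output tokens


def monthly_cost(inputs: Inputs, output_share: float = 0.5) -> float:
    """Projected monthly inference cost in USD.

    output_share is an assumption, not one of the five inputs: the
    fraction of each request's tokens billed at the output rate.
    """
    total_tokens = (
        inputs.tokens_per_request
        * inputs.requests_per_user
        * inputs.monthly_active_users
    )
    input_tokens = total_tokens * (1.0 - output_share)
    output_tokens = total_tokens * output_share
    return (
        input_tokens * inputs.input_price_per_million
        + output_tokens * inputs.output_price_per_million
    ) / 1_000_000


# Illustrative placeholder values; replace with your own assumptions.
baseline = Inputs(
    tokens_per_request=1_500,
    requests_per_user=20,
    monthly_active_users=500,
    input_price_per_million=5.00,
    output_price_per_million=15.00,
)

# Model a 20% increase in user activity by scaling requests per user.
busier = replace(baseline, requests_per_user=baseline.requests_per_user * 1.2)

print(f"baseline: ${monthly_cost(baseline):,.2f}/month")
print(f"+20% activity: ${monthly_cost(busier):,.2f}/month")
```

Because the formula is a straight product of the inputs, a 20% rise in requests per user raises the projected bill by the same 20%. Changing `output_share` shows how sensitive the total is to the input/output mix when the two rates differ.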

Comparing providers and use cases

The model is pre-populated with pricing for several major providers, enabling direct cost comparisons. Mays’s sheet uses OpenAI's GPT-4o pricing at $5.00 per million input tokens and $15.00 per million output tokens. This is contrasted with Groq's Llama 3 70B offering, priced at a claimed $0.59 per million input tokens and $0.79 per million output tokens. The sheet also includes a simplified model for self-hosting on a platform like AWS Bedrock.

Using this framework, we can model two common indie SaaS scenarios. Both figures assume each request's tokens split evenly between input and output, since the five inputs do not specify the mix:

  1. Low-Intensity Chatbot: A support bot that summarizes tickets. Assume 500 active users, each making 20 requests per month with 1,500 tokens per request, or 15 million tokens per month. On GPT-4o, this costs approximately $150 per month. On Groq, the same workload costs roughly $10.
  2. High-Intensity Content Generator: A tool for writing blog posts. Assume 100 power users, each making 50 requests per month with 4,000 tokens per request, or 20 million tokens per month. On GPT-4o, the monthly cost is approximately $200, or about $2.00 per user against $0.30 per user for the chatbot. On Groq, it is roughly $14.
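The arithmetic behind the two scenarios can be checked with a short script. This is a sketch under one stated assumption: tokens split 50/50 between input and output (`OUTPUT_SHARE`), which the scenario descriptions do not specify. Prices are the per-million-token rates quoted above. The function and variable names are illustrative.

```python
# USD per million tokens (input, output), as quoted in the article.
PRICES = {
    "GPT-4o": (5.00, 15.00),
    "Groq Llama 3 70B": (0.59, 0.79),
}

# (active users, requests per user per month, tokens per request)
SCENARIOS = {
    "Low-intensity chatbot": (500, 20, 1_500),
    "High-intensity content generator": (100, 50, 4_000),
}

OUTPUT_SHARE = 0.5  # assumption: half of all tokens are billed as output


def monthly_cost(users, requests, tokens, price_in, price_out,
                 output_share=OUTPUT_SHARE):
    """Total monthly cost in USD for one scenario on one provider."""
    total_millions = users * requests * tokens / 1_000_000
    blended_price = price_in * (1 - output_share) + price_out * output_share
    return total_millions * blended_price


results = {}
for scenario, (users, requests, tokens) in SCENARIOS.items():
    for provider, (price_in, price_out) in PRICES.items():
        total = monthly_cost(users, requests, tokens, price_in, price_out)
        results[(scenario, provider)] = total
        print(f"{scenario:34s} {provider:18s} "
              f"${total:8.2f}/month  ${total / users:.3f}/user")
```

Under the 50/50 assumption, the chatbot comes to $150.00 per month on GPT-4o and $10.35 on Groq; the content generator comes to $200.00 and $13.80. Shifting `OUTPUT_SHARE` toward 1.0 raises the GPT-4o totals faster than the Groq totals, because the gap between its input and output rates is wider.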

These scenarios, which are illustrative, show how provider choice and use case intensity create dramatically different cost structures. The model makes these tradeoffs explicit.

What we'd change

The model's strength is its simplicity. That is also its primary limitation. A more robust analysis would account for factors beyond token price.

First, the framework ignores performance and quality. Groq's main value proposition is speed. For a real-time conversational AI, its low latency might justify its use even if costs were higher. Conversely, if a cheaper model produces lower-quality output that requires user correction, the product's value is diminished. These qualitative factors are critical to user retention but absent from a pure cost spreadsheet.

Second, the self-hosting calculation is superficial. It models inference cost but omits the significant engineering overhead. The Total Cost of Ownership for a self-hosted model includes salaries for specialized engineers, infrastructure management, and the opportunity cost of that team not building core product features. For most indie founders, this TCO makes self-hosting far more expensive than the sheet suggests.

Finally, the specific pricing data is a snapshot in a volatile market. LLM providers are in a price war, so a model built on June 2026 pricing may be out of date within a few months. The framework's value is in the methodology, not the specific numbers, and it needs regular updating to remain useful.

Landing

Mays's framework is not a static cost calculator. It is a tool for thinking. For founders building with AI, the discipline of modeling unit economics is the primary defense against building a product that is popular but permanently unprofitable. The spreadsheet's value is not in its current numbers, but in forcing a confrontation with the costs of scale before capital is spent. It operationalizes financial diligence for technical founders.

The investor read

The rise of accessible LLMs has shifted the core risk for many AI SaaS startups from technical feasibility to cost of goods sold (COGS) management. Mays’s public model reflects a growing founder-level focus on unit economics over speculative growth. An investable thesis in this space now requires a clear, defensible strategy for managing inference costs, whether through provider arbitrage, model optimization, or clever product design that limits high-cost usage. A pitch deck with a 'magic AI' feature is incomplete without a slide showing the napkin math on what that magic costs per user, per month. This is the new financial literacy for AI-native founders.
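The per-user, per-month napkin math fits in a few lines. The sketch below makes two assumptions: inference is the only COGS line (which understates real COGS), and tokens split 50/50 between input and output. The $29 plan price comes from the opening example and the rates are the GPT-4o figures quoted earlier. All function names are illustrative.

```python
def blended_price_per_million(price_in, price_out, output_share=0.5):
    """Average USD per 1M tokens for a given input/output mix (assumed 50/50)."""
    return price_in * (1 - output_share) + price_out * output_share


def cost_per_user(tokens_per_request, requests_per_month,
                  price_in, price_out, output_share=0.5):
    """Monthly inference cost in USD for one user."""
    tokens = tokens_per_request * requests_per_month
    return tokens * blended_price_per_million(
        price_in, price_out, output_share) / 1_000_000


def gross_margin(subscription_price, user_cost):
    """Gross margin as a fraction, counting inference as the only COGS line."""
    return (subscription_price - user_cost) / subscription_price


def break_even_tokens(subscription_price, price_in, price_out, output_share=0.5):
    """Monthly tokens at which one user's inference bill equals their plan price."""
    return subscription_price / blended_price_per_million(
        price_in, price_out, output_share) * 1_000_000


PLAN_PRICE = 29.00                  # subscription price from the opening example
PRICE_IN, PRICE_OUT = 5.00, 15.00   # GPT-4o rates quoted in the article

user_cost = cost_per_user(4_000, 50, PRICE_IN, PRICE_OUT)
print(f"cost per user: ${user_cost:.2f}/month")
print(f"gross margin: {gross_margin(PLAN_PRICE, user_cost):.1%}")
print(f"break-even: {break_even_tokens(PLAN_PRICE, PRICE_IN, PRICE_OUT):,.0f} tokens/month")
```

Plugging in a plan price and current rates gives the per-user margin and the monthly token volume at which a single user's inference bill equals their subscription. That break-even volume is a natural place to set usage caps or tiered pricing.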

Sources · how we verified
  1. Inference cost at scale with napkin math

Reported by the Maya desk on Founderr Pulse’s Tactics beat. Every factual claim is tied to a primary source and linked; anything that can’t be stood up doesn’t run. Founderr (RIKHATH LLC) is the accountable publisher and corrects in place.