xAI Grok API Pricing: Every Model, Cost, and Context Window Compared (2026)

Every AI provider claims competitive pricing. Grok actually has a case.
Grok 4.1 Fast comes in at $0.20 per million input tokens with a 2 million token context window, cheaper per token than GPT-5 mini, Gemini Flash, and every Anthropic model, with more context than any of them. The catch is that xAI is the newest platform in this comparison, with the smallest developer ecosystem to show for it.
For individual use, SuperGrok runs $30/month, $10 more than ChatGPT Plus or Claude Pro, but with access to Grok 4 and the full 2M context window. Teams pay $30/seat/month for Grok Business.
This article covers the full API and subscription pricing, the server-side tool costs most developers miss, a competitive comparison across OpenAI, Anthropic, and Google, and strategies to cut token costs in production.
All pricing verified as of March 3, 2026.
TLDR
Grok 4.1 Fast costs $0.20/M input tokens and $0.50/M output tokens. Grok 4 costs $3.00/M input tokens and $15.00/M output tokens.
Grok 4.1 Fast has a 2-million-token context window, the largest available across frontier models.
SuperGrok is $30/month for individual users. Grok Business starts at $30/seat/month with team collaboration features, or $300/seat/month for the SuperGrok Heavy tier.
Memory layers like Mem0 reduce token costs by retrieving only the relevant context via semantic search rather than resending the full conversation history on every request. Mem0's published figures indicate a reduction of up to 90%.
Prompt caching is automatic and discounts repeated prompt prefixes (cached input drops to $0.05/M on Grok 4.1 Fast). The batch API saves 50% on non-real-time workloads.
Grok Model Pricing Comparison
| Dimension | Grok 4 | Grok 4.1 Fast | Grok 3 | Grok 3 Mini |
|---|---|---|---|---|
| Input cost/M tokens | $3.00 | $0.20 | $3.00 | $0.30 |
| Output cost/M tokens | $15.00 | $0.50 | $15.00 | $0.50 |
| Context window (tokens) | 256,000 | 2,000,000 | 131,072 | 131,072 |
| Reasoning | Always on | Reasoning and non-reasoning variants | N/A | Reasoning |
| Best for | Complex multi-step reasoning, coding with tool use, tasks where accuracy matters | Most workloads, long documents, large codebases, extended agent workflows | Legacy flagship model | Legacy cost-efficient model |
Why Does Grok Exist?
xAI launched Grok in November 2023, roughly six months after the company was founded. Elon Musk built xAI partly as a response to what he described as ideological drift at OpenAI (where he was an early backer) and partly to take advantage of direct distribution through X (formerly Twitter). The original Grok had real-time access to X posts, a capability the other frontier labs couldn't replicate, and the integration with X's user base gave xAI a fast path to consumer adoption without building a ChatGPT-style product from scratch.
Since then, xAI has moved quickly up the model capability rankings. Grok 3, released in February 2025, was competitive with GPT-4o and Claude 3.5 on standard benchmarks. Grok 4, released in mid-2025, added always-on reasoning. The Grok 4 Fast line, and now Grok 4.1 Fast, represents xAI's attempt to combine competitive capability with aggressive token pricing.
What Does xAI Grok Cost?
Grok 4.1 Fast costs $0.20 per million input tokens and $0.50 per million output tokens via the API. Grok 4 pricing is $3.00 per million input and $15.00 per million output. Subscription plans include a free tier with limited daily messages, SuperGrok at $30/month with Grok 4 access, and Grok Business starting at $30/seat/month for teams.
You can also access Grok through X platform plans. X Premium ($8/month) and X Premium+ ($40/month) bundle Grok access with X social features like the blue checkmark, ad revenue sharing, and ad-free browsing. These are separate products and not a substitute for a SuperGrok or API plan.
API (pay-per-token): Grok 4.1 Fast at $0.20/M input, $0.50/M output. Grok 4 at $3.00/M input, $15.00/M output.
Subscriptions: Free, SuperGrok ($30/month), Grok Business ($30/seat/month or $300/seat/month for SuperGrok Heavy), Grok Enterprise (contact sales)
X plans: X Premium at $8/month, X Premium+ at $40/month. Grok model access is bundled with X platform features.
Additional costs: Server-side tools like web search ($5/1K calls) and code execution ($5/1K calls)
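For pay-per-token use, xAI exposes an OpenAI-compatible chat completions endpoint. The sketch below builds a request with the standard library only; the base URL follows xAI's public docs, and the model name `grok-4-fast` is an assumption you should verify against the current model list before use.

```python
import json
import os
import urllib.request

# Endpoint per xAI's OpenAI-compatible API docs; model name is an
# assumption -- check docs.x.ai for the current identifier.
API_URL = "https://api.x.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "grok-4-fast") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('XAI_API_KEY', '')}",
        },
    )

req = build_request("Summarize Grok API pricing in one sentence.")
# Only send when a key is actually configured:
if os.environ.get("XAI_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
```

Because the format matches OpenAI's, the official OpenAI SDK also works by pointing `base_url` at xAI's endpoint.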
How Much Does the Grok API Cost per Model?
Grok offers Grok 4 for frontier reasoning workloads, and Grok 4 Fast and Grok 4.1 Fast for cost-efficient high-volume use. The previous generation (Grok 3 and Grok 3 Mini) remains available but is no longer xAI's primary focus.
Note: Grok 4.2 is currently in public beta. Check xAI's documentation for the latest updates.
Grok 4 Pricing
Input: $3.00/M tokens
Cached input tokens: $0.75/M tokens
Output: $15.00/M tokens
Context window: 256,000 tokens
Reasoning: Always on, no reasoning_effort parameter
Live search: $25.00/1K sources
When to use: You need frontier-tier performance for complex multi-step reasoning, coding with tool use, or tasks where accuracy matters more than cost or speed.
Grok 4.1 Fast and Grok 4 Fast Pricing
Input: $0.20/M tokens
Cached input tokens: $0.05/M tokens
Output: $0.50/M tokens
Context window: 2,000,000 tokens
Variants: Grok 4.1 Fast is available in reasoning and non-reasoning versions. Grok 4 Fast is non-reasoning only.
Live search: $25.00/1K sources
When to use: Grok 4.1 Fast is the model most developers should default to, especially for long documents, large codebases, or extended agent workflows. According to xAI's release notes, Grok 4 Fast uses 40% fewer thinking tokens on average compared to Grok 4, with comparable benchmark performance on MATH-500 and HumanEval.
Grok 3 and Grok 3 Mini Pricing
Grok 3 and Grok 3 Mini are the legacy models, previously the flagship generation. They remain available via the API, but the Grok 4 family is now xAI's primary focus.
Grok 3: $3.00/M input, $15.00/M output, 131,072 token context window
Grok 3 Mini: $0.30/M input, $0.50/M output, 131,072 token context window (it outperforms Grok 3 on benchmarks at 90% lower cost)
What Do Grok's Server-Side Tools Cost?
Beyond token costs, xAI charges a per-call fee whenever Grok invokes a built-in tool: web search, code execution, or file analysis. These are billed separately on top of your token costs.
Because Grok's agent decides how many tools to call per query, costs vary with query complexity and can be difficult to predict upfront. A web research query might trigger 3–5 search calls, adding $0.015–$0.025 per query in tool fees alone.
Below is the pricing for tool invocations (per 1,000 successful calls):
| Tool | Description | Cost / 1K Calls |
|---|---|---|
| Web Search | Search the internet and browse web pages | $5 |
| X Search | Search X posts, user profiles, and threads | $5 |
| Code Execution | Run Python code in a sandboxed environment | $5 |
| File Attachments | Search through files attached to messages | $10 |
| Collections Search | Query your uploaded document collections (RAG) | $2.50 |
| Image Understanding | Analyze images found during Web Search and X Search | Token-based |
| X Video Understanding | Analyze videos found during X Search | Token-based |
| Remote MCP Tools | Connect and use custom MCP tool servers | Token-based |
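The per-query math is easy to sketch. The estimator below combines the token rates and the per-call tool rates from the tables in this article; the example query sizes are illustrative, not measured.

```python
# Back-of-envelope estimator for one agentic query's cost, combining
# Grok 4.1 Fast token rates with per-call server-side tool fees.
TOOL_RATE_PER_CALL = {          # $ per single call (table rate / 1,000)
    "web_search": 5.00 / 1000,
    "x_search": 5.00 / 1000,
    "code_execution": 5.00 / 1000,
    "file_attachments": 10.00 / 1000,
    "collections_search": 2.50 / 1000,
}

def query_cost(input_tokens: int, output_tokens: int,
               tool_calls: dict[str, int],
               input_rate: float = 0.20, output_rate: float = 0.50) -> float:
    """Total $ cost for one query: tokens plus tool invocation fees."""
    tokens = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
    tools = sum(TOOL_RATE_PER_CALL[t] * n for t, n in tool_calls.items())
    return tokens + tools

# A research query with 4 web searches: tool fees dominate token cost.
cost = query_cost(8_000, 1_000, {"web_search": 4})  # ≈ $0.0221
```

Note that the $0.02 of search fees here is roughly ten times the token cost, which is why controlling tool usage matters at scale.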
Custom functions (function calling) let you define your own tools that Grok can invoke. Since the function logic runs on your infrastructure, xAI charges only the token cost for the model deciding to call it. No per-invocation fee.
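Custom function definitions follow the OpenAI "tools" schema. The sketch below shows the shape of one such definition; the function name, fields, and the `grok-4-fast` model identifier are illustrative assumptions, not from xAI's docs.

```python
# OpenAI-style function (tool) definition. Since the function body runs
# on your own infrastructure, xAI bills only the tokens the model spends
# deciding to call it -- no per-invocation fee.
get_order_status = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of an order by ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order identifier"},
            },
            "required": ["order_id"],
        },
    },
}

# Passed in the request body alongside the messages:
request_body = {
    "model": "grok-4-fast",  # model name: verify against xAI docs
    "messages": [{"role": "user", "content": "Where is order A12-77?"}],
    "tools": [get_order_status],
}
```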
How Do Grok Subscription Plans Compare?
If you want to use AI for writing, research, or coding without tracking per-token costs, Grok subscription plans give you model access within usage limits. Grok offers four tiers: free, SuperGrok, Grok Business, and Grok Enterprise. X Premium and X Premium+ are separate plans that include Grok alongside X platform features.
Free Tier
Price: $0
Features: Grok 4.1 Fast access, limited daily messages, and rate-limited responses
The free tier is good for testing the model before committing to a paid plan.
SuperGrok
Price: $30/month or $300/year
Features: Grok 4 and Grok 4.1 access with higher rate limits, DeepSearch (extended research mode for complex queries), Big Brain Mode (longer reasoning chains for multi-step problems), priority routing, expanded image and video generation via Imagine 1.0, longer voice mode and companion chats
SuperGrok costs $10 more per month than ChatGPT Plus ($20/month) and Claude Pro ($20/month). Google offers Google AI Plus at $7.99/month and Google AI Pro at $19.99/month.
Grok Business
Price: $30/seat/month (SuperGrok) or $300/seat/month (SuperGrok Heavy with Grok 4 Heavy access)
Features: Grok 3, Grok 4, and Grok 4.1 for SuperGrok. Grok 4 Heavy for SuperGrok Heavy. Team collaboration features, SOC 2 compliance, and no training on data.
Grok Business at $30/seat/month matches ChatGPT Team ($30/seat/month) but is higher than Claude Team Standard ($25/seat/month) and Google Gemini Business Edition ($21/seat/month). The SuperGrok Heavy tier at $300/seat targets researchers and teams doing intensive multi-step reasoning daily.
X Premium and X Premium+
X Premium: $8/month or $84/year, with increased Grok usage limits and X features
X Premium+: $40/month or $395/year, with higher Grok access and an ad-free X experience
X plans make sense if you want both AI model access and the X social platform's features. If you only need AI for daily use, SuperGrok is the cheaper option.
Who Should Use Grok's API vs. a Subscription?
The right choice depends on whether you're building applications that need programmatic access or looking for daily AI assistance.
Use the API if you're building chatbots, agents, backend integrations, or applications that need programmatic access. API access doesn't require an X subscription, and per-token billing scales directly with actual usage.
Use SuperGrok if you're an individual who wants Grok 4 access for daily tasks like writing and research, with higher limits and no need to build on top of the model.
Use Grok Business if you're managing a team and need collaborative features. Choose the $300/seat/month SuperGrok Heavy tier only if your team needs Grok 4 Heavy's extended reasoning capabilities.
How Does Grok API Pricing Compare to OpenAI, Anthropic, and Google?
Grok positions itself with competitive per-token pricing and the largest context window currently available. Its tradeoff is ecosystem maturity: smaller developer community, less documentation, and fewer third-party integrations than OpenAI or Anthropic.
The table below shows current pricing for frontier models as of March 3, 2026. Model names and prices change frequently. Verify against each provider's official documentation before making architectural decisions.
| Model | Input (/1M tokens) | Output (/1M tokens) | Context window (tokens) |
|---|---|---|---|
| Grok 4 | $3.00 | $15.00 | 256K |
| Grok 4.1 Fast | $0.20 | $0.50 | 2M |
| OpenAI GPT-5.2 | $1.75 | $14.00 | 400K |
| OpenAI GPT-5 mini | $0.25 | $2.00 | 400K |
| OpenAI GPT-4.1 | $2.00 | $8.00 | 1M |
| Anthropic Claude Opus 4.6 | $5.00 | $25.00 | 200K (1M in beta) |
| Anthropic Claude Sonnet 4.6 | $3.00 | $15.00 | 200K (1M in beta) |
| Anthropic Claude Haiku 4.5 | $1.00 | $5.00 | 200K |
| Google Gemini 3.1 Pro | $2.00 | $12.00 | 1M |
| Google Gemini 3 Flash | $0.50 | $3.00 | 1M |
At $0.20/M input and $0.50/M output, Grok 4.1 Fast undercuts OpenAI GPT-5 mini ($0.25/$2.00), Claude Sonnet 4.6 ($3.00/$15.00), and Google Gemini 3 Flash ($0.50/$3.00) on both dimensions. Its 2M token context window is also the largest available, which matters for long-document analysis or agent workflows that accumulate context over many turns.
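To see what these per-token differences mean at scale, the sketch below projects monthly cost for a fixed workload across the cheaper tiers, using the rates from the table above. The workload numbers (1M requests, 3K input / 500 output tokens each) are illustrative.

```python
# Monthly cost projection for a fixed workload across cost-tier models.
# Rates ($/M input, $/M output) are from the comparison table; verify
# current pricing with each provider before relying on these figures.
RATES = {
    "grok-4.1-fast": (0.20, 0.50),
    "gpt-5-mini": (0.25, 2.00),
    "gemini-3-flash": (0.50, 3.00),
    "claude-haiku-4.5": (1.00, 5.00),
}

def monthly_cost(model: str, requests: int, in_tok: int, out_tok: int) -> float:
    """Total $ for `requests` calls of in_tok input / out_tok output tokens."""
    i, o = RATES[model]
    return requests * (in_tok / 1e6 * i + out_tok / 1e6 * o)

# 1M requests/month at 3K input / 500 output tokens per request:
costs = {m: round(monthly_cost(m, 1_000_000, 3_000, 500), 2) for m in RATES}
# Grok 4.1 Fast comes out cheapest at this workload shape.
```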
The maturity gap is real, though. Given that Grok Enterprise only launched in January 2026, its track record for large-scale deployments is shorter than established providers. Grok makes sense for developers who are comfortable working with a newer platform and prioritize cost and context window size over ecosystem depth. If you're evaluating Anthropic specifically, see the full Anthropic Claude pricing breakdown for a direct comparison.
How Can You Reduce Grok API Costs?
The highest-impact cost reduction is replacing full conversation history with a targeted AI memory layer. Most of your token costs come from context. The more history, preferences, and system content you send with each request, the more you pay. Below, I'll explain how memory layers like Mem0 can reduce Grok API costs in practice.
Use a Memory Layer to Cut Token Costs
Every time you call the Grok API, it receives the full conversation history (system prompt, past turns, user preferences, and current query) and you pay for all of that context on every request.
Consider a 20-turn customer support conversation with a 2,000-token system prompt. You're sending approximately 18,000 input tokens per request. At Grok 4.1 Fast pricing, that's $0.0036 per request. Token costs accumulate quickly as conversations grow and as you scale to thousands of users.
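The arithmetic above is easy to model. This sketch reproduces the 20-turn example, assuming an illustrative ~800 tokens of accumulated history per prior turn.

```python
# Full-history cost model at Grok 4.1 Fast input pricing ($0.20/M).
INPUT_RATE = 0.20 / 1_000_000   # $ per input token

def request_cost(system_tokens: int, history_tokens: int) -> float:
    """Input-token cost of one request that resends the full history."""
    return (system_tokens + history_tokens) * INPUT_RATE

# 20-turn support conversation: 2,000-token system prompt plus
# ~800 tokens of accumulated history per prior turn (illustrative).
cost_turn_20 = request_cost(2_000, 20 * 800)  # ≈ $0.0036 per request
```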
Mem0 sits as a memory layer between your application and the Grok API. Instead of passing full conversation history, Mem0 extracts relevant facts from conversations, stores them as embeddings in a vector store, and retrieves only the semantically relevant memories for each new request using similarity search. Your request payload shrinks from the full chat history to a compact set of targeted facts.
Using the same 20-turn example: with Mem0, you send approximately 2,000 tokens per request (system prompt plus retrieved memories). At Grok 4.1 Fast pricing, that's $0.0004 per request, an 89% cost reduction. Mem0's published research reports up to 90% token reduction and 91% latency reduction compared to full-context conversations across their benchmarks.
Real-world deployments have reported results in a similar range: RevisionDojo and OpenNote both reported 40% token cost reductions.
Mem0 works with Grok's OpenAI-compatible API format. The OpenAI Agents SDK integration guide or OpenAI compatibility docs cover setup. Memory retrieval adds about 50ms of latency, which is negligible compared to typical LLM response times.
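The pattern itself is simple to illustrate. The toy sketch below is not Mem0's actual API: it ranks stored facts by crude word overlap where a real memory layer uses embeddings and a vector store, but it shows the shape of "retrieve only relevant facts instead of resending history."

```python
# Toy memory-layer retrieval: store facts, fetch only those relevant
# to the new query. Real systems (e.g. Mem0) use semantic embeddings;
# word overlap here is purely to keep the sketch self-contained.

def score(query: str, fact: str) -> int:
    """Crude relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(fact.lower().split()))

def retrieve(query: str, facts: list[str], k: int = 3) -> list[str]:
    """Top-k facts by overlap with the query, dropping zero-score facts."""
    ranked = sorted(facts, key=lambda f: score(query, f), reverse=True)
    return [f for f in ranked[:k] if score(query, f) > 0]

facts = [
    "User prefers refunds as store credit.",
    "Order A12-77 shipped on March 1.",
    "User's preferred language is German.",
]
context = retrieve("Where is order A12-77?", facts)
# Only matching facts are sent with the request, not the whole history.
```

In production the retrieved facts would be concatenated into the system prompt before the API call, replacing the accumulated turn history.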
Other Ways to Reduce Your Grok API Costs
Beyond the memory layer, several other strategies compound the savings.
Use automatic prompt caching. xAI automatically reduces the cost of repeated API calls when you send identical context or instructions. Grok 4.1 Fast's cached rate is $0.05/M and Grok 4's is $0.75/M. To maximize cache hits, front-load static content (system prompts, few-shot examples, reference documents) and end with dynamic content.
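The front-loading advice above amounts to keeping the message prefix byte-identical across requests. A minimal sketch (the message contents are illustrative):

```python
# Keep the static prefix identical across requests so the provider's
# automatic prompt cache can match it; append dynamic content last.
STATIC_PREFIX = [
    {"role": "system", "content": "You are a support agent for Acme."},
    {"role": "user", "content": "Example: 'Where is my order?' -> ask for the order ID."},
]

def build_messages(user_query: str) -> list[dict]:
    """Identical prefix first, per-user query last, to maximize cache hits."""
    return STATIC_PREFIX + [{"role": "user", "content": user_query}]

a = build_messages("Reset my password")
b = build_messages("Cancel order B9")
# Both requests share an identical cacheable prefix.
```

Anything that varies per request (timestamps, user IDs interpolated into the system prompt) breaks the shared prefix and forfeits the cached rate.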
Use the batch API for non-real-time workloads. The batch API offers 50% off all token types for requests processed asynchronously, typically within 24 hours. Batch requests don't count toward standard rate limits. This is cost-effective for embedding generation, bulk evaluations, data processing, and any task that doesn't require immediate responses.
Default to the right model for each task. Use Grok 4.1 Fast for most workloads, and reach for Grok 4 only when a task requires frontier-tier reasoning. Combining Grok 4.1 Fast's $0.20/M input with Mem0's token reduction reduces per-request costs further.
Control tool usage in agentic workflows. Design prompts to constrain unnecessary tool calls. A prompt like "Answer from your training data unless the user explicitly asks you to search" prevents $5 web search calls on queries that don't need them.
Optimize prompt length. Concise prompts use 30–50% fewer tokens. Remove unnecessary examples, redundant instructions, and verbose explanations from your system prompts.
Set spending limits. Configure daily or monthly spending caps at console.x.ai before production deployment. Hard limits prevent surprise costs when traffic spikes.
Conclusion
Grok 4.1 Fast, at $0.20 per million input tokens with a 2 million token context window, sits below every comparable frontier model on per-token cost and should be the default choice for most developers building at scale. Grok 4, at $3.00 per million input tokens, is better suited for complex reasoning tasks where accuracy matters more than cost. The tradeoff is a smaller developer ecosystem and shorter enterprise track record, which is real and worth weighing against the pricing advantage.
xAI Grok pricing is frequently updated. Check xAI's official model documentation for current rates.
For developers looking to reduce costs, adding a memory layer like Mem0 is the highest-impact change available. Try Mem0's memory layer free with 10,000 memories and 1,000 retrieval calls per month.