Miscellaneous

Miscellaneous

Grok API Pricing: Every Model, Plan & Cost (May 2026)

Context Engineering for AI Agents: How to Route Queries to Memory

TL;DR

  • Cheapest API model: Grok 4.1 Fast at $0.20/M input / $0.50/M output tokens: one of the lowest rates among frontier-tier APIs right now.

  • New flagship: Grok 4.3 (launched April 30, 2026) at $1.25/M input / $2.50/M output, with a 1M context window.

  • Free credits: xAI has offered developers up to $150/month in free API credits via the data-sharing program: verify current availability in the xAI console, as the program has been subject to changes.

Grok Model Pricing at a Glance

All prices in USD per 1 million tokens. Updated May 2026.

Model

Input $/M

Cached Input $/M

Output $/M

Context Window

Best For

grok-build-0.1

$1.00

$0.20

$2.00

256K tokens

Early-access build model

grok-4.3

$1.25

$0.20

$2.50

1M tokens

General-purpose flagship, max capability

grok-4.20-multi-agent-0309

$1.25

$0.20

$2.50

2M tokens

Multi-agent orchestration, long context

grok-4.20-0309-reasoning

$1.25

$0.20

$2.50

1M tokens

Reasoning-optimized completions

grok-4.20-0309-non-reasoning

$1.25

$0.20

$2.50

1M tokens

Standard completions

Grok 4.1 Fast ⚠️

$0.20

$0.05

$0.50

2M tokens

Cost-optimized production workloads

Grok 4 ⚠️

$3.00

$0.75

$15.00

256K tokens

Always-on reasoning, max capability

Grok 3 ⚠️

$3.00

:

$15.00

131K tokens

Legacy reasoning tasks

Grok 3 Mini ⚠️

$0.30

:

$0.50

131K tokens

Budget reasoning, legacy apps

⚠️ Grok 4.1 Fast, Grok 4, Grok 3, and Grok 3 Mini are not listed on xAI's current public pricing page: they may be legacy or enterprise-only models. Verify availability in the xAI console before building.

Grok API Pricing by Model

Grok 4.3

Grok 4.3 is xAI's current flagship as of April 30, 2026. It's xAI's most capable model as of April 30, 2026, designed for maximum intelligence and production-grade quality.

$1.25/M input tokens, $2.50/M output tokens. Context window: 1M tokens.

Compared to Grok 4 ($3.00/$15.00), Grok 4.3 is 58% cheaper on input and 83% cheaper on output: a significant improvement for the top-tier model. SuperGrok Heavy subscribers get full Grok 4.3 access today; it's rolling out to SuperGrok and X Premium+ in stages.

Grok 4.20 (Three Official Variants)

Grok 4.20 is available in three distinct variants, all priced at $1.25/M input, $2.50/M output, $0.20/M cached input:

  • grok-4.20-multi-agent-0309: 2M context window. Designed for multi-agent orchestration where multiple agents collaborate across very long contexts. The right choice for document analysis pipelines or complex agent loops.

  • grok-4.20-0309-reasoning: 1M context window. Reasoning-optimized variant for tasks that benefit from chain-of-thought and structured problem solving.

  • grok-4.20-0309-non-reasoning: 1M context window. Standard completions without reasoning overhead, best for latency-sensitive workloads.

The cached input rate of $0.20/M (84% discount vs. standard $1.25/M) applies to all three variants: valuable for agent loops where the system prompt stays constant across requests.

Note: logprobs and top_logprobs are not supported on Grok 4.20 and newer models.

Grok 4.1 Fast (Cost-Optimized)

The best price-to-performance ratio in the lineup. $0.20/M input, $0.50/M output. Context window: 2M tokens.

Cached input: $0.05/M: effectively free at scale if your prompts repeat.

Grok 4.1 Fast delivers near-flagship quality at roughly 1/15th the price of Grok 4. For most chatbot, RAG, and agent workloads, this is the right default. The 2M context window means you can fit entire codebases or long document chains in a single request.

Grok 4 and Grok 3

Both models are priced at $3.00/M input and $15.00/M output.

  • Grok 4: 256K context window, always-on reasoning. Use it when Grok 4.1 Fast isn't hitting the quality bar you need.

  • Grok 3: 131K context window. Considered a legacy model at this point: Grok 4.1 Fast outperforms it on most benchmarks at a fraction of the cost.

Grok 4 cached input: $0.75/M tokens.

Note: Grok 4 and Grok 3 may not appear on xAI's current public pricing page: they may be legacy or enterprise-only models. Verify availability in the xAI console before building on these models.

Grok 3 Mini

$0.30/M input, $0.50/M output. Context window: 131K tokens.

Grok 3 Mini outperforms Grok 3 on most benchmarks while costing 90% less. If you're on a legacy Grok 3 integration and want to cut costs without switching to the Grok 4 family, Grok 3 Mini is the move.

Grok Server-Side Tools Pricing

Tool calls are billed on top of standard token costs. A chatbot making frequent web searches will see a meaningfully higher bill than a text-only app.

Tool

Cost

Web Search

$5.00 / 1K calls

X Search

$5.00 / 1K calls

Code Execution

$5.00 / 1K calls

File Attachments

$10.00 / 1K calls

Collections Search

$2.50 / 1K calls

Image/Video Understanding

Token-based

Remote MCP Tools

Token-based

File Attachments at $10/1K calls is the most expensive tool: worth batching or caching where possible. Design workflows to minimize redundant tool calls; cache results and batch related queries together.

Grok Voice and Imagine API Pricing

Voice API

Service

Cost

Voice Agent

$3.00 / hour

TTS (Text-to-Speech)

$15.00 / 1M characters

STT Batch (Speech-to-Text)

$0.10 / hour

STT Streaming

$0.20 / hour

STT Batch at $0.10/hour is the cheapest audio option: use it for transcription pipelines that don't need real-time output.

Imagine API (Image & Video Generation)

Model / Service

Media Input

1K Resolution Output

2K Resolution Output

grok-imagine-image-quality

$0.01 / image

$0.05 / image

$0.07 / image

grok-imagine-image

$0.002 / image

$0.02 / image

$0.02 / image

grok-imagine-video

$0.01/sec + $0.002/img

$0.05 / second (480p)

$0.07 / second (720p)

Use grok-imagine-image for standard image generation at $0.02/image. Upgrade to grok-imagine-image-quality when you need higher fidelity: it costs $0.05/image at 1K or $0.07/image at 2K. Video generation: a 60-second 480p clip costs $3.00; a 720p clip costs $4.20 (plus per-frame media input costs).

Files & Collections Storage Pricing

Relevant for teams using the Collections Search (RAG) tool or storing files via the Files API. Storage costs add up quickly when indexing large document sets.

Resource

Rate

File storage

$0.025 / GiB / day

Collection storage

$0.10 / GiB / day

File downloads

$0.20 / GiB downloaded

Collection downloads

$0.20 / GiB downloaded

Collection storage is 4× more expensive than raw file storage: factor this in when sizing your RAG index. For large corpora, consider chunking and pruning aggressively to keep the indexed collection lean.

Grok Subscription Plans

Free Tier

$0/month. Roughly 10 prompts per 2-hour window. Basic access to Grok 4 and Grok 4.1 inside the X app.

Good for casual use. Not suitable for any production or research workflow.

SuperGrok Lite: $10/month

Launched March 25, 2026. Sits between the free tier and SuperGrok.

Includes: Grok 3.5, Grok Imagine (image + 480p video generation), 1 AI agent, 2× longer chats than the free tier.

Pick this if you want Grok Imagine access without paying $30. Not a developer plan.

SuperGrok: $30/month

$30/month or $300/year (~17% annual discount).

Includes: Full Grok 4 and Grok 4.1 access, 128K context, DeepSearch, Big Brain mode, voice mode, ~100 prompts per 2-hour window. Grok 4.3 is rolling out to this tier in stages.

This is the right plan for individuals who want Grok for daily writing, research, and coding: without building on the API.

SuperGrok Heavy: $300/month

$300/month. The only consumer plan with full Grok 4.3 access today.

Includes: Grok 4 Heavy, 428K context window, 16-agent parallel execution, priority routing.

Worth it only if you need Grok 4 Heavy's extended reasoning or the maximum rate limits. Most developers are better served by the API with Grok 4.1 Fast.

X Premium and X Premium+

  • X Premium: $8/month ($84/year): basic Grok access inside X, verified checkmark, ad revenue sharing. Treat Grok here as a side benefit, not the main product.

  • X Premium+: $40/month ($395/year): priority Grok access, higher throughput, ad-free X. Also receiving Grok 4.3 in stages.

Neither replaces a SuperGrok or API plan if Grok is your primary tool. Pick X Premium+ only if you actually use X's platform features.

Grok Business: $30/user/month

$30/user/month. Team collaboration, centralized billing, workspace admin controls.

Designed for teams that need shared access, usage visibility across users, and admin management. Not a substitute for the API if you're building programmatically.

Grok Enterprise

Custom pricing. Contact xAI sales.

For organizations needing SLAs, dedicated infrastructure, volume discounts, or compliance requirements.

API vs. Subscription: Which Should You Use?

Use Case

Recommended Plan

Building a chatbot or backend integration

API: Grok 4.1 Fast

Multi-turn agent with long context

API: Grok 4.20

Maximum reasoning quality, production

API: Grok 4.3

Individual daily writing and research

SuperGrok ($30/month)

Light image generation, casual use

SuperGrok Lite ($10/month)

Extended reasoning, heavy agent use

SuperGrok Heavy ($300/month)

Team collaboration, shared access

Grok Business ($30/user/month)

Already on X, want basic Grok

X Premium ($8/month)

Enterprise, compliance, SLAs

Grok Enterprise (custom)

Rule of thumb: Use the API if you're building anything programmatic. API access doesn't require an X subscription, and per-token billing scales directly with actual usage. Subscriptions are for individuals who want a chat interface, not developers building on top of the model.

How Does Grok API Pricing Compare to OpenAI and Anthropic?

Prices as of May 2026. All figures in USD per 1M tokens.

Model

Input $/M

Output $/M

Context Window

Grok 4.3

$1.25

$2.50

1M tokens

Grok 4.1 Fast

$0.20

$0.50

2M tokens

GPT-4.1

$2.00

$8.00

1M tokens

Claude Sonnet 4.6

$3.00

$15.00

1M tokens

Gemini 2.5 Pro

$1.25

$10.00

1M tokens

Grok 4.3 is 37.5% cheaper on input and 68.75% cheaper on output than GPT-4.1. Against Claude Sonnet 4.6, Grok 4.3 is 58% cheaper on input and 83% cheaper on output.

Grok 4.1 Fast is in a different category entirely: $0.20/$0.50 puts it below every comparable frontier model on per-token cost, with a 2M context window that neither GPT-4.1 nor Sonnet 4.6 matches.

The tradeoff: xAI has a smaller developer ecosystem, less mature tooling, and fewer third-party integrations than OpenAI or Anthropic. If ecosystem maturity matters for your stack, factor that in.

How to Cut Grok API Costs with Memory

The biggest hidden cost in multi-turn Grok applications isn't the model rate: it's context bloat.

Every time a user sends a new message, a naive implementation re-sends the entire conversation history. A 20-turn conversation with 500 tokens per turn adds 10,000 tokens of context to every single request. At Grok 4.3 rates, that's $0.0125 per request just for the history: before the actual prompt or response.

At scale, this compounds fast. 100,000 daily requests with 10K tokens of stale history = $1,250/day in wasted input tokens.

The fix is replacing full conversation history with compressed memory. Instead of passing every prior message, a memory layer stores what matters: user preferences, key facts, prior decisions: and injects a compact summary. A 10,000-token history becomes 200–500 tokens of structured memory.

Mem0 is a persistent memory layer for Grok that does exactly this. It integrates directly with the Grok API, compresses conversation context automatically, and can reduce token costs by up to 90% on multi-turn workloads.

Building with Grok? Mem0 cuts token costs by up to 90% by replacing full conversation history with compressed memory. Start free: 50 memories, no credit card.

Add a persistent memory layer for Grok

FAQ

What is the cheapest Grok API model?

Grok 4.1 Fast at $0.20/M input and $0.50/M output tokens. With prompt caching enabled, the cached input rate drops to $0.05/M: making it one of the cheapest frontier-adjacent APIs available in 2026. It also carries a 2M token context window, larger than most competitors at any price point.

Does xAI charge for usage guideline violations?

Yes. xAI charges a $0.05 fee per request that violates usage guidelines and is caught before generation in the Responses API. If a violation is caught during generation, standard generation costs still apply in addition to any applicable fee. Design your prompts and system instructions to stay within xAI's usage policies to avoid these charges.

Does xAI offer a free API tier?

xAI has offered up to $150/month in free API credits through its data-sharing program (note: the program has been subject to changes: verify current availability in Settings > Data Sharing in the xAI console). To access credits, enable "Share API Inputs for Model Training." New users may also receive a one-time promotional credit on signup.

What is SuperGrok and is it worth it?

SuperGrok is xAI's standalone subscription for individual users: $30/month or $300/year. It gives full access to Grok 4 and Grok 4.1, 128K context, DeepSearch, Big Brain mode, and voice mode with ~100 prompts per 2-hour window. It's worth it if you're an individual using Grok daily for writing, research, or coding and don't need API access. If you're building an application, use the API instead: per-token billing is more cost-efficient at any meaningful volume.

How does Grok 4.3 pricing compare to GPT-4.1?

Grok 4.3 costs $1.25/M input and $2.50/M output. GPT-4.1 costs $2.00/M input and $8.00/M output. That's a 37.5% input savings and a 68.75% output savings with Grok 4.3. Both models offer a 1M token context window. GPT-4.1 has a more mature developer ecosystem; Grok 4.3 wins on raw per-token cost.

What's the difference between SuperGrok and Grok Business?

SuperGrok ($30/month) is a single-user subscription for personal Grok access via the chat interface. Grok Business ($30/user/month) is a team product with centralized billing, workspace admin controls, and collaborative features. If you're a solo user, SuperGrok is the right pick. If you're managing a team that needs shared Grok access with usage visibility, Grok Business is the right tier.

How do I reduce Grok API costs at scale?

Four main levers:

  1. Use Grok 4.1 Fast for most workloads: it's 15× cheaper than Grok 4 with near-equivalent benchmark performance.

  2. Enable prompt caching: cached input on Grok 4.1 Fast drops to $0.05/M, a 75% discount. Caching is automatic on xAI's API; repeated prompt prefixes are cached without configuration.

  3. Use the Batch API: 20–50% off standard rates for non-real-time workloads (exact discount varies by model).

  4. Replace conversation history with memory: tools like Mem0 compress multi-turn context instead of passing full history, cutting input tokens by up to 90% on agent and chatbot workloads.

Does Grok API support prompt caching?

Yes. Prompt caching is automatic on the Grok API: no configuration needed. Cached token usage is visible in the API response's usage object. The cached input rate for Grok 4.1 Fast is $0.05/M tokens (vs. $0.20/M standard), and for Grok 4.3 it's $0.20/M (vs. $1.25/M standard: an 84% discount), and for all three Grok 4.20 variants it's also $0.20/M (vs. $1.25/M standard: an 84% discount). For grok-build-0.1, the cached input rate is $0.20/M (vs. $1.00/M standard: an 80% discount). This is particularly valuable for applications with consistent system prompts or repeated document context.

Useful Sources

GET TLDR from:

Summarize

Website/Footer

Summarize

Website/Footer

Summarize

Website/Footer

Summarize

Website/Footer