xAI Grok API Pricing: Every Model, Cost, and Context Window Compared (2026)

Every AI provider claims competitive pricing. Grok actually has a case.
Grok 4.1 Fast comes in at $0.20 per million input tokens with a 2 million token context window, cheaper per token than GPT-5 mini, Gemini Flash, and every Anthropic model, with more context than any of them. The catch is that xAI is the newest platform in this comparison, with the smallest developer ecosystem to show for it.
For individual use, SuperGrok runs $30/month, $10 more than ChatGPT Plus or Claude Pro, but with access to Grok 4 and the full 2M context window. Teams pay $30/seat/month for Grok Business.
This article covers the full API and subscription pricing, the server-side tool costs most developers miss, a competitive comparison across OpenAI, Anthropic, and Google, and strategies to cut token costs in production.
All pricing verified as of March 3, 2026.
TLDR
Grok 4.1 Fast costs $0.20/M input tokens and $0.50/M output tokens. Grok 4 costs $3.00/M input tokens and $15.00/M output tokens.
Grok 4.1 Fast has a 2-million-token context window, the largest available across frontier models.
SuperGrok is $30/month for individual users. Grok Business starts at $30/seat/month with team collaboration features, or $300/seat/month for the SuperGrok Heavy tier.
Memory layers like Mem0 reduce token costs by retrieving only the relevant context via semantic search rather than resending the full conversation history on every request. Mem0's published figures indicate a reduction of up to 90%.
Prompt caching is automatic and discounts repeated prompt prefixes (cached input drops to $0.05/M on Grok 4.1 Fast). The batch API saves 50% on non-real-time workloads.
Grok Model Pricing Comparison
| Dimension | Grok 4 | Grok 4.1 Fast | Grok 3 | Grok 3 Mini |
|---|---|---|---|---|
| Input cost/M tokens | $3.00 | $0.20 | $3.00 | $0.30 |
| Output cost/M tokens | $15.00 | $0.50 | $15.00 | $0.50 |
| Context window (tokens) | 256,000 | 2,000,000 | 131,072 | 131,072 |
| Reasoning | Always on | Reasoning and non-reasoning variants | N/A | Reasoning |
| Best for | Complex multi-step reasoning, coding with tool use, tasks where accuracy matters | Most workloads, long documents, large codebases, extended agent workflows | Legacy flagship model | Legacy cost-efficient model |
Why Does Grok Exist?
xAI launched Grok in November 2023, roughly six months after the company was founded. Elon Musk built xAI partly as a response to what he described as ideological drift at OpenAI (where he was an early backer) and partly to take advantage of direct distribution through X (formerly Twitter). The original Grok had real-time access to X posts, a capability the other frontier labs couldn't replicate, and the integration with X's user base gave xAI a fast path to consumer adoption without building a ChatGPT-style product from scratch.
Since then, xAI has moved quickly up the model capability rankings. Grok 3, released in February 2025, was competitive with GPT-4o and Claude 3.5 on standard benchmarks. Grok 4, released in mid-2025, added always-on reasoning. The Grok 4 Fast line, and now Grok 4.1 Fast, represents xAI's attempt to combine competitive capability with aggressive token pricing.
What Does xAI Grok Cost?
Grok 4.1 Fast costs $0.20 per million input tokens and $0.50 per million output tokens via the API. Grok 4 pricing is $3.00 per million input and $15.00 per million output. Subscription plans include a free tier with limited daily messages, SuperGrok at $30/month with Grok 4 access, and Grok Business starting at $30/seat/month for teams.
You can also access Grok through X platform plans. X Premium ($8/month) and X Premium+ ($40/month) bundle Grok access with X social features like the blue checkmark, ad revenue sharing, and ad-free browsing. These are separate products and not a substitute for a SuperGrok or API plan.
API (pay-per-token): Grok 4.1 Fast at $0.20/M input, $0.50/M output. Grok 4 at $3.00/M input, $15.00/M output.
Subscriptions: Free, SuperGrok ($30/month), Grok Business ($30/seat/month or $300/seat/month for SuperGrok Heavy), Grok Enterprise (contact sales)
X plans: X Premium at $8/month, X Premium+ at $40/month. Grok model access is bundled with X platform features.
Additional costs: Server-side tools like web search ($5/1K calls) and code execution ($5/1K calls)
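For pay-per-token use, xAI exposes an OpenAI-compatible chat completions endpoint. The sketch below builds a request with the standard library only; the base URL follows xAI's public docs, and the model name `grok-4-fast` is an assumption you should verify against the current model list before use.

```python
import json
import os
import urllib.request

# Endpoint per xAI's OpenAI-compatible API docs; model name is an
# assumption -- check docs.x.ai for the current identifier.
API_URL = "https://api.x.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "grok-4-fast") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('XAI_API_KEY', '')}",
        },
    )

req = build_request("Summarize Grok API pricing in one sentence.")
# Only send when a key is actually configured:
if os.environ.get("XAI_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
```

Because the format matches OpenAI's, the official OpenAI SDK also works by pointing `base_url` at xAI's endpoint.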
How Much Does the Grok API Cost per Model?
Grok offers Grok 4 for frontier reasoning workloads, and Grok 4 Fast and Grok 4.1 Fast for cost-efficient high-volume use. The previous generation (Grok 3 and Grok 3 Mini) remains available but is no longer xAI's primary focus.
Note: Grok 4.2 is currently in public beta. Check xAI's documentation for the latest updates.
Grok 4 Pricing
Input: $3.00/M tokens
Cached input tokens: $0.75/M tokens
Output: $15.00/M tokens
Context window: 256,000 tokens
Reasoning: Always on, no reasoning_effort parameter
Live search: $25.00/1K sources
When to use: You need frontier-tier performance for complex multi-step reasoning, coding with tool use, or tasks where accuracy matters more than cost or speed.
Grok 4.1 Fast and Grok 4 Fast Pricing
Input: $0.20/M tokens
Cached input tokens: $0.05/M tokens
Output: $0.50/M tokens
Context window: 2,000,000 tokens
Variants: Grok 4.1 Fast is available in reasoning and non-reasoning versions. Grok 4 Fast is non-reasoning only.
Live search: $25.00/1K sources
When to use: Grok 4.1 Fast is the model most developers should default to, especially for long documents, large codebases, or extended agent workflows. According to xAI's release notes, Grok 4 Fast uses 40% fewer thinking tokens on average compared to Grok 4, with comparable benchmark performance on MATH-500 and HumanEval.
Grok 3 and Grok 3 Mini Pricing
Grok 3 and Grok 3 Mini are the legacy models, previously the flagship generation. They remain available via the API, but the Grok 4 family is now xAI's primary focus.
Grok 3: $3.00/M input, $15.00/M output, 131,072 token context window
Grok 3 Mini: $0.30/M input, $0.50/M output, 131,072 token context window (it outperforms Grok 3 on benchmarks at 90% lower cost)
What Do Grok's Server-Side Tools Cost?
Beyond token costs, xAI charges a per-call fee whenever Grok invokes a built-in tool: web search, code execution, or file analysis. These are billed separately on top of your token costs.
Because Grok's agent decides how many tools to call per query, costs vary with query complexity and can be difficult to predict upfront. A web research query might trigger 3–5 search calls, adding $0.015–$0.025 per query in tool fees alone.
Below is the pricing for tool invocations (per 1,000 successful calls):
| Tool | Description | Cost / 1K Calls |
|---|---|---|
| Web Search | Search the internet and browse web pages | $5 |
| X Search | Search X posts, user profiles, and threads | $5 |
| Code Execution | Run Python code in a sandboxed environment | $5 |
| File Attachments | Search through files attached to messages | $10 |
| Collections Search | Query your uploaded document collections (RAG) | $2.50 |
| Image Understanding | Analyze images found during Web Search and X Search | Token-based |
| X Video Understanding | Analyze videos found during X Search | Token-based |
| Remote MCP Tools | Connect and use custom MCP tool servers | Token-based |
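The per-query math is easy to sketch. The estimator below combines the token rates and the per-call tool rates from the tables in this article; the example query sizes are illustrative, not measured.

```python
# Back-of-envelope estimator for one agentic query's cost, combining
# Grok 4.1 Fast token rates with per-call server-side tool fees.
TOOL_RATE_PER_CALL = {          # $ per single call (table rate / 1,000)
    "web_search": 5.00 / 1000,
    "x_search": 5.00 / 1000,
    "code_execution": 5.00 / 1000,
    "file_attachments": 10.00 / 1000,
    "collections_search": 2.50 / 1000,
}

def query_cost(input_tokens: int, output_tokens: int,
               tool_calls: dict[str, int],
               input_rate: float = 0.20, output_rate: float = 0.50) -> float:
    """Total $ cost for one query: tokens plus tool invocation fees."""
    tokens = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
    tools = sum(TOOL_RATE_PER_CALL[t] * n for t, n in tool_calls.items())
    return tokens + tools

# A research query with 4 web searches: tool fees dominate token cost.
cost = query_cost(8_000, 1_000, {"web_search": 4})  # ≈ $0.0221
```

Note that the $0.02 of search fees here is roughly ten times the token cost, which is why controlling tool usage matters at scale.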
Custom functions (function calling) let you define your own tools that Grok can invoke. Since the function logic runs on your infrastructure, xAI charges only the token cost for the model deciding to call it. No per-invocation fee.
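Custom function definitions follow the OpenAI "tools" schema. The sketch below shows the shape of one such definition; the function name, fields, and the `grok-4-fast` model identifier are illustrative assumptions, not from xAI's docs.

```python
# OpenAI-style function (tool) definition. Since the function body runs
# on your own infrastructure, xAI bills only the tokens the model spends
# deciding to call it -- no per-invocation fee.
get_order_status = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of an order by ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order identifier"},
            },
            "required": ["order_id"],
        },
    },
}

# Passed in the request body alongside the messages:
request_body = {
    "model": "grok-4-fast",  # model name: verify against xAI docs
    "messages": [{"role": "user", "content": "Where is order A12-77?"}],
    "tools": [get_order_status],
}
```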
How Do Grok Subscription Plans Compare?
If you want to use AI for writing, research, or coding without tracking per-token costs, Grok subscription plans give you model access within usage limits. Grok offers four tiers: free, SuperGrok, Grok Business, and Grok Enterprise. X Premium and X Premium+ are separate plans that include Grok alongside X platform features.
Free Tier
Price: $0
Features: Grok 4.1 Fast access, limited daily messages, and rate-limited responses
The free tier is good for testing the model before committing to a paid plan.
SuperGrok
Price: $30/month or $300/year
Features: Grok 4 and Grok 4.1 access with higher rate limits, DeepSearch (extended research mode for complex queries), Big Brain Mode (longer reasoning chains for multi-step problems), priority routing, expanded image and video generation via Imagine 1.0, longer voice mode and companion chats
SuperGrok costs $10 more per month than ChatGPT Plus ($20/month) and Claude Pro ($20/month). Google offers Google AI Plus at $7.99/month and Google AI Pro at $19.99/month.
Grok Business
Price: $30/seat/month (SuperGrok) or $300/seat/month (SuperGrok Heavy with Grok 4 Heavy access)
Features: Grok 3, Grok 4, and Grok 4.1 for SuperGrok. Grok 4 Heavy for SuperGrok Heavy. Team collaboration features, SOC 2 compliance, and no training on data.
Grok Business at $30/seat/month matches ChatGPT Team ($30/seat/month) but is higher than Claude Team Standard ($25/seat/month) and Google Gemini Business Edition ($21/seat/month). The SuperGrok Heavy tier at $300/seat targets researchers and teams doing intensive multi-step reasoning daily.
X Premium and X Premium+
X Premium: $8/month or $84/year, with increased Grok usage limits and X features
X Premium+: $40/month or $395/year, with higher Grok access and an ad-free X experience
X plans make sense if you want both AI model access and the X social platform's features. If you only need AI for daily use, SuperGrok is the cheaper option.
Who Should Use Grok's API vs. a Subscription?
The right choice depends on whether you're building applications that need programmatic access or looking for daily AI assistance.
Use the API if you're building chatbots, agents, backend integrations, or applications that need programmatic access. API access doesn't require an X subscription, and per-token billing scales directly with actual usage.
Use SuperGrok if you're an individual who wants Grok 4 access for daily tasks like writing and research, with higher limits and no need to build on top of the model.
Use Grok Business if you're managing a team and need collaborative features. Choose the $300/seat/month SuperGrok Heavy tier only if your team needs Grok 4 Heavy's extended reasoning capabilities.
How Does Grok API Pricing Compare to OpenAI, Anthropic, and Google?
Grok positions itself with competitive per-token pricing and the largest context window currently available. Its tradeoff is ecosystem maturity: smaller developer community, less documentation, and fewer third-party integrations than OpenAI or Anthropic.
The table below shows current pricing for frontier models as of March 3, 2026. Model names and prices change frequently. Verify against each provider's official documentation before making architectural decisions.
| Model | Input (/1M tokens) | Output (/1M tokens) | Context window (tokens) |
|---|---|---|---|
| Grok 4 | $3.00 | $15.00 | 256K |
| Grok 4.1 Fast | $0.20 | $0.50 | 2M |
| OpenAI GPT-5.2 | $1.75 | $14.00 | 400K |
| OpenAI GPT-5 mini | $0.25 | $2.00 | 400K |
| OpenAI GPT-4.1 | $2.00 | $8.00 | 1M |
| Anthropic Claude Opus 4.6 | $5.00 | $25.00 | 200K (1M in beta) |
| Anthropic Claude Sonnet 4.6 | $3.00 | $15.00 | 200K (1M in beta) |
| Anthropic Claude Haiku 4.5 | $1.00 | $5.00 | 200K |
| Google Gemini 3.1 Pro | $2.00 | $12.00 | 1M |
| Google Gemini 3 Flash | $0.50 | $3.00 | 1M |
At $0.20/M input and $0.50/M output, Grok 4.1 Fast undercuts OpenAI GPT-5 mini ($0.25/$2.00), Claude Sonnet 4.6 ($3.00/$15.00), and Google Gemini 3 Flash ($0.50/$3.00) on both dimensions. Its 2M token context window is also the largest available, which matters for long-document analysis or agent workflows that accumulate context over many turns.
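To see what these per-token differences mean at scale, the sketch below projects monthly cost for a fixed workload across the cheaper tiers, using the rates from the table above. The workload numbers (1M requests, 3K input / 500 output tokens each) are illustrative.

```python
# Monthly cost projection for a fixed workload across cost-tier models.
# Rates ($/M input, $/M output) are from the comparison table; verify
# current pricing with each provider before relying on these figures.
RATES = {
    "grok-4.1-fast": (0.20, 0.50),
    "gpt-5-mini": (0.25, 2.00),
    "gemini-3-flash": (0.50, 3.00),
    "claude-haiku-4.5": (1.00, 5.00),
}

def monthly_cost(model: str, requests: int, in_tok: int, out_tok: int) -> float:
    """Total $ for `requests` calls of in_tok input / out_tok output tokens."""
    i, o = RATES[model]
    return requests * (in_tok / 1e6 * i + out_tok / 1e6 * o)

# 1M requests/month at 3K input / 500 output tokens per request:
costs = {m: round(monthly_cost(m, 1_000_000, 3_000, 500), 2) for m in RATES}
# Grok 4.1 Fast comes out cheapest at this workload shape.
```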
The maturity gap is real, though. Given that Grok Enterprise only launched in January 2026, its track record for large-scale deployments is shorter than established providers. Grok makes sense for developers who are comfortable working with a newer platform and prioritize cost and context window size over ecosystem depth. If you're evaluating Anthropic specifically, see the full Anthropic Claude pricing breakdown for a direct comparison.
How Can You Reduce Grok API Costs?
The highest-impact cost reduction is replacing full conversation history with a targeted AI memory layer. Most of your token costs come from context. The more history, preferences, and system content you send with each request, the more you pay. Below, I'll explain how memory layers like Mem0 can reduce Grok API costs in practice.
Use a Memory Layer to Cut Token Costs
Every time you call the Grok API, it receives the full conversation history (system prompt, past turns, user preferences, and current query) and you pay for all of that context on every request.
Consider a 20-turn customer support conversation with a 2,000-token system prompt. You're sending approximately 18,000 input tokens per request. At Grok 4.1 Fast pricing, that's $0.0036 per request. Token costs accumulate quickly as conversations grow and as you scale to thousands of users.
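The arithmetic above is easy to model. This sketch reproduces the 20-turn example, assuming an illustrative ~800 tokens of accumulated history per prior turn.

```python
# Full-history cost model at Grok 4.1 Fast input pricing ($0.20/M).
INPUT_RATE = 0.20 / 1_000_000   # $ per input token

def request_cost(system_tokens: int, history_tokens: int) -> float:
    """Input-token cost of one request that resends the full history."""
    return (system_tokens + history_tokens) * INPUT_RATE

# 20-turn support conversation: 2,000-token system prompt plus
# ~800 tokens of accumulated history per prior turn (illustrative).
cost_turn_20 = request_cost(2_000, 20 * 800)  # ≈ $0.0036 per request
```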
Mem0 sits as a memory layer between your application and the Grok API. Instead of passing full conversation history, Mem0 extracts relevant facts from conversations, stores them as embeddings in a vector store, and retrieves only the semantically relevant memories for each new request using similarity search. Your request payload shrinks from the full chat history to a compact set of targeted facts.
Using the same 20-turn example: with Mem0, you send approximately 2,000 tokens per request (system prompt plus retrieved memories). At Grok 4.1 Fast pricing, that's $0.0004 per request, an 89% cost reduction. Mem0's published research reports up to 90% token reduction and 91% latency reduction compared to full-context conversations across their benchmarks.
Real-world deployments have reported results in a similar range: RevisionDojo and OpenNote both reported 40% token cost reductions.
Mem0 works with Grok's OpenAI-compatible API format. The OpenAI Agents SDK integration guide or OpenAI compatibility docs cover setup. Memory retrieval adds about 50ms of latency, which is negligible compared to typical LLM response times.
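The pattern itself is simple to illustrate. The toy sketch below is not Mem0's actual API: it ranks stored facts by crude word overlap where a real memory layer uses embeddings and a vector store, but it shows the shape of "retrieve only relevant facts instead of resending history."

```python
# Toy memory-layer retrieval: store facts, fetch only those relevant
# to the new query. Real systems (e.g. Mem0) use semantic embeddings;
# word overlap here is purely to keep the sketch self-contained.

def score(query: str, fact: str) -> int:
    """Crude relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(fact.lower().split()))

def retrieve(query: str, facts: list[str], k: int = 3) -> list[str]:
    """Top-k facts by overlap with the query, dropping zero-score facts."""
    ranked = sorted(facts, key=lambda f: score(query, f), reverse=True)
    return [f for f in ranked[:k] if score(query, f) > 0]

facts = [
    "User prefers refunds as store credit.",
    "Order A12-77 shipped on March 1.",
    "User's preferred language is German.",
]
context = retrieve("Where is order A12-77?", facts)
# Only matching facts are sent with the request, not the whole history.
```

In production the retrieved facts would be concatenated into the system prompt before the API call, replacing the accumulated turn history.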
Other Ways to Reduce Your Grok API Costs
Beyond the memory layer, several other strategies compound the savings.
Use automatic prompt caching. xAI automatically reduces the cost of repeated API calls when you send identical context or instructions. Grok 4.1 Fast's cached rate is $0.05/M and Grok 4's is $0.75/M. To maximize cache hits, front-load static content (system prompts, few-shot examples, reference documents) and end with dynamic content.
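The front-loading advice above amounts to keeping the message prefix byte-identical across requests. A minimal sketch (the message contents are illustrative):

```python
# Keep the static prefix identical across requests so the provider's
# automatic prompt cache can match it; append dynamic content last.
STATIC_PREFIX = [
    {"role": "system", "content": "You are a support agent for Acme."},
    {"role": "user", "content": "Example: 'Where is my order?' -> ask for the order ID."},
]

def build_messages(user_query: str) -> list[dict]:
    """Identical prefix first, per-user query last, to maximize cache hits."""
    return STATIC_PREFIX + [{"role": "user", "content": user_query}]

a = build_messages("Reset my password")
b = build_messages("Cancel order B9")
# Both requests share an identical cacheable prefix.
```

Anything that varies per request (timestamps, user IDs interpolated into the system prompt) breaks the shared prefix and forfeits the cached rate.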
Use the batch API for non-real-time workloads. The batch API offers 50% off all token types for requests processed asynchronously, typically within 24 hours. Batch requests don't count toward standard rate limits. This is cost-effective for embedding generation, bulk evaluations, data processing, and any task that doesn't require immediate responses.
Default to the right model for each task. Use Grok 4.1 Fast for most workloads, and reach for Grok 4 only when a task requires frontier-tier reasoning. Combining Grok 4.1 Fast's $0.20/M input with Mem0's token reduction reduces per-request costs further.
Control tool usage in agentic workflows. Design prompts to constrain unnecessary tool calls. A prompt like "Answer from your training data unless the user explicitly asks you to search" prevents $5 web search calls on queries that don't need them.
Optimize prompt length. Concise prompts use 30–50% fewer tokens. Remove unnecessary examples, redundant instructions, and verbose explanations from your system prompts.
Set spending limits. Configure daily or monthly spending caps at console.x.ai before production deployment. Hard limits prevent surprise costs when traffic spikes.
Conclusion
Grok 4.1 Fast, at $0.20 per million input tokens with a 2 million token context window, sits below every comparable frontier model on per-token cost and should be the default choice for most developers building at scale. Grok 4, at $3.00 per million input tokens, is better suited for complex reasoning tasks where accuracy matters more than cost. The tradeoff is a smaller developer ecosystem and shorter enterprise track record, which is real and worth weighing against the pricing advantage.
xAI Grok pricing is frequently updated. Check xAI's official model documentation for current rates.
For developers looking to reduce costs, adding a memory layer like Mem0 is the highest-impact change available. Try Mem0's memory layer free with 10,000 memories and 1,000 retrieval calls per month.