Back to Dashboard

xAIGrok 4.20VSxAIGrok 4.3

Analysis by:the whichllmmodel Editorial Team|Updated: June 2026

Our Take

Standardized reasoning and coding benchmarks are currently pending verification for one or both of these models. We recommend testing Grok 4.20 and Grok 4.3 directly in their respective provider playgrounds to see which fits your specific prompt styles best.

Was this recommendation helpful?

Model Specs

Grok 4.20

Benchmarks & Scores

Coding (swe-bench-pro)

51.8%

complex codebases, multi-file repositories, and architectural planning

Reasoning (gpqa-diamond)

90%

graduate-level science QA

Cost & Context

Cost (per 1M tokens)

$3.00Input: $2.00 | Output: $6.00

Context Window

1.05M tokens

Model Specs

Grok 4.3

Benchmarks & Scores

Coding (swe-bench-pro)

N/A

complex codebases, multi-file repositories, and architectural planning

Reasoning (gpqa-diamond)Winner (+0.1%)

90.1%

graduate-level science QA

Cost & Context

Cost (per 1M tokens)1.9x cheaper

$1.56Input: $1.25 | Output: $2.50

Context Window

1.05M tokens

Read our data collection methodology

Frequently Asked Questions about Grok 4.20 vs Grok 4.3

Grok 4.3 is cheaper than Grok 4.20. Grok 4.3 has a blended cost of $1.56/1M tokens, which is about 1.9x cheaper than Grok 4.20 at $3.00/1M tokens.

Related Matchups

Explore similar comparisons for Grok 4.20 and Grok 4.3.

Browse More Comparisons

OpenAIGPT-5.6 Sol

OpenAIGPT-5.6 Terra

Do you want to find a model for your constraints?

Use our interactive model finder to filter LLMs by reasoning capability, coding performance, cost, and context length.

Open Model Finder