Gemini 3 Pro is presented as Google's most "intelligent" model to date, and the company openly positions it as the industry leader in many respects. Independent evaluations support that claim.
According to Artificial Analysis, the model has taken the top spot in their composite intelligence index.
AI Index from Artificial Analysis. Data: Artificial Analysis.
If Artificial Analysis's tests are to be believed, Google has pulled ahead of its competitors on intelligence-heavy tasks: reasoning, understanding complex structures, accuracy, and multimodality.
The performance on deep-analysis tasks deserves special attention. On Humanity's Last Exam, which assesses a model's ability to solve doctoral-level problems without tools, Gemini 3 Pro scored over 37%.
This is more than ten percentage points above the previous record. On ARC-AGI-2, one of the most challenging benchmarks, which tests the ability to infer rules and apply them to new situations, the model also scored above most competitors.
Results of ten specialised tests from Artificial Analysis. Data: Artificial Analysis.
Google also stressed the strong results on math tests. On MathArena Apex, where extremely difficult problems traditionally throw models off, Gemini 3 Pro scored 23.4%. Previously this level was out of reach: the best results from other systems did not exceed 5.2%.
MathArena Apex test results. Data: MathArena.
In multimodal tests, the updated Gemini also takes the top spots. Experts attribute this directly to the model's presumably large scale.
This hypothesis would explain the ability of Google’s AI to outperform products from other companies in tasks involving visual analysis and spatial understanding.
A comparison with Claude and ChatGPT is worth noting separately. On the SWE-Bench Verified benchmark, which tests the ability to autonomously resolve real GitHub issues, the new model trails Sonnet 4.5 by only about one percentage point. On other metrics, Gemini often comes out ahead.
Comparative test results from different AI models. Data: Google.
Another important data point is the model's speed. Artificial Analysis notes that Gemini 3 Pro generates about 128 tokens per second, faster than GPT-5.1, Kimi K2 Thinking, and Grok 4.
This is most likely due to Google's own hardware platform built on its Tensor Processing Units (TPUs).
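For context, output speed in such comparisons boils down to generated tokens divided by wall-clock generation time. The short Python sketch below illustrates that calculation; generate_text and its output_token_count field are hypothetical placeholders standing in for a real API client, not part of any specific SDK.

import time
from dataclasses import dataclass

@dataclass
class GenerationResult:
    text: str
    output_token_count: int  # number of tokens the model produced

def generate_text(prompt: str) -> GenerationResult:
    # Hypothetical stand-in for a real model call; here it only simulates
    # half a second of work and a 64-token answer.
    time.sleep(0.5)
    return GenerationResult(text="...", output_token_count=64)

def tokens_per_second(prompt: str) -> float:
    # Output tokens divided by wall-clock time for one (non-streaming) request.
    start = time.perf_counter()
    result = generate_text(prompt)
    elapsed = time.perf_counter() - start
    return result.output_token_count / elapsed

print(f"{tokens_per_second('Explain TPUs briefly.'):.1f} tokens/s")

With the simulated values (64 tokens in about half a second), the sketch prints roughly 128 tokens per second, the same order of magnitude as the figure Artificial Analysis reports.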
Overall, on a number of metrics the model competes confidently with existing flagships and in many cases surpasses them. At the same time, it lags behind competitors on some tests, though usually only slightly.

Gemini 3 Pro technical data. Data: Google.
Description of the new features in Gemini 3 Pro. Data: Google.
Vending-Bench 2 test. Data: Google.


