Google Gemini 3.1 Pro first impressions: a 'Deep Think Mini' with adjustable reasoning on demand
For the past three months, Google's Gemini 3 Pro has held its ground as one of the most capable frontier models available.
But in the fast-moving world of AI, three months is a lifetime — and competitors have not been standing still.
Earlier today, Google released Gemini 3.1 Pro, an update that brings a key innovation to the company's workhorse power model: three levels of adjustable thinking that effectively turn it into a lightweight version of Google's specialized Deep Think reasoning system.
The release marks the first time Google has issued a "point one" update to a Gemini model, signaling a shift in the company's release strategy from periodic full-version launches to more frequent incremental upgrades.
More importantly for enterprise AI teams evaluating their model stack, 3.1 Pro's new three-tier thinking system — low, medium, and high — gives developers and IT leaders a single model that can scale its reasoning effort dynamically, from quick responses for routine queries up to multi-minute deep reasoning sessions for complex problems.
The model is rolling out now in preview across the Gemini API via Google AI Studio, Gemini CLI, Google's agentic development platform Antigravity, Vertex AI, Gemini Enterprise, Android Studio, the consumer Gemini app, and NotebookLM.
The 'Deep Think Mini' effect: adjustable reasoning on demand
The most consequential feature in Gemini 3.1 Pro is not a single benchmark number. It is the introduction of a three-tier thinking level system that gives users fine-grained control over how much computational effort the model invests in each response.
Gemini 3 Pro offered only two thinking modes: low and high.
The new 3.1 Pro adds a medium setting (similar to the previous high) and, critically, overhauls what "high" means.
When set to high, 3.1 Pro behaves as a "mini version of Gemini Deep Think," the company's specialized reasoning model that was updated just last week.
The implication for enterprise deployment could be significant.
Rather than routing requests to different specialized models based on task complexity — a common but operationally burdensome pattern — organizations can now use a single model endpoint and adjust reasoning depth based on the task at hand.
Routine document summarization can run on low thinking with fast response times, while complex analytical tasks can be elevated to high thinking for Deep Think–caliber reasoning.
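The single-endpoint pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google's API: the task categories, the `LEVEL_BY_TASK` mapping, the `MODEL_ID` placeholder, and the shape of the request dictionary are all hypothetical. Only the three level names (low, medium, high) come from the release itself.

```python
# Hypothetical sketch: these names are illustrative and not part of any Google SDK.
THINKING_LEVELS = ("low", "medium", "high")  # the three tiers named in the release

# Assumed task categories and the reasoning depth each one gets.
LEVEL_BY_TASK = {
    "summarization": "low",   # routine work, fast responses
    "drafting": "medium",
    "analysis": "high",       # complex problems, deepest reasoning
}

MODEL_ID = "MODEL_ID_PLACEHOLDER"  # substitute the identifier from the provider docs


def pick_thinking_level(task_kind: str, default: str = "medium") -> str:
    """Return the thinking level for a task kind, falling back to a default."""
    level = LEVEL_BY_TASK.get(task_kind, default)
    if level not in THINKING_LEVELS:
        raise ValueError(f"unknown thinking level: {level!r}")
    return level


def build_request(prompt: str, task_kind: str) -> dict:
    """Describe one request: same model every time, only the level varies."""
    return {
        "model": MODEL_ID,
        "thinking_level": pick_thinking_level(task_kind),
        "prompt": prompt,
    }


if __name__ == "__main__":
    for kind in ("summarization", "analysis", "translation"):
        request = build_request("example prompt", kind)
        print(kind, "->", request["thinking_level"])
```

The point of the design is that the routing decision collapses into a single parameter: the model identifier stays constant, and unknown task kinds fall back to the middle tier rather than failing.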
Benchmark Performance: More Than Doubling Reasoning Over 3 Pro
Google's published benchmarks tell a story of dramatic improvement, particularly in areas associated with reasoning and agentic capability.
On ARC-AGI-2, a benchmark that evaluates a model's ability to solve novel abstract reasoning patterns, 3.1 Pro scored 77.1%, more than double the 31.1% achieved by Gemini 3 Pro and substantially ahead of Anthropic's Sonnet 4.6 (58.3%) and Opus 4.6 (68.8%).
This result also eclipses OpenAI's GPT-5.2 (52.9%).
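The "more than double" claim is easy to check from the ARC-AGI-2 figures quoted above. The short Python sketch below uses only those published scores; the dictionary layout and the `leads_over` helper are illustrative names, not anything from Google's materials.

```python
# ARC-AGI-2 scores (percent) as quoted in the article.
ARC_AGI_2 = {
    "Gemini 3.1 Pro": 77.1,
    "Gemini 3 Pro": 31.1,
    "Sonnet 4.6": 58.3,
    "Opus 4.6": 68.8,
    "GPT-5.2": 52.9,
}


def leads_over(scores: dict, subject: str) -> dict:
    """Return the subject's lead over every other model, in percentage points."""
    base = scores[subject]
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != subject
    }


# Ratio of the new model's score to its predecessor's.
ratio = round(ARC_AGI_2["Gemini 3.1 Pro"] / ARC_AGI_2["Gemini 3 Pro"], 2)
leads = leads_over(ARC_AGI_2, "Gemini 3.1 Pro")

if __name__ == "__main__":
    print(f"3.1 Pro vs 3 Pro: {ratio}x")
    for name, points in leads.items():
        print(f"lead over {name}: {points} points")
```

The ratio comes out to roughly 2.48, and the narrowest lead is 8.3 percentage points over Opus 4.6.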
The gains extend across the board.
On Humanity's Last Exam, a rigorous academic reasoning benchmark, 3.1 Pro achieved 44.4% without tools, up from 37.5% for 3 Pro and ahead of both Claude Sonnet 4.6 (33.2%) and Opus 4.6 (40.0%).