    Articles Stock
    AI
    Google AI Releases Gemini 3.1 Pro with 1 Million Token Context and 77.1% ARC-AGI-2 Reasoning for AI Agents

    By Naveed Ahmad | 20/02/2026 | 5 Mins Read


    Google has officially shifted the Gemini era into high gear with the release of Gemini 3.1 Pro, the first model update in the Gemini 3 series. This release is not just a minor patch; it is a targeted strike at the 'agentic' AI market, focusing on reasoning stability, software engineering, and tool-use reliability.

    For developers, this update signals a transition. We are moving from models that merely 'chat' to models that 'work.' Gemini 3.1 Pro is designed to be the core engine for autonomous agents that can navigate file systems, execute code, and reason through scientific problems with a success rate that now rivals, and in some cases exceeds, the industry's most elite frontier models.

    Massive Context, Precise Output

    One of the most immediate technical upgrades is the handling of scale. Gemini 3.1 Pro Preview maintains a massive 1M-token input context window. To put this in perspective for software engineers: you can now feed the model an entire medium-sized code repository, and it will have enough 'memory' to understand the cross-file dependencies without losing the plot.
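
As a rough sanity check before stuffing a repository into the prompt, you can estimate its token footprint with the common ~4 characters-per-token heuristic. This is an approximation, not Gemini's real tokenizer, and the helper names below are illustrative:

```python
import os

# Rough heuristic: ~4 characters per token for English text and code.
# This approximates, but does not replicate, the model's tokenizer.
CHARS_PER_TOKEN = 4
INPUT_TOKEN_LIMIT = 1_000_000  # Gemini 3.1 Pro Preview input window

def estimate_repo_tokens(root: str, exts=(".py", ".md", ".txt")) -> int:
    """Estimate how many input tokens a source tree would consume."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue
    return total_chars // CHARS_PER_TOKEN

def fits_in_context(root: str, reserve_for_prompt: int = 10_000) -> bool:
    """True if the repo plus prompt headroom fits the 1M-token window."""
    return estimate_repo_tokens(root) + reserve_for_prompt <= INPUT_TOKEN_LIMIT
```

For a precise count you would call the API's own token-counting endpoint; this local estimate is only for deciding whether an upload is worth attempting.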

    However, the real news is the 65k-token output limit. This 65k window is a significant jump for developers building long-form generators. Whether you are producing a 100-page technical manual or a complex, multi-module Python application, the model can now finish the job in a single turn without hitting an abrupt 'max token' wall.
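
Assuming the "65k" limit is the usual 65,536 tokens (the article does not give the exact figure), a long-form pipeline can decide up front whether a job still needs chunking. This hypothetical planner shows the ceiling-division logic:

```python
OUTPUT_TOKEN_LIMIT = 65_536  # assumed exact value of the "65k" output window

def plan_generation(expected_output_tokens: int, limit: int = OUTPUT_TOKEN_LIMIT) -> int:
    """Return the number of model turns needed for a long-form generation.

    With the 65k output window, most jobs fit in a single turn; only
    very large outputs still need continuation turns.
    """
    if expected_output_tokens <= 0:
        raise ValueError("expected_output_tokens must be positive")
    # Ceiling division: one turn per `limit` tokens of output.
    return -(-expected_output_tokens // limit)
```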

    Doubling Down on Reasoning

    If Gemini 3.0 was about introducing 'Deep Thinking,' Gemini 3.1 is about making that thinking efficient. The performance jumps on rigorous benchmarks are notable:

    Benchmark                    Score   What it measures
    ARC-AGI-2                    77.1%   Ability to solve entirely new logic patterns
    GPQA Diamond                 94.1%   Graduate-level scientific reasoning
    SciCode                      58.9%   Python programming for scientific computing
    Terminal-Bench Hard          53.8%   Agentic coding and terminal use
    Humanity's Last Exam (HLE)   44.7%   Reasoning at near-human limits

    The 77.1% on ARC-AGI-2 is the headline figure here. The Google team claims this represents more than double the reasoning performance of the original Gemini 3 Pro. This means the model is far less likely to rely on pattern matching from its training data and is more capable of 'figuring it out' when faced with a novel edge case in a dataset.

    https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/

    The Agentic Toolkit: Custom Tools and 'Antigravity'

    Google is making a clear play for the developer's terminal. Alongside the main model, the team launched a specialized endpoint: gemini-3.1-pro-preview-customtools.

    This endpoint is optimized for developers who combine bash commands with custom functions. In previous versions, models often struggled to prioritize which tool to use, sometimes hallucinating a search when a local file read would have sufficed. The customtools variant is specifically tuned to prioritize tools like view_file or search_code, making it a more reliable backbone for autonomous coding agents.
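
The announcement does not show the function-calling wire format, but the local side of such an agent is essentially a dispatch table. A minimal sketch, with view_file and search_code implemented as hypothetical local handlers:

```python
from pathlib import Path

# Local handlers for the tool names the customtools endpoint is tuned to
# prefer. This is only the dispatch side an agent loop would run when the
# model requests a tool; the request/response schema is an assumption.

def view_file(path: str) -> str:
    """Return the contents of a local file."""
    return Path(path).read_text(encoding="utf-8")

def search_code(root: str, needle: str) -> list:
    """Return paths of .py files under `root` whose text contains `needle`."""
    hits = []
    for p in Path(root).rglob("*.py"):
        if needle in p.read_text(encoding="utf-8", errors="ignore"):
            hits.append(str(p))
    return sorted(hits)

TOOLS = {"view_file": view_file, "search_code": search_code}

def dispatch(tool_call: dict):
    """Execute a model-issued tool call shaped like {'name': ..., 'args': {...}}."""
    fn = TOOLS.get(tool_call["name"])
    if fn is None:
        raise KeyError(f"unknown tool: {tool_call['name']}")
    return fn(**tool_call["args"])
```

The point of the customtools tuning is that the model should emit calls into a registry like this (read or grep locally) instead of reaching for a web search.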

    This release also integrates deeply with Google Antigravity, the company's new agentic development platform. Developers can now use a new 'medium' thinking level. This lets you toggle the 'reasoning budget': using high-depth thinking for complex debugging while dropping to medium or low for routine API calls to save on latency and cost.
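
One way to exploit the three budgets is simple keyword routing on the task description. Everything below is an illustrative assumption: the mapping of level names to an API parameter and the task heuristics are not from the announcement:

```python
# Sketch: choose a reasoning budget per task type. Level names follow
# the article; how they map onto the actual API field is an assumption.
THINKING_LEVELS = ("low", "medium", "high")

def pick_thinking_level(task: str) -> str:
    """Heuristic routing: pay for deep thinking only where it helps."""
    heavy = ("debug", "proof", "refactor")     # worth high-depth thinking
    light = ("classify", "extract", "summarize")  # latency-sensitive calls
    task = task.lower()
    if any(keyword in task for keyword in heavy):
        return "high"
    if any(keyword in task for keyword in light):
        return "low"
    return "medium"  # default to the new mid-tier budget
```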

    API Breaking Changes and New File Methods

    For those already building on the Gemini API, there is a small but significant breaking change. In the Interactions API v1beta, the field total_reasoning_tokens has been renamed to total_thought_tokens. This change aligns with the 'thought signatures' introduced in the Gemini 3 family: encrypted representations of the model's internal reasoning that must be passed back to the model to maintain context in multi-turn agentic workflows.
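
A defensive accessor that tolerates both field names makes the migration painless. This shim is a sketch, assuming the usage metadata arrives as a plain dict:

```python
def thought_tokens(usage: dict) -> int:
    """Read the thinking-token count from an Interactions API usage dict.

    Accepts both the new v1beta field name (total_thought_tokens) and
    the old one (total_reasoning_tokens), so callers survive the rename.
    """
    if "total_thought_tokens" in usage:
        return usage["total_thought_tokens"]
    # Fallback for responses recorded before the breaking change.
    return usage.get("total_reasoning_tokens", 0)
```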

    The model's appetite for data has also grown. Key updates to file handling include:

    • 100MB File Limit: The previous 20MB cap for API uploads has been quintupled to 100MB.
    • Direct YouTube Support: You can now pass a YouTube URL directly as a media source. The model 'watches' the video via the URL rather than requiring a manual upload.
    • Cloud Integration: Support for Cloud Storage buckets and private database pre-signed URLs as direct data sources.
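
A client-side guard can classify a source and enforce the new cap before any network call. The URL prefixes and return labels below are illustrative assumptions, not the API's own validation:

```python
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # new 100MB API cap (was 20MB)

def classify_media_source(source: str, size_bytes: int = None) -> str:
    """Classify a media source before sending it to the API.

    Returns 'youtube' for direct YouTube URLs, 'cloud' for Cloud Storage
    URIs, and 'upload' for local files (after checking the 100MB cap).
    """
    if source.startswith(("https://www.youtube.com/", "https://youtu.be/")):
        return "youtube"  # passed as a URL; no upload needed
    if source.startswith("gs://"):
        return "cloud"    # read directly from a Cloud Storage bucket
    if size_bytes is not None and size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(f"file exceeds {MAX_UPLOAD_BYTES} byte upload limit")
    return "upload"
```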

    The Economics of Intelligence

    Pricing for Gemini 3.1 Pro Preview remains aggressive. For prompts under 200k tokens, input costs $2 per 1 million tokens, and output costs $12 per 1 million. For contexts exceeding 200k, the price scales to $4 input and $18 output.
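
Those tiers are easy to encode in a cost estimator. A sketch of the published preview rates, assuming the tier is chosen by input size alone:

```python
# Preview rates from the announcement (USD per 1M tokens); the 200k
# threshold switches between the standard and long-context tiers.
RATES = {
    "standard": {"input": 2.00, "output": 12.00},  # prompt <= 200k tokens
    "long":     {"input": 4.00, "output": 18.00},  # prompt  > 200k tokens
}

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost in USD under the tiered pricing."""
    tier = "long" if input_tokens > 200_000 else "standard"
    rate = RATES[tier]
    cost = (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000
    return round(cost, 6)
```

For example, a 100k-token prompt with a 10k-token answer lands in the standard tier, while pushing the prompt past 200k doubles the input rate.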

    When compared to competitors like Claude Opus 4.6 or GPT-5.2, Google is positioning Gemini 3.1 Pro as the 'efficiency leader.' According to data from Artificial Analysis, Gemini 3.1 Pro now holds the top spot on their Intelligence Index while costing roughly half as much to run as its nearest frontier peers.

    Key Takeaways

    • Massive 1M/65K Context Window: The model maintains a 1M-token input window for large-scale data and repositories, while significantly upgrading the output limit to 65k tokens for long-form code and document generation.
    • A Leap in Logic and Reasoning: Performance on the ARC-AGI-2 benchmark reached 77.1%, representing more than double the reasoning capability of previous versions. It also achieved 94.1% on GPQA Diamond for graduate-level science tasks.
    • Dedicated Agentic Endpoints: Google introduced a specialized gemini-3.1-pro-preview-customtools endpoint. It is specifically optimized to prioritize bash commands and system tools (like view_file and search_code) for more reliable autonomous agents.
    • API Breaking Change: Developers must update their codebases, as the field total_reasoning_tokens has been renamed to total_thought_tokens in the v1beta Interactions API to better align with the model's internal "thought" processing.
    • Enhanced File and Media Handling: The API file size limit has increased from 20MB to 100MB. Additionally, developers can now pass YouTube URLs directly into the prompt, allowing the model to analyze video content without needing to download or re-upload files.

    Check out the technical details and try it here.



