Computer Use now built into Gemini 3.5 Flash: Google’s AI takes control of your browser, apps, and desktop

Google has fully integrated Computer Use into Gemini 3.5 Flash. This allows the model to interact directly with browsers, mobile interfaces, and desktop applications: it can click, tap, scroll, and carry out multi-step tasks. Previously, this capability lived in a separate model. Starting June 24, 2026, it runs natively within the standard Flash model. For developers, this means a single model that can see, think, and act.

Summary of key points

Starting June 24, 2026, Computer Use will be an integral part of Gemini 3.5 Flash and will no longer be a separate model.
The model controls browsers, smartphone interfaces, and desktop applications through screenshots and simulated input.
On the OSWorld-Verified benchmark, the model scores 78.4%, almost on par with GPT-5.5 (78.7%).
Two optional security systems are designed to thwart prompt injection attacks and protect critical actions.
It is available through the Gemini API and the Gemini Enterprise Agent Platform, alongside a demo environment and reference code.

What Computer Use in Gemini 3.5 Flash can actually do

Until now, Computer Use at Google was a specialized model based on Gemini 2.5. Although it could interact with user interfaces, it could not access other tools such as Google Search or Maps grounding at the same time. That separation has now been eliminated. In Gemini 3.5 Flash, screen control is one of the built-in tools, alongside function calling and the familiar Search and Maps integration.

In practice, it works like this: the model receives a screenshot of the current interface, recognizes buttons, text fields, and menus, and decides what to do next. It then clicks buttons, fills out forms, switches tabs, and enters text. As examples, Google cites a functional analysis of its Gemini app and a self-audit of its accessibility documentation. The model covers three environments: web browsers, mobile operating systems, and traditional desktop software.
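The screenshot-decide-act cycle described above can be sketched as a simple loop. This is only an illustration, not Google's API: the `Action` schema, `run_agent`, and the scripted stand-ins for the model and the browser are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical schema)."""
    kind: str          # "click", "type", "switch_tab", or "done"
    target: str = ""   # element description, e.g. "Submit button"
    text: str = ""     # text to enter for "type" actions


def run_agent(
    goal: str,
    take_screenshot: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Repeat screenshot -> decide -> act until the model reports "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = decide(goal, screenshot, history)
        if action.kind == "done":
            break
        execute(action)
        history.append(action)
    return history


# Stand-ins so the sketch runs without a browser or a real model.
def fake_screenshot() -> bytes:
    return b"fake-png-bytes"


def scripted_decide(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    # A real agent would send the goal and screenshot to the model here.
    script = [
        Action("click", target="Name field"),
        Action("type", target="Name field", text="Ada"),
        Action("click", target="Submit button"),
    ]
    return script[len(history)] if len(history) < len(script) else Action("done")


executed: List[Action] = []
steps = run_agent("Fill out the form", fake_screenshot, scripted_decide, executed.append)
print([step.kind for step in steps])
```

In a real setup, `decide` would wrap a call to the model and `execute` would drive a browser or device. The `max_steps` cap is a design choice that keeps a confused agent from looping forever.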

The real appeal lies in the tedious work: continuous software testing, clicking through multiple business applications, and knowledge work across different tools. These are tasks that involve many steps and previously required significant manual effort.

Gemini Computer Use in benchmarks: strong, but not at the top

The big question, of course, is how well the whole thing actually works. The benchmark used is OSWorld-Verified, which tests Computer Use agents across Ubuntu, Windows, and macOS. Gemini 3.5 Flash scores 78.4% there. By comparison, its predecessor, Gemini 3 Flash, scored 65.1%, a gain of more than 13 points from one generation to the next.

Model	OSWorld-Verified
Claude Opus 4.8	83.4%
GPT-5.5	78.7%
Gemini 3.5 Flash	78.4%
Gemini 3.1 Pro	76.2%
Gemini 3 Flash	65.1%

OSWorld-Verified figures are based on vendor-reported data. As of June 2026.

There are two things to note. First, all numbers on the OSWorld-Verified leaderboard are self-reported by the vendors and, as of June 2026, have not been independently verified. Benchmarks can give you a general idea, but be wary of comparing them down to the decimal point. Second, the difference between Flash and GPT-5.5 is only 0.3 points. The real difference is in the price. Gemini 3.5 Flash costs $1.50 per million input tokens and $9 per million output tokens, while GPT-5.5 costs $5 and $30, respectively. With a heavy agent workload, that difference adds up quickly.
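To make the price gap concrete, here is a small calculation using the per-million-token prices quoted above. The workload (2 million input tokens and 100,000 output tokens per agent run) is an assumption chosen purely for illustration, and the dictionary keys are plain labels, not official API model identifiers.

```python
# (input, output) prices in USD per million tokens, as quoted in the article.
# The keys are illustrative labels, not API model identifiers.
PRICES = {
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (5.00, 30.00),
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one agent run for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Assumed workload per run; screenshots make agent runs input-heavy.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 100_000

for name in PRICES:
    print(f"{name}: ${run_cost(name, INPUT_TOKENS, OUTPUT_TOKENS):.2f} per run")
```

Under these assumed token counts, a run costs $3.90 on Gemini 3.5 Flash and $13.00 on GPT-5.5. Both the input and output prices differ by the same factor, so the ratio stays at roughly 3.3x whatever the workload mix.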

While the raw numbers look appealing, Computer Use shows a larger gap between benchmarks and production environments than most other AI tasks. OSWorld measures predefined tasks in a stable environment. Real-world agents, on the other hand, work in applications that are constantly changing, require logins, and display screen states the model has never seen before. Google itself advises against using Computer Use for critical decisions, sensitive data, or situations where errors cannot be corrected.

Security: How does Google plan to manage risk?

An AI that navigates browsers, forms, and file systems on its own has a completely different reach than a text-only chatbot. If it is granted broad user privileges, that very capability becomes a vulnerability. Google therefore relies on targeted adversarial training in Gemini 3.5 Flash to reduce the risk of prompt injection in live environments.

In addition, there are two optional protection systems for businesses. One requires explicit user confirmation before the model performs sensitive or irreversible actions. The other automatically stops the task as soon as it detects an indirect prompt injection. Google also recommends a defense-in-depth approach using secure sandboxes, human oversight, and strict permissions. Google provides more details in its best practices documentation.
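The two safeguards can be pictured as a gate in front of every action. The sketch below is not Google's implementation and does not use any Gemini API. `guarded_execute`, `SENSITIVE_KINDS`, `TaskStopped`, and the keyword-based `naive_injection_check` are hypothetical stand-ins, and a real injection detector would be far more sophisticated than a string match.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action kinds treated as sensitive or irreversible.
SENSITIVE_KINDS = {"submit_payment", "delete_file", "send_email"}


class TaskStopped(Exception):
    """Raised to abort the whole task when an injection is suspected."""


@dataclass
class ProposedAction:
    kind: str
    detail: str = ""


def naive_injection_check(page_text: str) -> bool:
    # Toy stand-in for a real detector: flags one well-known attack phrase.
    return "ignore previous instructions" in page_text.lower()


def guarded_execute(
    action: ProposedAction,
    page_text: str,
    execute: Callable[[ProposedAction], None],
    confirm: Callable[[ProposedAction], bool],
    looks_injected: Callable[[str], bool] = naive_injection_check,
) -> bool:
    """Run the action only if it passes both safeguards.

    Returns True if the action ran and False if the user declined it.
    Raises TaskStopped if the screen content looks like a prompt injection.
    """
    if looks_injected(page_text):
        raise TaskStopped("possible indirect prompt injection detected")
    if action.kind in SENSITIVE_KINDS and not confirm(action):
        return False
    execute(action)
    return True
```

A caller would pass a `confirm` callback that asks a human, for example through a dialog, and would treat `TaskStopped` as a signal to end the session rather than retry.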

How Computer Use fits into the Gemini strategy

This integration is not an isolated step; it is in line with Google’s approach over the past few months. Gemini 3.5 Flash was designed from the beginning as an agentic model, and since its announcement at Google I/O 2026, it has powered features such as the persistent agent Gemini Spark. For more information about the model itself and its agentic features, see the article Introducing Gemini 3.5 Flash.

This approach is also evident in everyday use. On Android, Gemini Intelligence is increasingly taking on proactive tasks and automating workflows across multiple apps. And within the Gemini app itself, Google has been shifting its focus from a pure chatbot to an active assistant for several months. Computer Use in Gemini 3.5 Flash is essentially the technical foundation on which many of these promises are built.

Availability: How to get started

If you want to try Computer Use, you have several options. Developers and enterprises can access the functionality through the Gemini API and the Gemini Enterprise Agent Platform. For quick testing, there is a demo environment hosted by Browserbase, and to help you get started, Google provides a reference implementation on GitHub.

Conclusion

With Computer Use in Gemini 3.5 Flash, Google is making the leap from pure assistance to execution. A model that can use Google Search, call your own functions, and operate a browser at the same time is a real game-changer for automation and enterprise workflows. Its cost advantage over GPT-5.5 makes it particularly attractive when running many agents in parallel.

At the same time, a sober assessment is also necessary. Benchmark numbers are self-reported, and the transition from a test environment to an actual production environment is especially delicate for Computer Use. The fact that Google itself recommends sandboxing, human oversight, and caution when handling critical tasks should be taken seriously. Still, things remain exciting: the next logical step, Gemini 3.5 Pro, is already waiting in the wings.

Frequently asked questions about Computer Use in Gemini 3.5 Flash

What is “Computer Use” in Gemini 3.5 Flash?
It is a built-in tool that allows the model to interact with browsers, apps, and desktop programs on its own. It analyzes screenshots of the user interface and performs actions such as clicks, taps, and scrolls.

How does Gemini 3.5 Flash perform in Computer Use benchmarks?
On the OSWorld-Verified benchmark, the model scores 78.4%, almost on par with GPT-5.5 (78.7%). According to the reported values, Claude Opus 4.8 leads with 83.4%.

Is Computer Use safe?
Google uses adversarial training and offers two optional protection systems: a confirmation requirement for critical actions and automatic termination when a prompt injection is detected. Google advises against using it for sensitive or irreversible tasks.
