According to CoinWorld, Google's main model, Gemini 3.5 Flash, now natively supports computer use (PC control), unlocking enterprise-level AI agent automation. The capability is integrated directly into the main model as a built-in tool, so developers no longer need to call a separate, dedicated Gemini 2.5 Computer Use model. Developers and enterprise users can access it through the Gemini API or the Google Cloud Gemini Enterprise AI Agent Platform, which simplifies AI agent development architecture. The built-in tool takes screenshots as input for visual perception and step-by-step reasoning, then outputs actions such as mouse clicks and keyboard input to automate tasks like software testing and data collection. To reduce the risk of prompt injection, Google has applied targeted adversarial training to the model and provides safeguards such as manual confirmation and task circuit breakers. Browserbase currently offers a hosted online demo environment, and Google has open-sourced reference implementation code on GitHub.
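The screenshot-in, action-out loop described above can be sketched roughly as follows. This is a minimal illustration only, not Google's reference implementation and not the Gemini API: every name here (`Action`, `run_agent`, `propose_action`, `confirm`, `max_steps`, the `risky` flag) is a hypothetical stand-in, and the model call is replaced by a scripted stub. It shows a loop that captures a screenshot, asks a model for the next action, asks a human to confirm risky actions, and stops after a fixed number of steps as a simple circuit breaker.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI operation proposed by the model (hypothetical schema)."""
    kind: str            # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""
    risky: bool = False  # assumption: marks actions that need human confirmation


def run_agent(
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 10,
) -> List[Action]:
    """Run a perceive-reason-act loop and return the actions that were executed."""
    history: List[Action] = []
    for _ in range(max_steps):              # circuit breaker: hard cap on steps
        screenshot = take_screenshot()      # visual perception input
        action = propose_action(screenshot, history)
        if action.kind == "done":           # model reports the task is finished
            break
        if action.risky and not confirm(action):
            break                           # human declined, so abort the task
        execute(action)
        history.append(action)
    return history


# Usage with scripted stand-ins instead of a real model or a real desktop.
script = [
    Action("click", x=120, y=48),
    Action("type", text="quarterly report"),
    Action("done"),
]
executed = run_agent(
    take_screenshot=lambda: b"fake-png-bytes",
    propose_action=lambda shot, history: script[len(history)],
    execute=lambda action: None,            # a real executor would drive the mouse/keyboard
    confirm=lambda action: True,
)
print([a.kind for a in executed])  # prints ['click', 'type']
```

In a real deployment the `propose_action` callback would wrap a call to the model and `execute` would drive a browser or desktop; both are left as injected functions here so the control flow can be read and tested on its own.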
Google's flagship model, Gemini 3.5 Flash, natively supports PC control, unlocking enterprise-level intelligent agent automation.
2026-06-25 03:34:42
Disclaimer: This article is copyrighted by the original author and does not represent MyToken’s views and positions. If you have any questions regarding content or copyright, please contact us (www.mytokencap.com).
About MyToken: https://www.mytokencap.com/en/aboutus
Article Link: https://www.mytokencap.com/en/choicenews/3346432.html
More exciting content is available on X (https://x.com/MyTokencap), or join the community to learn more: MyToken-English Telegram Group (https://t.me/mytokenGroup)