Google Completes Gemini 3 Lineup with Launch of ‘Flash’ Model: High Speed Meets Uncompromised Intelligence
Sharon Yoon, Correspondent
sharoncho0219@gmail.com | 2025-12-18 06:00:28
(C) Times of AI
SAN FRANCISCO – On December 17, 2025 (PT), Google officially expanded its next-generation artificial intelligence family with the launch of Gemini 3 Flash. Positioned as a high-speed, cost-effective "lightweight" model, Gemini 3 Flash is designed to eliminate the long-standing trade-off between latency and reasoning capability, effectively completing the Gemini 3 "triple threat" alongside the flagship Pro and the reasoning-specialized Deep Think models.
Efficiency Without Sacrifice
For years, AI developers have faced a binary choice: use massive, expensive models for deep reasoning or settle for faster, cheaper ones with significantly lower accuracy. Google claims Gemini 3 Flash breaks that trade-off.
“Gemini 3 Flash delivers frontier-level intelligence at lightning speeds,” said Josh Woodward, Vice President at Google Labs. “It is the realization of our goal to provide near-Pro level reasoning without the latency typically associated with large-scale models.”
Benchmark Breakthroughs: Surpassing the ‘Pro’
The most striking aspect of the launch is the model's performance on specialized benchmarks. Despite being a lightweight model created through a process known as "distillation," in which a smaller model is trained to mimic the behavior of a larger one, Gemini 3 Flash outperforms the Pro model in several key areas:
Coding (SWE-bench Verified): Flash achieved a score of 78%, surpassing Gemini 3 Pro’s 76.2%. This makes it an exceptionally powerful tool for agentic coding and real-time software debugging.
Multimodal Reasoning (MMMU-Pro): It scored 81.2%, narrowly edging out the Pro model’s 81%.
Scientific Knowledge (GPQA Diamond): It maintained a near-frontier score of 90.4%, coming within 1.5 percentage points of the Pro model's 91.9%.
Democratizing Access and Lowering Costs
Google has integrated Gemini 3 Flash as the default model for "AI Mode" in Google Search, allowing users to receive complex, synthesized answers at the speed of a standard search query. The model is available starting today for both free and paid users across the globe.
For the developer ecosystem, the financial implications are significant. The API pricing for Gemini 3 Flash is approximately one-quarter of the Pro model's cost:
Input: $0.50 per 1 million tokens.
Output: $3.00 per 1 million tokens.
This aggressive pricing is intended to support "agentic workflows"—autonomous AI loops that perform long-running tasks like data extraction, video analysis, and multi-step coding—where high token consumption usually makes frontier models cost-prohibitive.
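To illustrate what this pricing means in practice, the Python sketch below estimates the cost of a hypothetical long-running agentic job at the quoted Flash rates. The Pro rates used for comparison are assumptions, back-calculated from the article's "approximately one-quarter" figure rather than taken from an official price list, and the workload sizes are illustrative.

# Rough cost estimate for a long-running agentic workload.
# Flash rates are the figures quoted above; Pro rates are an
# assumption, back-calculated from the "about one-quarter" claim.

FLASH_RATES = {"input": 0.50, "output": 3.00}   # USD per 1M tokens (quoted)
PRO_RATES   = {"input": 2.00, "output": 12.00}  # USD per 1M tokens (assumed, ~4x Flash)

def job_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Return the dollar cost of a job given token counts and per-million-token rates."""
    return (input_tokens / 1_000_000) * rates["input"] + \
           (output_tokens / 1_000_000) * rates["output"]

if __name__ == "__main__":
    # Hypothetical agent loop: 500 steps, each reading ~20k tokens of
    # context and emitting ~2k tokens of output.
    steps, in_per_step, out_per_step = 500, 20_000, 2_000
    total_in, total_out = steps * in_per_step, steps * out_per_step

    flash = job_cost(total_in, total_out, FLASH_RATES)
    pro = job_cost(total_in, total_out, PRO_RATES)
    print(f"Flash: ${flash:.2f}  vs  Pro (assumed rates): ${pro:.2f}")
    # -> Flash: $8.00  vs  Pro (assumed rates): $32.00

Under these assumptions, a job that would cost roughly $32 on Pro comes in around $8 on Flash, which is the kind of gap that makes always-on agent loops economically viable.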
Looking Ahead
With the release of Flash, Google is processing over 1 trillion tokens per day through its API. The Gemini 3 family now offers a specialized tool for every use case: Deep Think for ultimate reasoning, Pro for balanced multimodal tasks, and Flash for high-frequency, real-time applications.
As AI competition intensifies with rivals like OpenAI and Anthropic, Google’s strategy focuses on scale and speed, ensuring that next-generation intelligence is not just a premium luxury but a foundational utility for everyone.