2026-04-16 08:39
ChainThink report, April 16: According to monitoring by DCG Beating, Google has launched its next-generation text-to-speech model, Gemini 3.1 Flash TTS. Its main selling point is that developers can precisely control the style, speaking rate, and emotional expression of the AI voice. The model is now available through the Gemini API, Google AI Studio (developer preview), Vertex AI (enterprise preview), and Google Vids (for Workspace users).
The model’s control capabilities are built on “audio tags”: developers embed natural-language instructions directly in the input text to adjust pitch, rhythm, and accent, and can even switch expressive styles mid-sentence. In Google AI Studio, Google provides a “director’s chair”-style configuration interface with three layers of control: scene guidance, character-level parameter tuning, and one-click export.
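As a rough illustration of the audio-tag idea, the sketch below assembles an input script in which a natural-language direction precedes each segment it applies to. The square-bracket tag syntax, the `Segment` type, and the `render_script` helper are all assumptions made for this example; the report does not specify the actual tag format, and no Gemini API call is made here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One stretch of text plus an optional delivery direction (hypothetical type)."""
    text: str
    direction: Optional[str] = None  # e.g. "whispering"; free-form natural language


def render_script(segments: List[Segment]) -> str:
    """Join segments into one input string, prefixing each directed segment
    with an inline tag. The "[...]" syntax is an assumption, not a confirmed format."""
    parts = []
    for seg in segments:
        if seg.direction:
            parts.append(f"[{seg.direction}] {seg.text}")
        else:
            parts.append(seg.text)
    return " ".join(parts)


script = render_script([
    Segment("Welcome back to the show."),
    Segment("Tonight's guest is a surprise,", direction="excited, faster pace"),
    Segment("so keep it quiet.", direction="whispering"),
])
print(script)
```

The resulting string would then be passed as the text input to whichever TTS endpoint is in use. Keeping directions as data rather than hand-typed markup makes it easier to reuse the same delivery style across different scripts.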
On the TTS leaderboard published by third-party evaluator Artificial Analysis, Gemini 3.1 Flash TTS ranks first with an Elo score of 1211 and is placed in the “most attractive quadrant.” The model supports more than 70 languages and native multi-character dialogue, and all generated audio carries a SynthID watermark so that it can be identified as AI-generated. For developers, the model turns TTS from a plain “text-to-speech” tool into a programmable voice performance engine, making it possible to reuse a consistent voice style across product lines.
Disclaimer: Contains third-party opinions, does not constitute financial advice






