Gemini Omni Flash: Your Character, Consistent Across Every AI Video Scene

Google launches Gemini Omni Flash at I/O 2026: the first model that preserves a character's identity and voice across all your video scenes. Full breakdown.

On May 19, 2026, during the Google I/O keynote at Shoreline Amphitheatre, Demis Hassabis takes the stage and drops a sentence that will grab the attention of everyone who creates video with AI: your character keeps its face, voice, and identity from one scene to the next. No re-prompting, no lost coherence. That's precisely the promise of Gemini Omni Flash, the first model from the new Gemini Omni family, available today in Google Flow, the Gemini app, and YouTube Shorts.

The missing piece for AI filmmakers

If you've ever tried to produce a short film with a generative video tool, you know the wall. Scene 1, your hero has brown eyes. Scene 2, he's blond with blue eyes. Scene 3, he speaks with a different voice. Until now, the only options were tedious workflows involving persona LoRAs in ComfyUI, or hoping that Sora would eventually stabilize its character locks in production.

Gemini Omni Flash attacks this problem directly. According to the official Google blog:

Omni Flash also improves character consistency, meaning identity and voice are preserved across every scene.

Google Blog, Flow Updates (Google Labs), May 19, 2026

In practice: you create a character once, cast them into as many scenes as you want, and the model maintains their appearance, distinctive traits, and voice consistently. That's the missing piece for narrative AI production, following camera consistency and physics simulation.

Gemini Omni Flash demo, female character seen from behind playing violin in a field of grass and daisies
One of the official character consistency demos published by DeepMind. The same character can be carried from scene to scene, voice included. Source: Gemini Omni product page.

"Create anything from anything": what the model can actually do

The Gemini Omni family adopts a concise tagline on the DeepMind page: "Create anything from anything." Behind that marketing slogan, the documented capabilities are concrete.

  • 4 input types (image, text, video, audio)
  • 1 output type: video (image and text announced as coming soon)
  • 140+ countries with Google Flow access
  • SynthID + C2PA: watermark on every output

Here are the capabilities officially demonstrated at I/O 2026, drawn from the DeepMind product page:

  • Multi-turn conversational video editing (you refine by talking, not by rewriting the prompt from scratch)
  • Motion and style transfer from reference material
  • Character or object swap via natural language
  • Camera angle adjustments
  • Sketch-to-realistic-video generation
  • Stop motion and claymation
  • Pose transfer and drawing-guided motion capture
  • Character transformation with dialogue preservation
  • Text synchronization with on-screen action (historically very hard for video models)

On the physics side, Demis Hassabis emphasized during the keynote that Gemini Omni integrates "Gemini's reasoning powers with a better grasp of physics concepts such as kinetic energy and gravity."

With world models, AI is moving from predicting text to simulating reality.

Sundar Pichai, I/O 2026 opening keynote

That's the paradigm shift Google has been framing for months: a model that no longer just predicts text, but models the physical world. The technical name for this framing is "world model," and Gemini Omni is Google's first consumer-facing implementation of the concept.
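
To make the multi-turn editing idea from the capability list concrete, here is a minimal sketch of how a session could keep its reference material and instruction history together instead of restarting from a fresh prompt. It is purely illustrative: `EditSession`, `add_reference`, `say`, and `context` are hypothetical names, the file name is a placeholder, and nothing here calls a real Google API or reflects how Omni Flash works internally.

```python
from dataclasses import dataclass, field

# Hypothetical illustration only: these names are not part of any Google API.
# Non-text input types named in this article; text arrives through the turns.
REFERENCE_TYPES = {"image", "video", "audio"}


@dataclass
class EditSession:
    """Keeps reference material and the ordered list of instructions."""
    references: list = field(default_factory=list)  # (kind, path) pairs
    turns: list = field(default_factory=list)       # instructions, in order

    def add_reference(self, kind: str, path: str) -> None:
        if kind not in REFERENCE_TYPES:
            raise ValueError(f"unsupported reference type: {kind}")
        self.references.append((kind, path))

    def say(self, instruction: str) -> None:
        # Each turn is appended to the history; the first prompt is never rewritten.
        self.turns.append(instruction)

    def context(self) -> str:
        # Everything a later turn depends on: all references plus the full history.
        refs = ", ".join(f"{kind}:{path}" for kind, path in self.references)
        steps = " -> ".join(self.turns)
        return f"[{refs}] {steps}"


session = EditSession()
session.add_reference("image", "hero.png")
session.say("A violinist plays in a field of daisies")
session.say("Move the camera behind her")
print(session.context())
```

The point of the sketch is the shape of the workflow: the second instruction only makes sense because the first one and the character reference are still part of the session.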

6 official prompts to understand the full scope

Google published six demonstration prompts on the DeepMind page. They communicate the model's input range better than any description:

Visual physics effect

"When the person touches the mirror, make the mirror ripple beautifully like liquid, and the person's arm turns into reflective mirror material" - physics simulation + real-time object transformation

Kinetic simulation

"A marble rolling fast on a chain reaction style track, continuous smooth shot" - applied physics + camera constraint (continuous shot)

Scientific stop motion

"Claymation explainer of protein folding, everything is made out of clay, no hands, stop motion, accurate" - style + scientific accuracy constraint simultaneously

Explainer with voiceover

"A skeuomorphism stop motion explainer about how the brain hippocampus works with a compelling voiceover" - audio + image sync + artistic style

Motion transfer

"Apply the pose and motion from input video to provided character from this image. Apply style from image reference to the new video" - combined multimodal input (video + image)

Text rendering

"word by word, one word on the screen at a time: did, you, know, that, this, model, can, do, pretty, good, text!?" - solving a historically hard problem for video models

These six examples deliberately cover the full spectrum, from visually spectacular effects to educational use cases, by way of motion capture and scientific explainers. Google is showing that Omni Flash isn't a niche cinema model, but a versatile production tool.

Gemini Omni Flash demo, glowing fireflies flying around a stylized turquoise fern in stop motion claymation
Generative stop motion claymation: fireflies around a fern, one of the most polished DeepMind examples for the model. Source: Gemini Omni product page.
Gemini Omni Flash demo, scientific claymation illustrating amino acids as a chain of colorful beads on a blue background
Scientific claymation explainer: Omni Flash combines stop motion style with factual accuracy in the same prompt. Source: Gemini Omni product page.

Distribution: the Flow + Gemini + YouTube Shorts trio

Omni Flash's distribution strategy is as significant as the model itself. Google chose to launch simultaneously across three surfaces:

Google Flow: the creative video studio launched at I/O 2025, now available in 140+ countries. Flow now integrates a dedicated AI agent and Custom Tools that any creator can build in natural language (and share or remix). One already-public example: pixelBento by Laszlo Gaal, which generates lo-fi and glitch effects without code.

Gemini app: the consumer-facing interface, with Omni Flash access for AI Pro and Ultra subscribers. Mobile Flow apps (Android in beta, iOS "coming soon") were also announced.

YouTube Shorts: direct integration for creators who publish short-form content.

The price drop warrants a critical look: Google AI Ultra went from $250 to $100/month at I/O 2026. At $250/month, it was a premium product for studios. At $100/month, it becomes competitive with Runway Pro and potentially with mid-tier creator subscriptions. For creators producing video content regularly, the math changes.
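
As a quick back-of-the-envelope check on that math, the sketch below uses only the two AI Ultra prices reported in this article's sources ($250 and $100 per month). The variable names are illustrative, and the calculation assumes a full year of subscription at a constant price.

```python
# Monthly prices in USD, as reported in this article's sources.
OLD_ULTRA = 250
NEW_ULTRA = 100

monthly_saving = OLD_ULTRA - NEW_ULTRA
annual_saving = monthly_saving * 12              # assumes 12 months at a constant price
percent_cut = 100 * monthly_saving / OLD_ULTRA   # size of the cut relative to the old price

print(f"Monthly saving: ${monthly_saving}")
print(f"Annual saving: ${annual_saving}")
print(f"Price cut: {percent_cut:.0f}%")
```

For a creator paying all year, that is a 60% cut and $1,800 less per year.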

Safety and traceability: SynthID + C2PA

Every Omni Flash output is invisibly watermarked via SynthID (DeepMind technology) and carries C2PA (Content Credentials) metadata, the open standard for digital content provenance.

That's concrete progress in a domain that remains a blind spot for most AI video tools: traceability. Knowing whether a video was AI-generated, and by whom, is becoming a regulatory requirement in several countries (see ongoing discussions in the European Parliament under the AI Act). Google is getting ahead of this by embedding this metadata from day one.

What we don't know yet

The DeepMind page is deliberately marketing-forward. Several technical details remain unpublished and it would be inaccurate to invent them:

  • Model size, parameter count, latency
  • Public benchmarks against Sora 2, Veo 3, or Runway Gen-4
  • Maximum duration of generated videos
  • Exact technical mechanism behind character consistency (LoRA-like? persistent latent embedding? something else?)
  • Timeline for the image and text outputs promised for the Omni family

These gaps are normal for a day-one announcement. Third-party benchmarks typically arrive in the weeks following a launch. For now, the keynote demos and six official prompts are the only verifiable performance indicators.

Frequently asked questions

  • Is Gemini Omni Flash available without a paid subscription?

    No. Access to Gemini Omni Flash requires a Google AI Pro or Google AI Ultra subscription. AI Ultra dropped from $250 to $100/month at I/O 2026. No free access or limited tier has been announced at this stage.

  • What's the difference between Gemini Omni Flash and Veo 3?

    Veo 3 was already in Google Flow before I/O 2026, focused on high-quality video generation. Gemini Omni Flash brings multimodal input (combining image, text, audio, and video), cross-scene character consistency, and multi-turn conversational editing. Both models coexist in Flow for now. Google has not published a convergence roadmap.

  • Does character consistency work with real faces?

    Google has not specified the limits on real faces in its public documentation. SynthID and C2PA protections apply to all outputs. The European AI Act and Google's content policies also apply.

  • Is Google Flow available in the US?

    Yes. Google Flow is available in 140+ countries. The Flow page lists no geographic exception for the United States or the European Union as of May 19, 2026.

Going further

The full Google I/O 2026 keynote is the primary reference for verifying quotes from Pichai and Hassabis. The Gemini Omni section starts around the 35-minute mark:

The Google I/O 2026 keynote. Demis Hassabis introduces Gemini Omni Flash around the 35-minute mark. Source: official Google channel.

Primary sources used for this breakdown:

  • New agents, mobile apps and Gemini Omni for Google Flow (blog.google): the official Google Labs announcement on Flow updates, and the most detailed source on Omni Flash features, the Flow Agent, and Custom Tools.
  • Gemini Omni - Google DeepMind (deepmind.google): the official DeepMind product page with six demonstration prompts and documented model capabilities, including SynthID + C2PA.
  • Google I/O 2026: Sundar Pichai's opening keynote (blog.google): Pichai's full strategic vision on world models and the shift "from predicting text to simulating reality," with context on all I/O 2026 announcements.
  • Google I/O 2026 Keynote - Live blog (9to5google.com): the most complete live coverage, with verified numbers on AI Ultra pricing ($250 to $100/month) and verbatim quotes from Hassabis during the keynote.

If you want to place AI video generation in the broader AI landscape, we've also broken down the latest competing announcements: "Grok Build vs Claude Code: the May 2026 comparison" and our piece on Meta's layoffs around its AI division give a panorama of the forces at play.