Multimodal Input

Quick answer
A generation flow that combines text, image, video, and audio references in a single prompt so each modality controls a different part of the output.
What it usually affects
fundamentals / input-mode / reference
March 12, 2026
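To make the definition concrete, here is a minimal sketch of how a multimodal prompt could be modeled as data, with each reference labeled by the part of the output it is meant to steer. This is illustrative only and is not the Seedance API: the names `Reference`, `MultimodalPrompt`, `controls`, and `control_map`, the file names, and the specific role assigned to the audio reference are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Non-text modalities named in the definition; text travels in the prompt itself.
ALLOWED_MODALITIES = ("image", "video", "audio")


@dataclass(frozen=True)
class Reference:
    """One non-text input (hypothetical structure, not a real API type)."""
    modality: str   # "image", "video", or "audio"
    source: str     # hypothetical file path or URL
    controls: str   # the part of the output this reference should steer

    def __post_init__(self):
        if self.modality not in ALLOWED_MODALITIES:
            raise ValueError(f"unsupported modality: {self.modality}")


@dataclass
class MultimodalPrompt:
    """Text instruction plus any number of labeled references."""
    text: str
    references: list[Reference] = field(default_factory=list)

    def control_map(self) -> dict[str, list[str]]:
        """Map each modality to the output aspects it controls."""
        mapping = {"text": ["overall instruction"]}
        for ref in self.references:
            mapping.setdefault(ref.modality, []).append(ref.controls)
        return mapping


prompt = MultimodalPrompt(
    text="A barista pours latte art in a sunlit cafe, slow push-in.",
    references=[
        Reference("image", "barista.png", "subject identity"),
        Reference("video", "push_in.mp4", "camera motion"),
        Reference("audio", "cafe_ambience.wav", "sound rhythm"),  # assumed role
    ],
)
print(prompt.control_map())
```

Keeping an explicit `controls` label per reference is one way to check that no two inputs compete for the same part of the output before a generation is run.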
Editorial note: Glossary term
Created by

Seedance2Prompt Editorial Team

Reviewed by: Seedance2Prompt Editorial Team

Last reviewed

2026-03-12

Published: 2026-03-12

How this page was made

Glossary entries define terms in the way they are actually used during generation and pair them with nearby concepts people often confuse.

Related terms

Suggested next entries based on tags and workflow adjacency.

Image to Video (Mar 12, 2026)

A workflow that starts from an image and asks Seedance to extend it into motion while preserving composition or identity cues.

Text to Video (Mar 12, 2026)

A generation mode where the model creates a video from text alone, without image or video references.

First and Last Frame (Mar 12, 2026)

An entry mode that locks the opening and closing image of a generation so motion and timing are predictable from a fixed start and end state.

Seedance Prompt (Mar 12, 2026)

The core instruction block that tells Seedance what to show, how it should move, and what visual constraints to keep stable.

Prompt Structure (Mar 12, 2026)

The ordered way a Seedance prompt arranges subject, action, camera, style, and constraints so the model can apply a clear hierarchy of importance.

Reference Image (Mar 12, 2026)

An image supplied to Seedance to preserve subject identity, style, composition, or product details during generation.

Reference Video (Mar 12, 2026)

A source clip used to guide motion rhythm, camera behavior, editing energy, or effect language in a new generation.

Camera Movement (Mar 12, 2026)

The way the virtual camera travels, rotates, follows, or reveals information over time in a generated shot.

Character Consistency (Mar 12, 2026)

The ability to keep the same person or character recognizable across frames or multiple generated shots.

Cinematic Lighting (Mar 12, 2026)

Lighting language that shapes mood, depth, emphasis, and premium visual quality in a generated video shot.
