The complete, open-source guide to the Seedance 2.0 API by ByteDance on fal.ai — with working Python, Node.js (JavaScript/TypeScript), and cURL examples for every endpoint, pricing, schemas, prompt tips, and FAQs.
Seedance 2.0 is ByteDance's state-of-the-art video generation model available via the fal.ai API. It produces cinematic 720p video up to 15 seconds with native synchronized audio, lip-sync speech, multi-shot editing, real-world physics, and director-level camera control from text, image, or reference inputs.
This repo collects every Seedance 2.0 endpoint (standard and fast tiers), fully documented with copy-paste examples — the fastest way to start building with the Seedance 2.0 API today.
- What is Seedance 2.0?
- All Seedance 2.0 API Endpoints
- Pricing
- Quick Start
- Endpoint Reference
- Common Parameters
- Prompting Guide
- FAQ
- Deep-dive docs
- Resources
- Contributing
- License
Seedance 2.0 (by ByteDance, served on fal.ai) is a next-generation AI video generation model. Compared to first-generation text-to-video models, Seedance 2.0 delivers:
- Native synchronized audio — sound effects, ambient audio, and lip-synced speech generated together with video (not post-hoc dubbed).
- Multi-shot editing — a single prompt can produce multiple camera cuts within the same clip.
- Director-level camera control — pans, dollies, crane shots, dutch angles, and more, driven by natural language.
- Real-world physics — cloth, fluids, lighting, and object interactions behave plausibly.
- Multimodal inputs — combine text prompts with up to 9 reference images, 3 reference videos, and 3 audio clips (reference-to-video).
- Up to 15-second clips at 480p or 720p, in aspect ratios from 9:16 (vertical/Reels/Shorts/TikTok) to 21:9 (ultrawide cinematic).
Two tiers are available on fal.ai: the standard tier (highest quality) and the fast tier (lower latency and cost, same API surface).
| # | Endpoint | Model ID | Input | Audio | Playground |
|---|---|---|---|---|---|
| 1 | Text to Video | `bytedance/seedance-2.0/text-to-video` | text prompt | yes | Open |
| 2 | Image to Video | `bytedance/seedance-2.0/image-to-video` | text + start (and optional end) image | yes | Open |
| 3 | Reference to Video | `bytedance/seedance-2.0/reference-to-video` | text + up to 9 images / 3 videos / 3 audio | yes | Open |
| 4 | Fast Text to Video | `bytedance/seedance-2.0/fast/text-to-video` | text prompt | yes | Open |
| 5 | Fast Image to Video | `bytedance/seedance-2.0/fast/image-to-video` | text + start (and optional end) image | yes | Open |
| 6 | Fast Reference to Video | `bytedance/seedance-2.0/fast/reference-to-video` | text + up to 9 images / 3 videos / 3 audio | yes | Open |
Base HTTP endpoint: https://fal.run/<model-id> — e.g. https://fal.run/bytedance/seedance-2.0/fast/text-to-video.
Pricing is per second of generated video at 720p on fal.ai. Cost does not change whether you generate audio or not.
| Endpoint | 720p price / sec | Token price (per 1K tokens) |
|---|---|---|
| Text to Video | $0.3034 | $0.0140 |
| Image to Video | $0.3024 | $0.0140 |
| Reference to Video (images only) | $0.3024 | $0.0140 |
| Reference to Video (with video inputs) | $0.1814 | $0.0140 |
| Fast — Text to Video | $0.2419 | $0.0112 |
| Fast — Image to Video | $0.2419 | $0.0112 |
| Fast — Reference to Video (images only) | $0.2419 | $0.0112 |
| Fast — Reference to Video (with video inputs) | $0.14515 | $0.0112 |
Tokens are computed as (height * width * duration * 24) / 1024. For reference-to-video, input + output duration both count. See the official fal.ai pricing page for the latest numbers.
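As a sanity check on the token formula above, here is a small cost estimator of our own (not part of any fal.ai client). It assumes 720p at 16:9 means a 1280×720 frame; the exact frame dimensions fal.ai bills for are an assumption:

```python
def seedance_tokens(width, height, duration_s):
    """Token count per the documented formula:
    (height * width * duration * 24) / 1024."""
    return (height * width * duration_s * 24) / 1024

def estimate_cost_usd(width, height, duration_s, price_per_1k_tokens):
    """Estimated cost: tokens / 1000 * the per-1K-token price."""
    return seedance_tokens(width, height, duration_s) / 1000 * price_per_1k_tokens

# 6 s of 720p (assumed 1280x720) on the fast tier ($0.0112 / 1K tokens):
print(seedance_tokens(1280, 720, 6))                       # 129600.0 tokens
print(round(estimate_cost_usd(1280, 720, 6, 0.0112), 4))   # 1.4515
```

The result lines up with the per-second table: 6 s × $0.2419/s ≈ $1.4514, so the two pricing views are consistent up to rounding.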
Sign up at fal.ai, create a key at fal.ai/dashboard/keys, and export it:
```bash
export FAL_KEY="your_fal_api_key_here"
```

Or copy .env.example to .env and fill in the value.
```bash
pip install -r requirements.txt
python examples/python/text_to_video.py
```

```python
import fal_client

result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/text-to-video",
    arguments={
        "prompt": "A cinematic shot of a hummingbird drinking from a neon flower at dusk, 4k, slow motion.",
        "resolution": "720p",
        "duration": "6",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
)

print(result["video"]["url"])
```

```bash
cd examples/javascript
npm install
node text_to_video.mjs
```

```js
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("bytedance/seedance-2.0/fast/text-to-video", {
  input: {
    prompt: "A cinematic shot of a hummingbird drinking from a neon flower at dusk, 4k, slow motion.",
    resolution: "720p",
    duration: "6",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
});

console.log(result.data.video.url);
```

```bash
curl --request POST \
  --url https://fal.run/bytedance/seedance-2.0/fast/text-to-video \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "prompt": "A cinematic shot of a hummingbird drinking from a neon flower at dusk, 4k, slow motion.",
    "resolution": "720p",
    "duration": "6",
    "aspect_ratio": "16:9",
    "generate_audio": true
  }'
```

Full runnable scripts live in examples/.
Generate a video from a pure text prompt.
Model IDs
- Standard: `bytedance/seedance-2.0/text-to-video`
- Fast: `bytedance/seedance-2.0/fast/text-to-video`
Input
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | — | Your text prompt. |
| `resolution` | enum | no | `720p` | `480p` or `720p`. |
| `duration` | enum | no | `auto` | `auto` or `"4"`–`"15"` (seconds). |
| `aspect_ratio` | enum | no | `auto` | `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16`, `auto`. |
| `generate_audio` | bool | no | `true` | Native SFX/ambient/speech audio. |
| `seed` | int | no | — | For reproducibility. |
| `end_user_id` | string | no | — | Attribute usage to an end user. |
Examples: examples/python/text_to_video.py, examples/javascript/text_to_video.mjs, examples/curl/text_to_video.sh.
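The enum constraints in the table above can be checked client-side before a request is sent. The function below is a pre-flight sketch of our own (the name and error messages are not part of fal-client):

```python
ALLOWED_RESOLUTIONS = {"480p", "720p"}
ALLOWED_ASPECT_RATIOS = {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16", "auto"}

def validate_t2v_args(args):
    """Return a list of problems with a text-to-video payload (empty list = OK)."""
    errors = []
    if not args.get("prompt"):
        errors.append("prompt is required")
    if args.get("resolution", "720p") not in ALLOWED_RESOLUTIONS:
        errors.append("resolution must be 480p or 720p")
    if args.get("aspect_ratio", "auto") not in ALLOWED_ASPECT_RATIOS:
        errors.append("aspect_ratio not recognised")
    duration = args.get("duration", "auto")
    if duration != "auto" and not (str(duration).isdigit() and 4 <= int(duration) <= 15):
        errors.append('duration must be "auto" or a string from "4" to "15"')
    return errors

print(validate_t2v_args({"prompt": "A calm ocean at dawn", "duration": "6"}))  # []
```

Catching a bad enum locally is cheaper than waiting for the API to reject the request.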
Animate a still image into a cinematic clip. Supports optional end-frame control for clean transitions.
Model IDs
- Standard: `bytedance/seedance-2.0/image-to-video`
- Fast: `bytedance/seedance-2.0/fast/image-to-video`
Additional Input
| Field | Type | Required | Notes |
|---|---|---|---|
| `image_url` | string | yes | Starting frame. JPEG/PNG/WebP, ≤ 30 MB. |
| `end_image_url` | string | no | Optional ending frame for smooth transitions. |
All other parameters match text-to-video. Examples: examples/python/image_to_video.py, examples/javascript/image_to_video.mjs, examples/curl/image_to_video.sh.
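Since `end_image_url` is optional, a small builder of our own (names are hypothetical, not part of fal-client) can ensure the field is only included when an end frame is actually supplied:

```python
def i2v_payload(prompt, image_url, end_image_url=None, **extra):
    """Build an image-to-video request body; the optional end frame is
    only included when one is actually supplied."""
    payload = {"prompt": prompt, "image_url": image_url, **extra}
    if end_image_url is not None:
        payload["end_image_url"] = end_image_url
    return payload

body = i2v_payload("The portrait slowly smiles", "https://example.com/start.png",
                   duration="5")
print("end_image_url" in body)  # False
```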
The most powerful mode. Provide up to 9 images, 3 videos, and 3 audio clips as references and mention them in the prompt as @Image1, @Video1, @Audio1, etc.
Model IDs
- Standard: `bytedance/seedance-2.0/reference-to-video`
- Fast: `bytedance/seedance-2.0/fast/reference-to-video`
Additional Input
| Field | Type | Required | Notes |
|---|---|---|---|
| `image_urls` | string[] | no | Up to 9. JPEG/PNG/WebP, ≤ 30 MB each. |
| `video_urls` | string[] | no | Up to 3. MP4/MOV, combined 2–15 s, ≤ 50 MB, ~480p–720p. |
| `audio_urls` | string[] | no | Up to 3. MP3/WAV, combined ≤ 15 s, ≤ 15 MB each. Requires at least one reference image or video. |
Total files across all modalities must not exceed 12. Examples: examples/python/reference_to_video.py, examples/javascript/reference_to_video.mjs, examples/curl/reference_to_video.sh.
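The per-modality and total limits above can be enforced before submitting. This validator is a sketch of our own, not part of the official client:

```python
def check_reference_inputs(image_urls=(), video_urls=(), audio_urls=()):
    """Return a list of violations of the documented limits (empty list = OK)."""
    errors = []
    if len(image_urls) > 9:
        errors.append("at most 9 reference images")
    if len(video_urls) > 3:
        errors.append("at most 3 reference videos")
    if len(audio_urls) > 3:
        errors.append("at most 3 reference audio clips")
    if audio_urls and not (image_urls or video_urls):
        errors.append("audio references require at least one image or video")
    if len(image_urls) + len(video_urls) + len(audio_urls) > 12:
        errors.append("at most 12 reference files in total")
    return errors

print(check_reference_inputs(image_urls=["https://example.com/a.png"]))  # []
```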
The fast tier has an identical API surface to the standard tier — just swap the model ID. Use it when you want lower latency and cost and can accept a small quality tradeoff.
| Instead of... | Use... |
|---|---|
| `bytedance/seedance-2.0/text-to-video` | `bytedance/seedance-2.0/fast/text-to-video` |
| `bytedance/seedance-2.0/image-to-video` | `bytedance/seedance-2.0/fast/image-to-video` |
| `bytedance/seedance-2.0/reference-to-video` | `bytedance/seedance-2.0/fast/reference-to-video` |
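Because the swap is purely mechanical, a tiny helper of our own (not part of fal-client) can derive the fast-tier ID at runtime:

```python
def to_fast(model_id):
    """Derive the fast-tier model ID by inserting the 'fast/' path segment
    after the 'bytedance/seedance-2.0/' prefix."""
    prefix = "bytedance/seedance-2.0/"
    if not model_id.startswith(prefix) or model_id.startswith(prefix + "fast/"):
        return model_id  # not a Seedance 2.0 ID, or already fast
    return prefix + "fast/" + model_id[len(prefix):]

print(to_fast("bytedance/seedance-2.0/text-to-video"))
# bytedance/seedance-2.0/fast/text-to-video
```

This is handy for a quality/cost toggle in your app: develop against the fast tier, then flip one flag for production renders.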
These parameters work identically across every Seedance 2.0 endpoint:
- `resolution` — `480p` (faster, cheaper) or `720p` (default, balanced).
- `duration` — `"auto"` or an integer-as-string from `"4"` to `"15"` seconds.
- `aspect_ratio` — `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16`, or `auto`. Use `9:16` for TikTok/Reels/Shorts, `16:9` for YouTube/landscape, `21:9` for cinematic.
- `generate_audio` — `true` by default. Audio is free: the cost is the same whether it is on or off.
- `seed` — integer for reproducibility.
- `end_user_id` — optional string to attribute usage per end user.
Output is the same for every endpoint:
```json
{
  "video": { "url": "https://.../output.mp4" },
  "seed": 42
}
```

Seedance 2.0 responds very well to cinematographic language. Use this template:
[Shot type] of [subject] [action] in [environment], [lighting], [mood/style], [camera movement], [audio cue].
Examples
- Wide aerial tracking shot of a lone cyclist riding through a neon-lit Tokyo alley at night, volumetric rain, cinematic lighting, slow push-in, rain and distant thunder.
- Close-up of a chef flambéing a pan in a rustic kitchen, golden hour through the window, shallow depth of field, handheld camera, sizzling oil and jazz music.
- Multi-shot: establishing drone shot of an alpine lake at sunrise, cut to a hiker sipping coffee, cut to a hawk taking flight. Ambient birdsong and wind.
Tips
- Be specific about motion. "Slow dolly-in", "whip pan", "crane rise", "handheld follow" — Seedance 2.0 understands them.
- Mention audio explicitly if you want specific SFX ("footsteps on gravel", "distant thunder", "crowd cheering").
- For lip-sync, add the exact line of dialogue in quotes and describe the speaker's emotion.
- Reference inputs: in reference-to-video, refer to references as `@Image1`, `@Video2`, `@Audio1` inside the prompt to direct how they're used.
- Aspect ratio matters for composition. Vertical (`9:16`) naturally favours portraits; `21:9` favours landscapes and establishing shots.
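The template above can also be assembled programmatically, e.g. when generating many prompt variants. This helper is our own sketch of the template, not an official utility:

```python
def build_prompt(shot, subject, action, environment, lighting, mood,
                 camera=None, audio=None):
    """Assemble a prompt from the template:
    [Shot type] of [subject] [action] in [environment], [lighting],
    [mood/style], [camera movement], [audio cue]."""
    parts = [f"{shot} of {subject} {action} in {environment}", lighting, mood]
    if camera:
        parts.append(camera)
    if audio:
        parts.append(audio)
    return ", ".join(parts) + "."

prompt = build_prompt("Close-up", "a barista", "pouring latte art",
                      "a sunlit cafe", "soft morning light", "warm and cozy",
                      camera="slow dolly-in", audio="gentle cafe chatter")
print(prompt)
```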
**What is the Seedance 2.0 API?**
The Seedance 2.0 API is ByteDance's video generation API exposed on fal.ai. It generates up to 15-second 720p video with native synchronized audio from text prompts, still images, or multimodal references.

**How do I get started?**
Create a fal.ai account, grab an API key, and call any of the six Seedance 2.0 endpoints listed above via HTTP, the Python client (fal-client), or the JavaScript client (@fal-ai/client).

**What is the difference between the standard and fast tiers?**
Both tiers share the same parameters and features. The fast tier trades a small amount of quality for lower latency and about 20–40% lower cost.

**Does Seedance 2.0 generate audio?**
Yes — natively. `generate_audio` defaults to `true` and covers sound effects, ambient audio, and lip-synced speech. Audio generation is free (same cost whether on or off).

**What resolutions and durations are supported?**
480p or 720p, and 4 to 15 seconds (or `auto` to let the model pick).

**Which aspect ratios are supported?**
21:9, 16:9, 4:3, 1:1, 3:4, 9:16, or auto.

**How much does it cost?**
From $0.14515/sec (fast, reference-to-video with video inputs) up to $0.3034/sec (standard text-to-video). See the pricing table.

**Can I use the output commercially?**
Yes — fal.ai lists Seedance 2.0 as commercial-use. Always re-check the current terms on the model page.

**Is there a queue API for long-running jobs?**
Yes. Use `fal.queue.submit` / `fal.queue.status` / `fal.queue.result` (JS) or the equivalent Python queue helpers for long-running jobs and webhooks. See docs/queue.md.

**How do reference inputs work?**
Upload images, videos, and/or audio, then refer to them by name (`@Image1`, `@Video1`, `@Audio1`) inside your prompt. Seedance 2.0 will condition the generation on them.
- Official model pages on fal.ai
- fal.ai docs
- Python client: `fal-client`
- JavaScript client: `@fal-ai/client`
- fal.ai pricing
PRs welcome! Useful contributions:
- More prompt recipes (cinematic, product ads, anime, lip-sync dialogues).
- Examples in additional languages (Go, Rust, Ruby, PHP, Swift).
- Benchmarks comparing standard vs. fast tier.
- Integrations (Next.js, FastAPI, Cloudflare Workers).
Open an issue or pull request.
MIT — free to use, modify, and distribute. This repo is an unofficial community guide to the Seedance 2.0 API on fal.ai; it is not affiliated with ByteDance or fal.ai.
Keywords: seedance 2.0, seedance 2.0 api, seedance 2 api, bytedance seedance, bytedance video generation, fal.ai seedance, fal seedance 2.0, seedance text to video, seedance image to video, seedance reference to video, seedance fast api, AI video generation API, text to video API, image to video API, lip sync video API, synchronized audio video AI, 720p AI video generation, 15 second AI video, cinematic AI video API.