amrrs/seedance-2.0-api

Seedance 2.0 API — ByteDance Text-to-Video, Image-to-Video & Reference-to-Video (fal.ai)

The complete, open-source guide to the Seedance 2.0 API by ByteDance on fal.ai — with working Python, Node.js (JavaScript/TypeScript), and cURL examples for every endpoint, pricing, schemas, prompt tips, and FAQs.


Seedance 2.0 is ByteDance's state-of-the-art video generation model available via the fal.ai API. It produces cinematic 720p video up to 15 seconds with native synchronized audio, lip-sync speech, multi-shot editing, real-world physics, and director-level camera control from text, image, or reference inputs.

This repo collects every Seedance 2.0 endpoint (standard and fast tiers), fully documented with copy-paste examples — the fastest way to start building with the Seedance 2.0 API today.




What is Seedance 2.0?

Seedance 2.0 (by ByteDance, served on fal.ai) is a next-generation AI video generation model. Compared to first-generation text-to-video models, Seedance 2.0 delivers:

  • Native synchronized audio — sound effects, ambient audio, and lip-synced speech generated together with video (not post-hoc dubbed).
  • Multi-shot editing — a single prompt can produce multiple camera cuts within the same clip.
  • Director-level camera control — pans, dollies, crane shots, dutch angles, and more, driven by natural language.
  • Real-world physics — cloth, fluids, lighting, and object interactions behave plausibly.
  • Multimodal inputs — combine text prompts with up to 9 reference images, 3 reference videos, and 3 audio clips (reference-to-video).
  • Up to 15-second clips at 480p or 720p, in aspect ratios from 9:16 (vertical/Reels/Shorts/TikTok) to 21:9 (ultrawide cinematic).

Two tiers are available on fal.ai: the standard tier (highest quality) and the fast tier (lower latency and cost, same API surface).


All Seedance 2.0 API Endpoints

| # | Endpoint | Model ID | Input | Audio |
|---|----------|----------|-------|-------|
| 1 | Text to Video | bytedance/seedance-2.0/text-to-video | text prompt | yes |
| 2 | Image to Video | bytedance/seedance-2.0/image-to-video | text + start (and optional end) image | yes |
| 3 | Reference to Video | bytedance/seedance-2.0/reference-to-video | text + up to 9 images / 3 videos / 3 audio clips | yes |
| 4 | Fast Text to Video | bytedance/seedance-2.0/fast/text-to-video | text prompt | yes |
| 5 | Fast Image to Video | bytedance/seedance-2.0/fast/image-to-video | text + start (and optional end) image | yes |
| 6 | Fast Reference to Video | bytedance/seedance-2.0/fast/reference-to-video | text + up to 9 images / 3 videos / 3 audio clips | yes |

Base HTTP endpoint: https://fal.run/<model-id> — e.g. https://fal.run/bytedance/seedance-2.0/fast/text-to-video.


Seedance 2.0 API Pricing

Pricing is per second of generated video at 720p on fal.ai. Cost does not change whether you generate audio or not.

| Endpoint | 720p price / sec | Token price (per 1K tokens) |
|----------|------------------|-----------------------------|
| Text to Video | $0.3034 | $0.0140 |
| Image to Video | $0.3024 | $0.0140 |
| Reference to Video (images only) | $0.3024 | $0.0140 |
| Reference to Video (with video inputs) | $0.1814 | $0.0140 |
| Fast — Text to Video | $0.2419 | $0.0112 |
| Fast — Image to Video | $0.2419 | $0.0112 |
| Fast — Reference to Video (images only) | $0.2419 | $0.0112 |
| Fast — Reference to Video (with video inputs) | $0.14515 | $0.0112 |

Tokens are computed as (height * width * duration * 24) / 1024. For reference-to-video, input + output duration both count. See the official fal.ai pricing page for the latest numbers.
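As a sanity check, the token formula above can be turned into a small cost estimator. This is a sketch using the prices from the table above; treat them as a snapshot, not a source of truth.

```python
def seedance_tokens(height: int, width: int, duration_s: float) -> float:
    """Tokens = (height * width * duration * 24) / 1024, per the formula above."""
    return (height * width * duration_s * 24) / 1024


def estimated_cost_usd(height: int, width: int, duration_s: float,
                       price_per_1k_tokens: float) -> float:
    """Estimated job cost given a per-1K-token price from the pricing table."""
    return seedance_tokens(height, width, duration_s) / 1000 * price_per_1k_tokens


# A 6-second 720p 16:9 clip (1280 x 720) at the $0.0140 / 1K-token rate:
tokens = seedance_tokens(720, 1280, 6)          # 129,600 tokens
cost = estimated_cost_usd(720, 1280, 6, 0.0140)
print(f"{tokens:,.0f} tokens -> ${cost:.4f}")   # consistent with $0.3024/sec * 6 s
```

Note that 129,600 tokens at $0.0140 per 1K works out to $1.8144, which matches the $0.3024/sec image-to-video rate times 6 seconds.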


Quick Start

1. Get a fal.ai API key

Sign up at fal.ai, create a key at fal.ai/dashboard/keys, and export it:

export FAL_KEY="your_fal_api_key_here"

Or copy .env.example to .env and fill in the value.

Python Quick Start

pip install -r requirements.txt
python examples/python/text_to_video.py

import fal_client

result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/text-to-video",
    arguments={
        "prompt": "A cinematic shot of a hummingbird drinking from a neon flower at dusk, 4k, slow motion.",
        "resolution": "720p",
        "duration": "6",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
)
print(result["video"]["url"])

JavaScript / Node.js Quick Start

cd examples/javascript
npm install
node text_to_video.mjs

import { fal } from "@fal-ai/client";

const result = await fal.subscribe("bytedance/seedance-2.0/fast/text-to-video", {
  input: {
    prompt: "A cinematic shot of a hummingbird drinking from a neon flower at dusk, 4k, slow motion.",
    resolution: "720p",
    duration: "6",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
});
console.log(result.data.video.url);

cURL Quick Start

curl --request POST \
  --url https://fal.run/bytedance/seedance-2.0/fast/text-to-video \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "prompt": "A cinematic shot of a hummingbird drinking from a neon flower at dusk, 4k, slow motion.",
    "resolution": "720p",
    "duration": "6",
    "aspect_ratio": "16:9",
    "generate_audio": true
  }'

Full runnable scripts live in examples/.


Endpoint Reference

1. Text to Video

Generate a video from a pure text prompt.

Model IDs

  • Standard: bytedance/seedance-2.0/text-to-video
  • Fast: bytedance/seedance-2.0/fast/text-to-video

Input

| Field | Type | Required | Default | Notes |
|-------|------|----------|---------|-------|
| prompt | string | yes | | Your text prompt. |
| resolution | enum | no | 720p | 480p or 720p. |
| duration | enum | no | auto | auto, or "4" to "15" (seconds). |
| aspect_ratio | enum | no | auto | 21:9, 16:9, 4:3, 1:1, 3:4, 9:16, auto. |
| generate_audio | bool | no | true | Native SFX/ambient/speech audio. |
| seed | int | no | | For reproducibility. |
| end_user_id | string | no | | Attribute usage to an end user. |

Examples: examples/python/text_to_video.py, examples/javascript/text_to_video.mjs, examples/curl/text_to_video.sh.

2. Image to Video

Animate a still image into a cinematic clip. Supports optional end-frame control for clean transitions.

Model IDs

  • Standard: bytedance/seedance-2.0/image-to-video
  • Fast: bytedance/seedance-2.0/fast/image-to-video

Additional Input

| Field | Type | Required | Notes |
|-------|------|----------|-------|
| image_url | string | yes | Starting frame. JPEG/PNG/WebP, ≤ 30 MB. |
| end_image_url | string | no | Optional ending frame for smooth transitions. |

All other parameters match text-to-video. Examples: examples/python/image_to_video.py, examples/javascript/image_to_video.mjs, examples/curl/image_to_video.sh.
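As a sketch of what that looks like in Python (the image URLs below are hypothetical placeholders; the full runnable scripts live in examples/), an image-to-video call is the text-to-video quick start plus an image_url:

```python
import fal_client

# NOTE: both URLs below are hypothetical placeholders. Point them at real,
# publicly reachable JPEG/PNG/WebP files (each <= 30 MB).
result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/image-to-video",
    arguments={
        "prompt": "The portrait slowly comes to life and smiles, soft window light.",
        "image_url": "https://example.com/start-frame.jpg",
        "end_image_url": "https://example.com/end-frame.jpg",  # optional
        "resolution": "720p",
        "duration": "6",
        "generate_audio": True,
    },
)
print(result["video"]["url"])
```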

3. Reference to Video

The most powerful mode. Provide up to 9 images, 3 videos, and 3 audio clips as references and mention them in the prompt as @Image1, @Video1, @Audio1, etc.

Model IDs

  • Standard: bytedance/seedance-2.0/reference-to-video
  • Fast: bytedance/seedance-2.0/fast/reference-to-video

Additional Input

| Field | Type | Required | Notes |
|-------|------|----------|-------|
| image_urls | string[] | no | Up to 9. JPEG/PNG/WebP, ≤ 30 MB each. |
| video_urls | string[] | no | Up to 3. MP4/MOV, combined 2–15 s, ≤ 50 MB, ~480p–720p. |
| audio_urls | string[] | no | Up to 3. MP3/WAV, combined ≤ 15 s, ≤ 15 MB each. Requires at least one reference image or video. |

Total files across all modalities must not exceed 12. Examples: examples/python/reference_to_video.py, examples/javascript/reference_to_video.mjs, examples/curl/reference_to_video.sh.
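A minimal sketch of a reference-to-video call (all media URLs below are hypothetical placeholders; see examples/ for runnable scripts):

```python
import fal_client

# NOTE: the media URLs are hypothetical placeholders. Respect the limits
# above: <= 9 images, <= 3 videos, <= 3 audio clips, <= 12 files in total.
result = fal_client.subscribe(
    "bytedance/seedance-2.0/reference-to-video",
    arguments={
        # @Image1 / @Video1 / @Audio1 refer to positions in the lists below.
        "prompt": "@Image1 walks down the rainy street from @Video1 while @Audio1 plays.",
        "image_urls": ["https://example.com/character.png"],
        "video_urls": ["https://example.com/street.mp4"],
        "audio_urls": ["https://example.com/theme.mp3"],
        "resolution": "720p",
    },
)
print(result["video"]["url"])
```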

4. Fast Tier Endpoints

The fast tier has an identical API surface to the standard tier — just swap the model ID. Use it when you want lower latency and cost and can accept a small quality tradeoff.

| Instead of... | Use... |
|---------------|--------|
| bytedance/seedance-2.0/text-to-video | bytedance/seedance-2.0/fast/text-to-video |
| bytedance/seedance-2.0/image-to-video | bytedance/seedance-2.0/fast/image-to-video |
| bytedance/seedance-2.0/reference-to-video | bytedance/seedance-2.0/fast/reference-to-video |

Common Parameters

These parameters work identically across every Seedance 2.0 endpoint:

  • resolution: 480p (faster, cheaper) or 720p (default, balanced).
  • duration: "auto" or an integer-as-string from "4" to "15" seconds.
  • aspect_ratio: 21:9, 16:9, 4:3, 1:1, 3:4, 9:16, or auto. Use 9:16 for TikTok/Reels/Shorts, 16:9 for YouTube/landscape, 21:9 for cinematic.
  • generate_audio: true by default. Audio costs nothing extra (same price whether on or off).
  • seed: integer for reproducibility.
  • end_user_id: optional string to attribute usage per end user.

Output is the same for every endpoint:

{
  "video": { "url": "https://.../output.mp4" },
  "seed": 42
}

Prompting Guide for Seedance 2.0

Seedance 2.0 responds very well to cinematographic language. Use this template:

[Shot type] of [subject] [action] in [environment], [lighting], [mood/style], [camera movement], [audio cue].

Examples

  • Wide aerial tracking shot of a lone cyclist riding through a neon-lit Tokyo alley at night, volumetric rain, cinematic lighting, slow push-in, rain and distant thunder.
  • Close-up of a chef flambéing a pan in a rustic kitchen, golden hour through the window, shallow depth of field, handheld camera, sizzling oil and jazz music.
  • Multi-shot: establishing drone shot of an alpine lake at sunrise, cut to a hiker sipping coffee, cut to a hawk taking flight. Ambient birdsong and wind.
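The template can be expressed as a tiny helper if you are assembling prompts programmatically. The parameter names below simply mirror the bracketed slots; everything else is up to you.

```python
def seedance_prompt(shot: str, subject: str, action: str, environment: str,
                    lighting: str, mood: str, camera: str, audio: str) -> str:
    """Fill the template: [Shot type] of [subject] [action] in [environment],
    [lighting], [mood/style], [camera movement], [audio cue]."""
    return (f"{shot} of {subject} {action} in {environment}, "
            f"{lighting}, {mood}, {camera}, {audio}.")


print(seedance_prompt(
    "Close-up", "a chef", "flambeing a pan", "a rustic kitchen",
    "golden hour through the window", "shallow depth of field",
    "handheld camera", "sizzling oil and jazz music",
))
```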

Tips

  • Be specific about motion. "Slow dolly-in", "whip pan", "crane rise", "handheld follow" — Seedance 2.0 understands them.
  • Mention audio explicitly if you want specific SFX ("footsteps on gravel", "distant thunder", "crowd cheering").
  • For lip-sync, add the exact line of dialogue in quotes and describe the speaker's emotion.
  • Reference inputs: in reference-to-video, refer to references as @Image1, @Video2, @Audio1 inside the prompt to direct how they're used.
  • Aspect ratio matters for composition. Vertical (9:16) naturally favours portraits; 21:9 favours landscapes and establishing shots.

FAQ

What is the Seedance 2.0 API?

The Seedance 2.0 API is ByteDance's video generation API exposed on fal.ai. It generates up to 15-second 720p video with native synchronized audio from text prompts, still images, or multimodal references.

How do I access Seedance 2.0?

Create a fal.ai account, grab an API key, and call any of the six Seedance 2.0 endpoints listed above via HTTP, the Python client (fal-client), or the JavaScript client (@fal-ai/client).

What's the difference between standard and fast tier?

Both tiers share the same parameters and features. The fast tier trades a small amount of quality for lower latency and about 20–40% lower cost.

Does Seedance 2.0 generate audio?

Yes — natively. generate_audio defaults to true and covers sound effects, ambient audio, and lip-synced speech. Audio generation is free (same cost whether on or off).

What resolutions and durations are supported?

480p or 720p, and 4 to 15 seconds (or auto to let the model pick).

What aspect ratios does Seedance 2.0 support?

21:9, 16:9, 4:3, 1:1, 3:4, 9:16, or auto.

How much does the Seedance 2.0 API cost?

From $0.14515/sec (fast, reference-to-video with video inputs) up to $0.3034/sec (standard text-to-video). See the pricing table.

Can I use Seedance 2.0 commercially?

Yes — fal.ai lists Seedance 2.0 as commercial-use. Always re-check the current terms on the model page.

Is there a webhook / queue API?

Yes. Use fal.queue.submit / fal.queue.status / fal.queue.result (JS) or the equivalent Python queue helpers for long-running jobs and webhooks. See docs/queue.md.
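A sketch of the queue flow with the Python client, assuming fal-client's submit helper and request handle (the webhook URL is a hypothetical placeholder; see docs/queue.md for the authoritative version):

```python
import fal_client

# Submit without blocking; the job runs in fal.ai's queue.
handler = fal_client.submit(
    "bytedance/seedance-2.0/text-to-video",
    arguments={"prompt": "A time-lapse of clouds rolling over a mountain ridge."},
    # webhook_url="https://example.com/fal-webhook",  # hypothetical endpoint
)
print("request id:", handler.request_id)

# Poll (or rely on the webhook) and fetch the result when it is ready.
result = handler.get()  # blocks until the job completes
print(result["video"]["url"])
```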

How does reference-to-video work?

Upload images, videos, and/or audio, then refer to them by name (@Image1, @Video1, @Audio1) inside your prompt. Seedance 2.0 will condition the generation on them.




Contributing

PRs welcome! Useful contributions:

  • More prompt recipes (cinematic, product ads, anime, lip-sync dialogues).
  • Examples in additional languages (Go, Rust, Ruby, PHP, Swift).
  • Benchmarks comparing standard vs. fast tier.
  • Integrations (Next.js, FastAPI, Cloudflare Workers).

Open an issue or pull request.


License

MIT — free to use, modify, and distribute. This repo is an unofficial community guide to the Seedance 2.0 API on fal.ai; it is not affiliated with ByteDance or fal.ai.


Keywords: seedance 2.0, seedance 2.0 api, seedance 2 api, bytedance seedance, bytedance video generation, fal.ai seedance, fal seedance 2.0, seedance text to video, seedance image to video, seedance reference to video, seedance fast api, AI video generation API, text to video API, image to video API, lip sync video API, synchronized audio video AI, 720p AI video generation, 15 second AI video, cinematic AI video API.
