nanaban

Image generation from the terminal. Nano Banana (Gemini) and GPT Image. One CLI, one key.

Type a prompt. Get an image. Three seconds, one command, zero browser tabs. nanaban is a CLI for AI image generation that works for humans typing prompts and LLM agents calling --json. It runs on Google's Nano Banana (Gemini) and OpenAI GPT-5 Image — pick whichever, or let nanaban choose based on the keys you have.

Install · Quick Start · Models · Auth · Usage · Agent Mode · Contributing

What It Looks Like

nanaban "cyberpunk tokyo street neon rain" --ar wide nanaban "minimalist single line fox" nanaban "product photo white ceramic mug"

Every image on this page was generated with nanaban. ~3 seconds each, straight from the terminal.

Why This Exists

Most AI image generators make you open a browser, wait in a queue, click through UI, and download manually. That workflow breaks the second you need images inside a script, a CI pipeline, or an agent loop.

nanaban fixes that:

One command -- type your prompt, get a file. No browser, no signup flow, no queue.
Two model families, one CLI -- Nano Banana (Gemini) for the cheap/fast/extended-ratio default; GPT-5 Image for OpenAI's strong text and UI rendering.
Auto-names files -- "a fox in a snowy forest at dawn" becomes fox_snowy_forest_dawn.png. No more image_032_final_v2.png.
Built for scripts -- stdout is always the file path. nanaban "a cat" | xargs open just works.
Built for LLM agents -- --json gives structured output with cost. Plug it into any AI pipeline.
Tiny footprint -- runs TypeScript source directly, no build step.

Install

npm install -g nanaban

Requires Node 18+. That is the only dependency.

From source:

git clone https://github.com/paperfoot/nanaban-cli.git
cd nanaban && npm install && npm link

Quick Start

Pick one path:

Gemini direct (free tier available, fastest for Nano Banana):

# Get a key from https://aistudio.google.com/apikey (~30 seconds)
nanaban auth set AIzaSy...
nanaban "a fox in snow"

OpenRouter (one key for both Nano Banana AND GPT-5 Image):

# Get a key from https://openrouter.ai/keys
export OPENROUTER_API_KEY=sk-or-v1-...
nanaban "a fox in snow"                    # uses Nano Banana
nanaban "a fox in snow" --model gpt5-mini  # uses GPT-5 Image Mini

You only need one key. nanaban detects what's available and routes automatically. Run nanaban auth to see what's reachable.

Models

Id	Family	Best for	Aspect ratios	Sizes	~Cost/img
`nb2` (default)	Gemini Nano Banana 2	Fast, cheap, full ratio range	All + extended (1:4, 4:1, 1:8, 8:1)	0.5K-4K	$0.067
`nb2-pro` (`--pro`)	Gemini Nano Banana Pro	Higher quality detail	Standard 10	1K-4K	$0.136
`gpt5`	OpenAI GPT-5 Image	Strong text/UI rendering	1:1, 2:3, 3:2	1K only	$0.193
`gpt5-mini`	OpenAI GPT-5 Image Mini	Cheaper OpenAI option	1:1, 2:3, 3:2	1K only	$0.041

Costs are typical per-image rates via OpenRouter. Direct Gemini pricing follows Google's published rates.

OpenAI models (gpt5, gpt5-mini) currently ignore non-square aspect ratios — output is always 1024×1024. nanaban accepts the --ar flag for them but the API itself doesn't honor it.

Auth

nanaban detects keys in this order and routes automatically. Any single key is enough.

Key	Reaches	Source
`GEMINI_API_KEY` / `GOOGLE_API_KEY`	nb2, nb2-pro	env var
Stored Gemini key	nb2, nb2-pro	`nanaban auth set <key>`
Gemini OAuth	nb2, nb2-pro	`~/.gemini/oauth_creds.json` + OAuth client creds
`OPENROUTER_API_KEY`	nb2, nb2-pro, gpt5, gpt5-mini	env var
Stored OpenRouter key	All four	`nanaban auth set-openrouter <key>`

When both Gemini direct and OpenRouter are configured, nanaban prefers the direct path (lower latency, no middleman markup). Override with --via openrouter.

Check what's reachable: nanaban auth.

Usage

nanaban "prompt"                          # default: nb2 via best transport
nanaban "prompt" -o sunset.png            # custom filename
nanaban "prompt" --ar wide --size 2k      # 16:9, high resolution
nanaban "prompt" --pro                    # Nano Banana Pro
nanaban "prompt" --model gpt5             # GPT-5 Image (needs OpenRouter)
nanaban "prompt" --model gpt5-mini        # GPT-5 Image Mini
nanaban "prompt" --via openrouter         # force OpenRouter for any model
nanaban "prompt" --neg "blurry, text"     # negative prompt (Gemini only)
nanaban "prompt" -r style.png             # reference image
nanaban edit photo.png "add sunglasses"   # edit existing image

Flags

Flag	What it does	Default
`-o, --output <file>`	Output path	auto from prompt
`--ar <ratio>`	Aspect ratio (see table below)	`1:1`
`--size <size>`	Resolution: `0.5k` `1k` `2k` `4k` (model-dependent)	`1k`
`--pro`	Use Nano Banana Pro (alias for `--model nb2-pro`)	off
`--model <id>`	`nb2`, `nb2-pro`, `gpt5`, `gpt5-mini`	`nb2`
`--via <transport>`	Force `gemini-direct` or `openrouter`	auto
`--neg <text>`	Negative prompt (Gemini only)
`-r, --ref <file>`	Reference image (style/content guidance)
`--open`	Open in default viewer after generating	off
`--json`	Structured JSON output for scripts	off
`--quiet`	Suppress non-essential output	off

Aspect Ratios

14 aspect ratios, from square to extreme panoramic:

Ratio	Shorthand	Good for
`1:1`	`square`	Profile pics, thumbnails
`4:3`		Photos, slides
`3:2`		Classic photo format
`5:4`		Print, posters
`16:9`	`wide`	Hero images, banners, wallpapers
`21:9`	`ultrawide`	Cinematic, ultrawide monitors
`4:1`	`panoramic`	Panoramas, website headers
`8:1`	`banner`	Extreme banners, ribbons
`3:4`		Portrait photos
`2:3`	`portrait`	Book covers, tall posters
`4:5`		Instagram portrait
`9:16`	`tall` / `story`	Phone wallpapers, stories
`1:4`		Tall strips, infographic panels
`1:8`		Extreme vertical banners

Note: 1:4, 4:1, 1:8, 8:1 are NB2-only. NB2 Pro supports the standard 10. GPT-5 Image / Mini support only 1:1, 2:3, 3:2 and the API ignores even those (always returns square). nanaban surfaces capability mismatches as CAPABILITY_UNSUPPORTED errors before any HTTP call.

Reference Images

Pass any image as a style or content reference with -r:

nanaban "portrait of a woman" -r painting_style.png
nanaban "modern living room" -r color_palette.jpg
nanaban "product shot" -r brand_reference.png

The model picks up on the visual language of your reference -- color palette, composition, texture, artistic style -- and applies it to your prompt. Useful for keeping a consistent look across a batch of images, matching brand aesthetics, or steering output toward a specific vibe without writing a 200-word prompt.

Editing Existing Images

nanaban edit photo.png "remove the background"
nanaban edit headshot.png "make it a pencil sketch"
nanaban edit product.png "place on a marble table" --ar wide

Takes a source image and your edit instruction. Same flags apply -- pick a model, change aspect ratio, resolution, or use Pro for finer edits.

For LLM Agents and Scripts

--json gives machine-readable output. No spinners, no colors, no ambiguity:

nanaban "a red circle" --json

{
  "status": "success",
  "file": "/Users/you/red_circle.png",
  "model": "google/gemini-3.1-flash-image-preview-20260226",
  "transport": "openrouter",
  "dimensions": { "width": 1024, "height": 1024 },
  "size_bytes": 1247283,
  "duration_ms": 12400,
  "cost_usd": 0.067
}

cost_usd appears when the transport reports it (currently OpenRouter only).

Errors come back in the same shape:

{
  "status": "error",
  "code": "CAPABILITY_UNSUPPORTED",
  "message": "Nano Banana Pro does not support aspect ratio 1:8. Supported: 1:1, 2:3, 3:2, ..."
}

Error codes: AUTH_MISSING, AUTH_INVALID, AUTH_EXPIRED, PROMPT_MISSING, IMAGE_NOT_FOUND, GENERATION_FAILED, RATE_LIMITED, NETWORK_ERROR, MODEL_NOT_FOUND, TRANSPORT_UNAVAILABLE, CAPABILITY_UNSUPPORTED.

Exit codes: 0 success, 1 runtime error, 2 usage error.

Discover everything machine-readably: nanaban agent-info.

Piping

stdout is always just the file path. Metadata goes to stderr. These compose naturally:

nanaban "a cat" | xargs open                              # generate and open
nanaban "a cat" 2>/dev/null | pbcopy                       # copy path to clipboard
cat prompts.txt | while read p; do nanaban "$p"; done      # batch generate

Auto-naming

Your prompt becomes the filename. Common words get stripped, capped at 6 words, joined with underscores:

"a fox in a snowy forest at dawn" -> fox_snowy_forest_dawn.png

Collisions auto-increment: fox_snowy_forest.png, fox_snowy_forest_2.png, fox_snowy_forest_3.png.

Dependencies

Deliberately small:

@google/genai + google-auth-library -- Gemini API access
commander -- CLI parsing (~90KB)
nanospinner -- terminal spinner (~3KB)
picocolors -- terminal colors (~3KB)
tsx + typescript -- runs TypeScript source directly, no build step
OpenRouter -- plain fetch, no SDK

Contributing

Contributions welcome. See CONTRIBUTING.md for guidelines.

License

MIT

Built by Boris Djordjevic at 199 Biotechnologies | Paperfoot AI

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
bin		bin
examples		examples
src		src
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SKILL.md		SKILL.md
gemini-image.ts		gemini-image.ts
generate-image.ts		generate-image.ts
nanaban_logo.png		nanaban_logo.png
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

nanaban

What It Looks Like

Why This Exists

Install

Quick Start

Models

Auth

Usage

Flags

Aspect Ratios

Reference Images

Editing Existing Images

For LLM Agents and Scripts

Piping

Auto-naming

Dependencies

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

nanaban

What It Looks Like

Why This Exists

Install

Quick Start

Models

Auth

Usage

Flags

Aspect Ratios

Reference Images

Editing Existing Images

For LLM Agents and Scripts

Piping

Auto-naming

Dependencies

Contributing

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages