TongFlow v0.1.0 is now public on GitHub at tong-io/tongflow, under AGPL-3.0. One Docker command, your own machine, every modality on one canvas.

What it is

TongFlow is a multi-modal AIGC studio built around a single idea: every AI model is a modality transform. A text-to-image model is text → image. A speech recognizer is audio → text. A 3D generator is image → 3D. Wrap each one as a node with typed inputs and outputs, drop them onto an infinite canvas, and you have a creative pipeline you can see, edit, and share.
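The "model as modality transform" idea can be sketched in a few lines. This is an illustration only, not TongFlow's code: the names `Modality` and `TransformNode`, and the stub lambdas that stand in for real models, are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple


class Modality(Enum):
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"
    VIDEO = "video"
    MODEL_3D = "3d"


@dataclass(frozen=True)
class TransformNode:
    """A model wrapped as a node: a name, typed inputs, one typed output."""
    name: str
    inputs: Tuple[Modality, ...]
    output: Modality
    run: Callable[..., object]  # stand-in for the real model call

    def accepts(self, *modalities: Modality) -> bool:
        # A node can be wired only when the offered modalities match its inputs.
        return tuple(modalities) == self.inputs


# Hypothetical nodes; the lambdas are stubs, not real models.
text_to_image = TransformNode(
    "text-to-image", (Modality.TEXT,), Modality.IMAGE,
    lambda text: f"<image of {text!r}>",
)
speech_to_text = TransformNode(
    "audio-to-text", (Modality.AUDIO,), Modality.TEXT,
    lambda audio: "<transcript>",
)
image_to_3d = TransformNode(
    "image-to-3D", (Modality.IMAGE,), Modality.MODEL_3D,
    lambda image: "<mesh>",
)

print(text_to_image.accepts(Modality.TEXT))   # type check before wiring
print(text_to_image.run("a red fox"))
```

Because every node declares its input and output types, a canvas can decide which connections are valid without knowing anything about the model behind the node.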

Three verbs cover the whole interface:

Add materials onto the canvas: text, image, photo, sketch, audio, video, document, URL, or 3D model.
Transform them between modalities: text-to-image, image-to-video, audio-to-text, image-to-3D, and so on.
Combine the results: image fusion, lip sync, voice cloning, character swap, motion transfer.

No complex parameter panels, no manual node wiring. Drop something on the canvas, pick the next step from the Smart Island, and the connection is made for you.
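One way to picture how a "pick the next step" menu could work: filter the available transforms by the modality of whatever is already on the canvas. The sketch below assumes a hypothetical `CATALOG` table and `next_steps` function; it does not describe how the Smart Island is actually implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical catalog: transform name -> (input modality, output modality).
CATALOG: Dict[str, Tuple[str, str]] = {
    "text-to-image": ("text", "image"),
    "image-to-video": ("image", "video"),
    "audio-to-text": ("audio", "text"),
    "image-to-3D": ("image", "3d"),
}


def next_steps(modality: str) -> List[str]:
    """Return the transforms that can consume the given modality."""
    return sorted(name for name, (src, _dst) in CATALOG.items() if src == modality)


print(next_steps("image"))  # transforms offered for an image on the canvas
```

Since the choice already names both ends of the connection, there is nothing left for the user to wire by hand.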

What you can build

A few patterns the v0.1.0 graph already supports end-to-end:

Talking-head videos — script → speech → image → lip-synced video, all on one canvas.
Short films from a paragraph — text → scene images → image-to-video → concatenated cut.
E-commerce visuals at scale — drop a product photo and a reference, run image fusion across a batch, get clean variants.
Original music from a prompt — ACE-Step turns a text description into a finished track.
AI comics and shorts — story prompt → panel images → arrangement → optional voiceover.
Character animation — bring a still character into motion using motion-transfer or character-swap nodes.
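The talking-head pattern above can be read as a small typed graph: speech synthesis turns the script into audio, an image step supplies the face, and lip sync combines audio and image into video. A minimal sketch of checking such a chain follows; `TALKING_HEAD` and `check_pipeline` are illustrative names, not part of TongFlow.

```python
from typing import Iterable, List, Set, Tuple

Step = Tuple[str, Tuple[str, ...], str]  # (name, input modalities, output modality)

# Hypothetical description of the talking-head pipeline.
TALKING_HEAD: List[Step] = [
    ("speech-synthesis", ("text",), "audio"),
    ("text-to-image", ("text",), "image"),
    ("lip-sync", ("audio", "image"), "video"),
]


def check_pipeline(materials: Iterable[str], steps: List[Step]) -> Set[str]:
    """Walk the steps in order and return every modality available at the end.

    Raises ValueError if a step needs a modality that nothing has produced yet.
    """
    available = set(materials)
    for name, inputs, output in steps:
        missing = [m for m in inputs if m not in available]
        if missing:
            raise ValueError(f"{name} is missing inputs: {missing}")
        available.add(output)
    return available


print(sorted(check_pipeline({"text"}, TALKING_HEAD)))
```

Starting from a text script alone, the check passes and ends with a video; starting from audio alone, it fails at the first step because no text is available.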

What’s in v0.1.0

7 input types on the canvas: text, image, audio, video, document, URL, 3D model.
Image transforms: generation, edit, captioning / Q&A, upscale, image-to-3D.
Video transforms: text-to-video, image-to-video, first/last-frame interpolation, description, upscale, frame extraction, subtitle removal, watermark removal.
Audio transforms: music generation, speech synthesis (preset / voice clone / instruction), speech recognition, noise reduction, speaker diarization, voice replacement.
Combine nodes: image fusion, lip sync (audio+image→video, audio+video→video, audio+text→video), voice cloning, character swap, motion transfer, text merging.
Backend models, named, not hidden: Z-Image, FLUX.2 Klein 9B, LTX-2, SeedVR2, Gemma 4, Qwen3, ACE-Step.
Extensible by design: new transforms plug in via the ABI (config/tongflow.abi.json) and the plugin scanner. Add your own model, your own slot, your own workflow.
Self-hostable: clone the repo, then start everything with one Docker command (docker compose up).
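To give a feel for what "plug in a new transform" can mean in general, here is a generic registry pattern in Python. It is purely illustrative: the schema of config/tongflow.abi.json and the plugin scanner's behavior are not described in this post, and `REGISTRY`, `Transform`, and `register` are hypothetical names that make no claim about TongFlow's real extension API.

```python
from typing import Callable, Dict, NamedTuple, Sequence, Tuple


class Transform(NamedTuple):
    inputs: Tuple[str, ...]
    output: str
    run: Callable[..., object]


# Hypothetical in-process registry; a scanner would fill this at startup.
REGISTRY: Dict[str, Transform] = {}


def register(name: str, inputs: Sequence[str], output: str):
    """Decorator that adds a function to the registry under a unique name."""
    def decorator(fn: Callable[..., object]) -> Callable[..., object]:
        if name in REGISTRY:
            raise ValueError(f"transform {name!r} is already registered")
        REGISTRY[name] = Transform(tuple(inputs), output, fn)
        return fn
    return decorator


@register("text-merge", inputs=["text", "text"], output="text")
def merge_text(first: str, second: str) -> str:
    # Toy stand-in for a text-merging node.
    return f"{first}\n{second}"


print(sorted(REGISTRY))
```

The point of the pattern is that adding a node touches only the new function and its declaration, which is the kind of contained change an ABI plus scanner is meant to allow.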

Privacy by construction

No accounts. No central CDN. No telemetry. Your workflows and uploaded files live in a local SQLite database and on local disk — under your control, on your machine. The studio talks to exactly two outside services: Modal for GPU workers and one LLM provider of your choice (OpenRouter, Gemini, OpenAI, or DeepSeek). You bring the API keys; nothing routes through us.
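As a rough picture of what "workflows live in a local SQLite database" implies, the sketch below stores and reloads a workflow with Python's standard sqlite3 module. The table name, columns, and JSON layout are assumptions made for the example, not TongFlow's actual schema.

```python
import json
import sqlite3
from typing import Optional


def save_workflow(conn: sqlite3.Connection, name: str, graph: dict) -> None:
    """Store a workflow graph as JSON text, keyed by name (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workflows "
        "(name TEXT PRIMARY KEY, graph TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO workflows (name, graph) VALUES (?, ?)",
        (name, json.dumps(graph)),
    )
    conn.commit()


def load_workflow(conn: sqlite3.Connection, name: str) -> Optional[dict]:
    """Return the stored graph, or None if no workflow has that name."""
    row = conn.execute(
        "SELECT graph FROM workflows WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# ":memory:" keeps the demo self-contained; a file path would persist to local disk.
conn = sqlite3.connect(":memory:")
save_workflow(conn, "demo", {"nodes": ["text", "text-to-image"]})
print(load_workflow(conn, "demo"))
```

Everything here is an ordinary file on the machine running the studio, so backing up or deleting your data is a matter of copying or removing that file.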

How to try it

git clone https://github.com/tong-io/tongflow
cd tongflow
docker compose up

You’ll need:

Docker (Compose v2)
A Modal account and token — the free tier ($30/month credit) includes plenty of GPU time for everyday work
One LLM API key: OpenRouter, Gemini, OpenAI, or DeepSeek

Set the env vars from .env.example, then open http://localhost:3000. The Getting Started doc walks through the first workflow.
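Before opening the browser, it can help to confirm that every variable listed in .env.example has a value. The helper below is a small, generic sketch that compares two dotenv-style texts. It assumes your values live in a dotenv-style file such as .env, which this post does not confirm, and the variable names in the demo are placeholders rather than TongFlow's real ones.

```python
from typing import List, Set


def env_keys(text: str, require_value: bool = False) -> Set[str]:
    """Collect KEY names from dotenv-style text, skipping comments and blanks."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if require_value and not value.strip():
            continue  # key is present but still empty
        keys.add(key.strip())
    return keys


def missing_keys(example_text: str, env_text: str) -> List[str]:
    """Keys named in the example that have no value in the filled-in text."""
    return sorted(env_keys(example_text) - env_keys(env_text, require_value=True))


# Placeholder names for the demo, not TongFlow's real variables.
example = "# required settings\nGPU_TOKEN=\nLLM_API_KEY=\n"
filled = "GPU_TOKEN=abc123\nLLM_API_KEY=\n"
print(missing_keys(example, filled))
```

To run it against real files, read them with pathlib, for example `Path(".env.example").read_text()` and `Path(".env").read_text()`, and pass the two strings to `missing_keys`.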

Or skip the install and use the hosted studio at https://app.tongflow.com — same canvas, same nodes, ready in the browser.

Join the community

GitHub Issues: feature requests, ideas, bug reports — github.com/tong-io/tongflow/issues.
Pull requests welcome: see CONTRIBUTING.md. The plugin scanner and ABI make adding new transform nodes a contained change.
Discord: real-time chat at discord.gg/K7V8az94Zf.

If you build something with TongFlow, we’d love to see it. And if the project is useful to you, a star on the repo makes a real difference.
