ByteDance unveils 5 AI models with 30-second video at 80% lower cost

ByteDance shattered the 30-second barrier in AI video generation and matched Claude Opus 4.7 in coding at one-fifth the price, unveiling five new models at its annual FORCE conference on June 23.

"Seedance 2.5 is the first video generation model to produce native 30-second clips from a single prompt, with scene changes and tempo shifts built in," Tan Dai, president of Volcano Engine, ByteDance's cloud business, said at the conference in Beijing. "It can accept up to 50 multimodal reference inputs simultaneously — images, audio, 3D models — and supports localized editing after generation without degrading visual consistency."

The centerpiece of the release, Seedance 2.5, generates single video clips up to 30 seconds long at native 4K resolution with 10-bit color depth, a leap from the 15-second ceiling that has constrained most AI video tools. The model also introduces 3D white-model pre-visualization, a feature inspired by a film director's request during a collaboration with ByteDance, according to CEO Liang Rubo. Users can edit individual elements such as backgrounds or products after generation without regenerating the entire clip, a capability ByteDance demonstrated by swapping lipstick shades in a commercial without altering the scene. The model is expected to launch in early July.

The competitive stakes extend well beyond video. Doubao 2.1 Pro, ByteDance's flagship language model, scored 59.8 on the SciCode scientific reasoning benchmark, surpassing both Claude Opus 4.7 and GPT-5.5, and scored 47 on the NL2Repo repository-level code generation benchmark, ahead of GPT-5.5 and Gemini 3.1. Its pricing of 6 yuan ($0.83) per million input tokens and 30 yuan ($4.14) per million output tokens represents an approximately 80 percent cost reduction versus Anthropic's Claude Opus series, according to Volcano Engine. A turbo variant priced at half the Pro rate targets high-frequency enterprise workloads.
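As a rough illustration of what those list prices imply, the Python sketch below totals the bill for a hypothetical workload. Only the 6-yuan and 30-yuan prices and the 80 percent figure come from the announcement; the workload size, the function name and the assumption that the turbo discount applies equally to input and output tokens are illustrative.

```python
from decimal import Decimal

# Per-million-token prices in yuan, as quoted for Doubao 2.1 Pro.
PRO_INPUT_YUAN = Decimal("6")
PRO_OUTPUT_YUAN = Decimal("30")

# Assumption: "half the Pro tier" halves both input and output prices.
TURBO_FACTOR = Decimal("0.5")

# Volcano Engine's claimed discount versus the Claude Opus series.
CLAIMED_DISCOUNT = Decimal("0.80")

MILLION = Decimal("1000000")


def cost_yuan(input_tokens: int, output_tokens: int,
              factor: Decimal = Decimal("1")) -> Decimal:
    """Return the cost in yuan of a workload at the quoted list prices."""
    input_cost = Decimal(input_tokens) / MILLION * PRO_INPUT_YUAN
    output_cost = Decimal(output_tokens) / MILLION * PRO_OUTPUT_YUAN
    return (input_cost + output_cost) * factor


# Hypothetical workload: 200 million input tokens, 50 million output tokens.
pro = cost_yuan(200_000_000, 50_000_000)
turbo = cost_yuan(200_000_000, 50_000_000, TURBO_FACTOR)

# Comparison price implied by the discount claim: price / (1 - discount).
implied_rival_input = PRO_INPUT_YUAN / (Decimal("1") - CLAIMED_DISCOUNT)

print(f"Pro: {pro} yuan, turbo: {turbo} yuan")
print(f"Implied rival input price: {implied_rival_input} yuan per million tokens")
```

On these assumptions the workload costs 2,700 yuan on the Pro tier and 1,350 yuan on the turbo tier, and an 80 percent discount implies a comparison price of about 30 yuan per million input tokens, which is the "one-fifth the price" claim restated.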

The full-stack AI offensive

ByteDance did not stop at text and video. The company also previewed Seedream 5.0 Pro for image generation, which adds interactive editing — users can draw arrows or circle regions to modify specific elements — and multi-layer separation that recursively splits image layers while auto-filling backgrounds. The model supports high-density text layouts in more than 10 languages, including English, Spanish, Arabic and Japanese, with culturally adapted typography.

On the audio front, Doubao Audio Generation Model 1.0 generates complete cinematic soundtracks from text alone, automatically inferring character voice characteristics, emotional delivery, dialect accents, background ambiance and sound effects in a single pass. A demo showed a nearly one-minute martial arts sequence with consistent character voices, rain ambience and weapon clash sounds — all model-generated without manual layering.

Seedance 2.0, the predecessor model released in February, received a native 4K upgrade as part of the announcement.

Commercial traction and enterprise adoption

Volcano Engine now commands 49.5 percent of China's public cloud market, Tan said. Daily token calls across ByteDance's Doubao model family have reached 180 trillion, a 1,500-fold increase from two years ago and a tenfold increase in the past year alone. The number of enterprise customers consuming more than 1 trillion tokens annually has doubled to 200 since December.
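The growth multiples can be unwound into approximate historical volumes. The short Python sketch below does that arithmetic; the variable names are illustrative, and because the quoted multiples are rounded, the derived figures are estimates rather than numbers Volcano Engine reported.

```python
# Figures quoted by Volcano Engine at the conference.
DAILY_TOKENS_NOW = 180 * 10**12   # 180 trillion daily token calls
GROWTH_TWO_YEARS = 1_500          # multiple versus two years ago
GROWTH_ONE_YEAR = 10              # multiple versus one year ago

# Derived estimates (assumes the quoted multiples are exact).
daily_two_years_ago = DAILY_TOKENS_NOW // GROWTH_TWO_YEARS
daily_one_year_ago = DAILY_TOKENS_NOW // GROWTH_ONE_YEAR
implied_first_year_growth = GROWTH_TWO_YEARS // GROWTH_ONE_YEAR

print(f"Two years ago: {daily_two_years_ago:,} tokens per day")
print(f"One year ago:  {daily_one_year_ago:,} tokens per day")
print(f"Implied growth in the earlier year: {implied_first_year_growth}x")
```

Taken at face value, the figures imply roughly 120 billion daily tokens two years ago and 18 trillion a year ago, meaning growth slowed from about 150-fold in the earlier year to tenfold in the most recent one.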

ByteDance also launched an AI copyright commercialization platform, with Hong Kong filmmaker Stephen Chow as its first partner. Users can remix clips from Chow's classics, including "The God of Cookery" and "CJ7," using official templates on Douyin, Jimeng and Jianying. They generated more than 10,000 creations on the first day, according to Tan.

Enterprise adoption spans multiple industries. Tesla has integrated Doubao for voice-based vehicle controls across its full lineup, using ByteDance's real-time speech model. Mercedes-Benz's new electric CLA also embeds Doubao for natural-language interaction and emotional recognition. In financial services, CICC built a digital investment advisor agent on ByteDance's HiAgent platform, distilling research from more than 300 analysts. China Mobile jointly launched a confidential model service for government and financial clients using domestic computing infrastructure.

What it means for investors

ByteDance's full-stack AI release — spanning text, video, image and audio — signals a pricing and capability war that pressures both Western AI leaders and Chinese rivals. Doubao 2.1 Pro's coding parity with Claude Opus 4.7 at 80 percent lower cost compresses margins for premium-tier model providers, while Seedance 2.5's 30-second generation capability leapfrogs OpenAI's Sora and other competitors that remain capped at 15 to 20 seconds. The company's 49.5 percent public cloud market share and 180 trillion daily token calls suggest enterprise adoption is accelerating faster than most analysts projected. For investors tracking the AI infrastructure buildout, ByteDance's ability to bundle models across modalities at aggressive price points — combined with its distribution through Douyin, Jimeng and Jianying — creates a vertically integrated competitor that rivals the scale of any Western AI platform.

This article is for informational purposes only and does not constitute investment advice.