Kling AI Launches Kling Video 2.6 Model
Kling AI has released the Kling Video 2.6 Model, which generates visuals, natural voiceovers, sound effects, and ambient atmosphere simultaneously in a single pass.
The Kling Video 2.6 Model upgrades both text-to-audio-visual and image-to-audio-visual generation. From text alone, or from images combined with prompts, users can directly generate videos complete with speech, sound effects, and ambient sounds. The model currently supports Chinese and English voice generation and creates video content up to 10 seconds in length.
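The announcement does not specify how the model is invoked programmatically; the sketch below is only an illustration of the text-to-audio-visual workflow described above. The endpoint URL, parameter names, and response shape are assumptions for illustration, not documented values from Kling AI.

```python
import os
import requests

# Hypothetical sketch of a text-to-audio-visual request. The endpoint,
# authentication scheme, and field names below are assumptions, not the
# documented Kling API.
KLING_API_URL = "https://api.example.com/v1/videos"  # placeholder endpoint
API_KEY = os.environ["KLING_API_KEY"]                # assumed bearer-token auth

payload = {
    "model": "kling-video-2.6",
    "prompt": (
        "A street musician plays guitar and sings at dusk, "
        "with crowd ambience and passing traffic in the background"
    ),
    "voice_language": "en",   # announcement: Chinese and English voices supported
    "duration_seconds": 10,   # announcement: clips up to 10 seconds
    "audio": ["speech", "ambient", "sound_effects"],  # generated in the same pass
}

response = requests.post(
    KLING_API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json())  # e.g. a job ID or a URL to the finished clip
```

An image-to-audio-visual request would presumably attach an image reference alongside the prompt, but the exact mechanism is not described in the announcement.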
Leveraging deep semantic alignment between real-world sounds and dynamic visuals, the Kling Video 2.6 Model achieves tight coordination between voice rhythm, ambient sound, and visual motion. It also comprehends textual descriptions, colloquial expressions, and complex storylines across varied scenarios.
The Kling Video 2.6 Model generates stand-alone or combined audio types, including speech, dialogue, narration, singing, rap, ambient sound effects, and mixed sound effects, supporting video content creation for advertising, marketing, social media, and e-commerce. Its multi-character dialogue capability allows creators to produce interviews, scripted performances, and comedy skits, while its music performance capabilities enable singing, rap, and instrumental performances.