HunyuanCustom introduces single-image video deepfakes with integrated audio and lip synchronization.
Tencent has unveiled HunyuanCustom, a multimodal model for video customization that challenges the dominance of competitors such as Kling. The new release, an extension of the Hunyuan Video project, presents substantial improvements in video customization capabilities and lip-sync accuracy.
HunyuanCustom allows users to generate high-quality videos from a single image, offering unprecedented flexibility and control in the realm of video synthesis. The system demonstrates impressive performance in video customization, seamlessly integrating subjects with backgrounds while avoiding boundary artifacts and maintaining strong identity preservation.
One of the standout features of HunyuanCustom is its audio-driven human customization, which enables characters to speak in scenes and postures described by text. Weak lip sync has been a common criticism of previous video synthesis tools, and HunyuanCustom targets this weakness directly, improving the authenticity of generated videos.
The lip-sync feature, made possible by HunyuanCustom's audio-driven customization, is expected to set new standards in the industry, given how few current tools deliver accurate lip-syncing in complex scenes. While Kling, a commercial video diffusion API, does not specifically prioritize advanced lip-sync capabilities, HunyuanCustom's emphasis on audio-driven animation suggests that it handles this task well.
The HunyuanCustom project site offers many comparative video examples. It is worth noting that only a few projects dare to pit themselves against Kling, the commercial video diffusion API that consistently ranks near the top of leaderboards. Preliminary findings suggest that HunyuanCustom has made significant headway against this formidable incumbent.
Kling exhibits a copy-paste effect when overlaying subjects onto videos, which leads to poor integration and noticeable artifacts. HunyuanCustom, in contrast, blends subjects into the background more seamlessly while maintaining strong identity preservation.
Some video examples at the HunyuanCustom project site may be too wide or too high in resolution for standard video players. Even so, the innovations in the new release have the potential to set new benchmarks in the video synthesis industry, particularly with regard to audio-driven customization and lip-sync accuracy.
- HunyuanCustom, built on the foundation of the Hunyuan Video project, has the potential to set new standards in video customization. It generates high-quality videos from a single image, offers audio-driven human customization, and surpasses competitors such as Kling in certain aspects.
- HunyuanCustom's lip-sync feature is a notable improvement for the industry. It aims to address common problems with previous video synthesis tools, such as poor lip-syncing in complex scenes, and helps the system carve out a niche against leading platforms like Kling.