WiMi Hologram Cloud to Develop a Multimodal Data Processing System for Digital Humans

WiMi Hologram Cloud, an augmented reality technology provider, is developing a system that processes multimodal data, such as images, voice, and text, to create and manipulate digital humans.

The system uses machine learning, natural language processing, computer vision, and other techniques to classify and fuse multimodal data and to extract features from it. Those features feed predictive models and decision systems intended to make digital humans more realistic and to improve how they interact.

First, the system uses deep learning, computer vision, and motion capture technologies to recognize and analyze input data. It then fuses the information and makes decisions. Finally, it presents the output to the user.
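The three stages described above (recognize, fuse and decide, present) can be illustrated with a minimal Python sketch. Nothing here reflects WiMi's actual implementation: the names `ModalityResult`, `recognize`, `fuse_and_decide`, `present`, and `run_pipeline` are hypothetical, the recognition stage is a placeholder that parses pre-labeled strings instead of running real models, and the fusion rule (summing confidence per label) is just one simple late-fusion choice assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModalityResult:
    modality: str      # e.g. "image", "voice", or "text"
    label: str         # hypothetical recognized intent
    confidence: float  # score in [0, 1]


def recognize(inputs: dict[str, str]) -> list[ModalityResult]:
    """Stage 1 placeholder: a real system would run per-modality models.

    Here each input is assumed to already be a "label:confidence" string.
    """
    results = []
    for modality, raw in inputs.items():
        label, conf = raw.split(":")
        results.append(ModalityResult(modality, label, float(conf)))
    return results


def fuse_and_decide(results: list[ModalityResult]) -> str:
    """Stage 2: late fusion by summing confidence per label, then argmax."""
    scores: dict[str, float] = {}
    for r in results:
        scores[r.label] = scores.get(r.label, 0.0) + r.confidence
    return max(scores, key=scores.get)


def present(decision: str) -> str:
    """Stage 3: hand the decision to whatever renders the response."""
    return f"digital human responds to: {decision}"


def run_pipeline(inputs: dict[str, str]) -> str:
    return present(fuse_and_decide(recognize(inputs)))
```

In this sketch, two modalities that agree on a label can outweigh a single more confident modality, which is the kind of behavior fusion is meant to provide.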

The form of the output depends on the type of data. For example, the system produces speech through speech synthesis, images through image rendering, and motion trajectories through animation.
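Routing each output type to its own renderer could look roughly like the following Python sketch. It is an assumption-laden illustration, not the company's design: the functions `synthesize_speech`, `render_image`, `animate_motion`, and `produce_output`, along with the `RENDERERS` table, are hypothetical stand-ins that return tagged strings rather than real audio, images, or animation.

```python
from typing import Callable


def synthesize_speech(content: str) -> str:
    # Stand-in for a speech synthesis engine.
    return f"[speech] {content}"


def render_image(content: str) -> str:
    # Stand-in for an image renderer.
    return f"[image] {content}"


def animate_motion(content: str) -> str:
    # Stand-in for an animation system that plays a motion trajectory.
    return f"[animation] {content}"


# Hypothetical dispatch table: output type -> renderer.
RENDERERS: dict[str, Callable[[str], str]] = {
    "speech": synthesize_speech,
    "image": render_image,
    "motion": animate_motion,
}


def produce_output(kind: str, content: str) -> str:
    """Pick the renderer that matches the output type."""
    try:
        renderer = RENDERERS[kind]
    except KeyError:
        raise ValueError(f"unsupported output type: {kind}") from None
    return renderer(content)
```

A lookup table keeps the routing in one place, so supporting another output type would mean adding one entry rather than another branch.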
