Advanced Artificial Intelligence & Machine Learning
Q92 / 100

What is multi-modal learning in AI?

Correct Answer

B) Training models that process and relate multiple data modalities (text, images, audio, video) within a unified architecture

Explanation

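Multi-modal models such as GPT-4V, CLIP, Gemini, and Flamingo learn joint representations across modalities. CLIP, for example, learns image-text alignment via contrastive learning: the embeddings of matching image-caption pairs are pulled together in a shared space, while those of mismatched pairs are pushed apart. Joint representations of this kind enable tasks such as zero-shot image classification, image captioning, and visual question answering (visual QA).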
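
The contrastive idea behind CLIP-style image-text alignment can be illustrated with a small sketch. This is not CLIP's actual implementation: the embeddings are hand-written toy vectors rather than encoder outputs, the function names are hypothetical, and the temperature of 0.07 is an assumed value chosen for illustration. The sketch computes a symmetric cross-entropy loss over a batch of image-text similarities, then uses the same similarities for zero-shot classification.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_logits(image_embs, text_embs, temperature=0.07):
    """Cosine similarity of every image with every text, divided by a temperature."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric loss: image-to-text and text-to-image directions, averaged."""
    logits = similarity_logits(image_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(transposed))

def zero_shot_classify(image_emb, label_text_embs):
    """Return the index of the label text most similar to the image."""
    img = l2_normalize(image_emb)
    scores = [sum(a * b for a, b in zip(img, l2_normalize(t)))
              for t in label_text_embs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data (assumed): pair k of images and texts describes the same content.
images = [[1.0, 0.1], [0.1, 1.0]]
texts = [[0.9, 0.2], [0.2, 0.9]]

aligned = contrastive_loss(images, texts)
shuffled = contrastive_loss(images, texts[::-1])  # captions deliberately mismatched
print(aligned < shuffled)
print(zero_shot_classify([1.0, 0.0], texts))
```

With correctly paired data the loss is lower than with shuffled captions, which is the signal that training would exploit. Zero-shot classification then reduces to embedding each candidate label as text and picking the one closest to the image embedding.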
