Advanced
Artificial Intelligence & Machine Learning
Q92 / 100
What is multi-modal learning in AI?
The correct answer is B) Training models that process and relate multiple data modalities (text, images, audio, video) within a unified architecture
Explanation
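Multi-modal models such as GPT-4V, CLIP, Gemini, and Flamingo learn joint representations across modalities. CLIP, for example, learns image-text alignment via contrastive learning: embeddings of matching image-caption pairs are pulled together, while embeddings of mismatched pairs are pushed apart. Joint representations of this kind enable tasks such as zero-shot image classification, image captioning, and visual question answering.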
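To make the contrastive idea concrete, here is a minimal NumPy sketch of a CLIP-style symmetric contrastive loss and of zero-shot classification by nearest text embedding. It is an illustration under stated assumptions, not CLIP's actual implementation: the function names (`contrastive_loss`, `zero_shot_classify`), the fixed temperature of 0.07, and the random vectors standing in for encoder outputs are all hypothetical choices made for this example.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an image-text similarity matrix.

    Row i of image_emb and row i of text_emb are assumed to be a matching pair.
    The temperature value is an assumption chosen for illustration.
    """
    logits = l2_normalize(image_emb) @ l2_normalize(text_emb).T / temperature
    idx = np.arange(logits.shape[0])
    # Image-to-text: each row should put its probability mass on the diagonal.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: the same requirement, applied column-wise.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return (loss_i2t + loss_t2i) / 2

def zero_shot_classify(image_emb, class_text_emb):
    """Assign each image the class whose text embedding is most similar."""
    sims = l2_normalize(image_emb) @ l2_normalize(class_text_emb).T
    return sims.argmax(axis=1)

# Random vectors stand in for the outputs of image and text encoders.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))

aligned_loss = contrastive_loss(emb, emb)          # pairs match row by row
shuffled_loss = contrastive_loss(emb, emb[::-1])   # every pair is mismatched
predictions = zero_shot_classify(emb, emb)
```

In this toy setup the aligned pairing yields a lower loss than the shuffled one, which is the signal a real model would follow during training. Zero-shot classification then needs no labeled images: each class is described in text, and the image is assigned to the closest description in the shared embedding space.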