Advanced
Artificial Intelligence & Machine Learning
Q83 / 100
What is the attention mechanism complexity and how does sparse attention address it?
Correct! Well done.
Incorrect.
The correct answer is B) Standard self-attention is O(n²) in sequence length, making long sequences expensive. Sparse attention (Longformer, BigBird) reduces to O(n√n) using local and global attention patterns
B
Correct Answer
Standard self-attention is O(n²) in sequence length, making long sequences expensive. Sparse attention (Longformer, BigBird) reduces to O(n√n) using local and global attention patterns
Explanation
O(n²) attention quadruples cost when doubling sequence length. FlashAttention (IO-aware) keeps O(n²) compute but reduces memory bandwidth by computing in tiles. Sparse methods genuinely reduce compute for long sequences.
Progress
83/100