Advanced
Big Data & Data Engineering
Q90 / 100
Why is "compaction" important for streaming-ingested data lake tables?
Correct Answer
A) Frequent small writes from streaming create many small files, which hurts read performance; compaction merges them into fewer, larger files
Explanation
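Streaming writers commit small batches frequently, so each commit adds a few small files to the table. This is the "small file problem": a reader has to list, open, and read metadata for every file, so a large file count increases metadata overhead and slows scans even when the total data volume is modest. Periodic compaction jobs rewrite groups of small files into fewer, larger files close to a configured target size, which reduces the per-file overhead that queries pay.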
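The planning step of a compaction job can be sketched as a bin-packing problem. The example below is a minimal illustration, not how any particular table format implements it: the function name `plan_compaction`, the file names, and the sizes are all hypothetical, and it uses a simple first-fit-decreasing heuristic to group small files into batches that each stay at or under a target size.

```python
from typing import Dict, List


def plan_compaction(file_sizes: Dict[str, int], target_bytes: int) -> List[List[str]]:
    """Group small files into batches whose combined size is at most target_bytes.

    Hypothetical helper for illustration only; real table formats ship
    their own compaction planners.
    """
    # Files already at or above the target are left alone.
    small = sorted(
        ((name, size) for name, size in file_sizes.items() if size < target_bytes),
        key=lambda item: item[1],
        reverse=True,
    )

    batches: List[List[str]] = []
    batch_totals: List[int] = []
    for name, size in small:
        # First-fit decreasing: put the file in the first batch with room.
        for i, total in enumerate(batch_totals):
            if total + size <= target_bytes:
                batches[i].append(name)
                batch_totals[i] += size
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([name])
            batch_totals.append(size)

    # Rewriting a batch that holds a single file gains nothing.
    return [batch for batch in batches if len(batch) > 1]


# Made-up sizes in arbitrary units; "e" is already larger than the target.
plan = plan_compaction({"a": 60, "b": 50, "c": 40, "d": 30, "e": 200}, target_bytes=100)
print(plan)  # [['a', 'c'], ['b', 'd']]
```

Each returned batch would then be read and rewritten as one larger file, with the original small files removed from the table's metadata once the rewrite commits.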