Advanced Big Data & Data Engineering
Q90 / 100

Why is "compaction" important for streaming-ingested data lake tables?

The correct answer is A) Frequent small writes from streaming create many small files, which hurts read performance; compaction merges them into fewer, larger files

Explanation

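Streaming ingestion commits data in frequent small batches, so each commit tends to add a few small files. This "small file problem" hurts reads in two ways: the table's metadata grows with every file that must be tracked and listed, and each scan pays a per-file cost to open and read many tiny files instead of a few large ones. Periodic compaction jobs rewrite groups of small files into fewer files close to a target size, often in the range of hundreds of megabytes depending on the table format and engine.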
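The planning step of a compaction job can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular table format: the function name plan_compaction, the 128 MB target, and the 32 MB "small file" threshold are all assumptions chosen for the example. It groups small files into batches with a simple first-fit heuristic so that each batch could be rewritten as one file at or below the target size.

```python
from typing import Dict, List

MB = 1024 * 1024


def plan_compaction(
    file_sizes: Dict[str, int],
    target_bytes: int = 128 * MB,  # assumed target size, not a standard
    small_bytes: int = 32 * MB,    # assumed cutoff for a "small" file
) -> List[List[str]]:
    """Group small files into batches whose total size is <= target_bytes."""
    # Only files under the cutoff are candidates; largest first packs tighter.
    small = sorted(
        ((name, size) for name, size in file_sizes.items() if size < small_bytes),
        key=lambda item: (-item[1], item[0]),
    )

    batches: List[List[str]] = []
    totals: List[int] = []
    for name, size in small:
        # First-fit: place the file in the first batch that still has room.
        for i, total in enumerate(totals):
            if total + size <= target_bytes:
                batches[i].append(name)
                totals[i] += size
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([name])
            totals.append(size)

    # Rewriting a single file gains nothing, so drop one-file batches.
    return [batch for batch in batches if len(batch) > 1]
```

Each returned batch lists files that a job would read and rewrite as one larger file. In a real data lake table the rewrite also has to be committed atomically through the table format's metadata so that readers never see both the old and the new files.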
