Intermediate Big Data & Data Engineering
Q79 / 100

In Spark, what is the difference between "repartition()" and "coalesce()"?

The correct answer is A) repartition() can increase or decrease partitions and triggers a full shuffle; coalesce() can only decrease partitions and avoids a full shuffle when possible

Explanation

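coalesce() is the cheaper way to reduce the partition count: it merges existing partitions, which avoids a full shuffle but can leave the resulting partitions uneven in size. repartition() always performs a full shuffle, redistributing records across the new partitions (which typically balances their sizes), and it can either increase or decrease the partition count.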
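
The difference in data movement can be illustrated with a toy model in plain Python. This is only a sketch, not Spark's implementation: partitions are modeled as lists, the helper names coalesce and repartition below are hypothetical stand-ins for the Spark methods, merging adjacent partitions by index is an assumption (Spark groups partitions with locality in mind), and round-robin redistribution is a simplification of the shuffle.

```python
from typing import List

Partitions = List[List[int]]


def coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy coalesce: merge whole partitions into n groups, never splitting one.

    Assumption: adjacent partitions are merged by index. Like Spark's
    coalesce() without a shuffle, it cannot increase the partition count.
    """
    if n >= len(parts):
        return [list(p) for p in parts]
    merged: Partitions = [[] for _ in range(n)]
    for i, part in enumerate(parts):
        # Each source partition goes to exactly one target partition.
        merged[i * n // len(parts)].extend(part)
    return merged


def repartition(parts: Partitions, n: int) -> Partitions:
    """Toy repartition: a full shuffle that redistributes every record.

    Assumption: records are dealt out round-robin, which balances sizes.
    """
    out: Partitions = [[] for _ in range(n)]
    records = [r for part in parts for r in part]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


# A skewed input: one large partition and three tiny ones.
parts: Partitions = [[1, 2, 3, 4, 5, 6], [7], [8], [9]]

print([len(p) for p in coalesce(parts, 2)])     # sizes stay skewed: [7, 2]
print([len(p) for p in repartition(parts, 2)])  # sizes are balanced: [5, 4]
print(len(coalesce(parts, 6)))                  # cannot grow: still 4
print(len(repartition(parts, 6)))               # can grow: 6
```

In real PySpark code the corresponding calls are df.coalesce(n) and df.repartition(n) on a DataFrame, and df.rdd.getNumPartitions() reports the resulting partition count.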
