Intermediate
Big Data & Data Engineering
Q79 / 100
In Spark, what is the difference between "repartition()" and "coalesce()"?
The correct answer is A) repartition() can increase or decrease partitions and triggers a full shuffle; coalesce() can only decrease partitions and avoids a full shuffle when possible
Explanation
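coalesce() is usually the more efficient way to reduce the partition count: it merges existing partitions and so avoids a full shuffle, although the merged partitions can end up uneven in size. repartition() always performs a full shuffle, which redistributes the data roughly evenly and lets it either increase or decrease the partition count.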
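The difference in data movement can be sketched with a toy model in plain Python. This is not Spark code and not Spark's actual algorithm: the helpers `toy_repartition` and `toy_coalesce` are hypothetical names, partitions are modelled as plain lists, and the round-robin and neighbour-merging rules are simplifying assumptions chosen only to illustrate the idea. In PySpark the corresponding calls would be `df.repartition(n)` and `df.coalesce(n)`.

```python
from typing import List

Partitions = List[List[int]]  # each inner list stands in for one partition


def toy_repartition(parts: Partitions, n: int) -> Partitions:
    """Toy full shuffle: every record is reassigned round-robin to n new partitions."""
    if n < 1:
        raise ValueError("n must be at least 1")
    out: Partitions = [[] for _ in range(n)]
    i = 0
    for part in parts:
        for record in part:
            out[i % n].append(record)  # each record may land in a different partition
            i += 1
    return out


def toy_coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy merge: whole neighbouring partitions are combined; none is ever split."""
    if n < 1:
        raise ValueError("n must be at least 1")
    n = min(n, len(parts))  # asking for more partitions than exist changes nothing
    out: Partitions = [[] for _ in range(n)]
    for idx, part in enumerate(parts):
        out[idx * n // len(parts)].extend(part)  # records stay together with their partition
    return out


# Four skewed partitions holding ten records in total.
skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]

print([len(p) for p in toy_coalesce(skewed, 2)])     # [7, 3]  (fewer partitions, still uneven)
print([len(p) for p in toy_repartition(skewed, 2)])  # [5, 5]  (fewer partitions, balanced)
print(len(toy_coalesce(skewed, 8)))                  # 4       (cannot increase the count)
print(len(toy_repartition(skewed, 8)))               # 8       (can increase the count)
```

In this model the merge never moves a record out of the group it started in, which is why it is cheap but can leave skew in place, while the full shuffle touches every record and in exchange produces balanced partitions.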