Intermediate
Big Data & Data Engineering
Q61 / 100
When should you use Spark's "cache()" or "persist()" on a DataFrame?
Correct! Well done.
Incorrect.
The correct answer is A) When the same DataFrame will be reused multiple times, to avoid recomputing it from scratch each time
A
Correct Answer
When the same DataFrame will be reused multiple times, to avoid recomputing it from scratch each time
Explanation
Caching stores intermediate results in memory (or disk) so repeated actions on the same DataFrame avoid recomputation, at the cost of memory usage.
Progress
61/100