Intermediate Big Data & Data Engineering
Q61 / 100

When should you use Spark's "cache()" or "persist()" on a DataFrame?

Correct! Well done.

Incorrect.

The correct answer is A) When the same DataFrame will be reused multiple times, to avoid recomputing it from scratch each time

A

Correct Answer

When the same DataFrame will be reused multiple times, to avoid recomputing it from scratch each time

Explanation

Caching stores intermediate results in memory (or disk) so repeated actions on the same DataFrame avoid recomputation, at the cost of memory usage.

Progress
61/100