When should you use Spark's "cache()" or "persist()" on a DataFrame?

Correct! Well done.

Incorrect.

The correct answer is A) When the same DataFrame will be reused multiple times, to avoid recomputing it from scratch each time

Correct Answer

When the same DataFrame will be reused multiple times, to avoid recomputing it from scratch each time

Explanation

Caching stores intermediate results in memory (or disk) so repeated actions on the same DataFrame avoid recomputation, at the cost of memory usage.

Progress

61/100

Browse All Big Data & Data Engineering Questions

100 questions · beginner to advanced