How do Elixir supervisors implement fault tolerance?
Answer
Supervisors implement fault tolerance through the supervision tree: a hierarchy where each supervisor monitors its children and applies restart strategies. The key mechanism: Process linking: spawn_link links parent and child — if the child crashes, the parent receives an exit signal. Supervisors trap exits (Process.flag(:trap_exit, true)) and handle them. Restart strategies: :one_for_one (only the crashed process restarts), :one_for_all (all children restart — for tightly coupled processes), :rest_for_one (crashed process and later-started siblings restart). Restart intensity limits: max restarts within a time window — if exceeded, the supervisor itself crashes (propagating up the tree). Dynamic supervisors (DynamicSupervisor): start children at runtime (e.g., one process per connected user). The entire application is a tree of supervisors — a crash anywhere causes a localized recovery rather than system-wide failure. This architecture is why Erlang/Elixir systems can achieve "nine nines" uptime.
Previous
What are Elixir protocols?
Next
What is Phoenix Channels and how do WebSockets work in Elixir?