What is gradient vanishing and exploding?

Question

Accepted Answer

During backpropagation, the chain rule multiplies one factor per layer as the gradient flows back through the network. Each factor depends on that layer's weights and on the derivative of its activation function. In deep networks, if these factors are consistently smaller than 1 in magnitude, gradients shrink exponentially with depth (vanishing gradients), and early layers learn very slowly. If they are consistently larger than 1, gradients grow exponentially (exploding gradients), and training becomes unstable. Solutions: better activation functions (ReLU instead of Sigmoid/Tanh), careful weight initialization (He init, Xavier init), batch normalization, residual connections (skip
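
To make the multiplication effect concrete, here is a minimal sketch in plain Python. It is a toy model, not taken from the answer: it assumes a chain of scalar "layers" that all share one weight, and the names `input_gradient`, `weight`, and `depth` are hypothetical. It accumulates the same product of per-layer factors that backpropagation would compute for such a chain.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(weight: float, depth: int,
                   activation: str = "sigmoid", x: float = 0.5) -> float:
    """Return d(output)/d(input) for a chain of `depth` scalar layers.

    Each layer computes a = f(weight * a_prev). By the chain rule the
    gradient is the product of (weight * f'(z)) over all layers.
    This is an illustrative toy, not a real network.
    """
    a = x
    grad = 1.0
    for _ in range(depth):
        z = weight * a
        if activation == "sigmoid":
            a = sigmoid(z)
            local = a * (1.0 - a)  # sigmoid derivative, never above 0.25
        else:  # "linear": identity activation, derivative is 1
            a = z
            local = 1.0
        grad *= weight * local  # one chain-rule factor per layer
    return grad


if __name__ == "__main__":
    for depth in (5, 20, 50):
        vanishing = input_gradient(1.0, depth, "sigmoid")
        exploding = input_gradient(1.5, depth, "linear")
        print(f"depth={depth:2d}  sigmoid, w=1.0: {vanishing:.3e}  "
              f"linear, w=1.5: {exploding:.3e}")
```

In the sigmoid case every factor is at most 0.25, so the product collapses toward zero as depth grows. In the linear case every factor is 1.5, so the product grows without bound. Real networks use weight matrices rather than scalars, but the same repeated multiplication is what drives both problems.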
