What is batch normalization?
Why Interviewers Ask This
This tests whether you can apply Machine Learning / AI knowledge to real-world scenarios. Interviewers are looking for clarity of thought and evidence that you've encountered this in production code.
Answer
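Batch Normalization (BatchNorm) normalizes each feature (or channel) of a layer's activations to zero mean and unit variance, using statistics computed over the mini-batch, then scales and shifts the result with learnable parameters γ and β. During training the batch statistics are used directly; at inference the layer switches to running averages of the mean and variance accumulated during training, so the output no longer depends on the other examples in the batch. Benefits: it accelerates training by allowing higher learning rates, reduces sensitivity to weight initialization, and acts as a mild regularizer (which can reduce the need for dropout). It was originally motivated as a remedy for internal covariate shift (the distribution of layer inputs changing during training), although later work attributes much of the benefit to a smoother optimization landscape. In the original formulation it is placed after the linear transformation and before the activation function, which remains the common default.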
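To make the normalize, then scale and shift steps concrete, here is a minimal sketch of the training-time forward pass in plain Python. The function name `batch_norm_forward` and the `eps` default are illustrative assumptions, not any library's API, and the sketch leaves out the running statistics used at inference as well as the backward pass.

```python
import math

def batch_norm_forward(batch, gamma, beta, eps=1e-5):
    """Training-mode BatchNorm over a batch of feature vectors (illustrative).

    batch: list of samples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learnable in a real network).
    eps: small constant for numerical stability (assumed default).
    """
    n = len(batch)
    num_features = len(batch[0])
    # Per-feature mean and biased variance, computed over the batch dimension.
    means = [sum(row[j] for row in batch) / n for j in range(num_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(num_features)
    ]
    # Normalize each feature, then apply gamma * x_hat + beta.
    return [
        [
            gamma[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
            for j in range(num_features)
        ]
        for row in batch
    ]

# Two features on very different scales end up on the same scale.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
out = batch_norm_forward(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With γ = 1 and β = 0 each output column has mean approximately 0 and variance approximately 1. Other values of γ and β let the network recover whatever scale and offset suit the next layer.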
Common Mistake
Don't just define the term — demonstrate that you understand when to use it and when not to. Showing awareness of trade-offs is what separates average from strong Machine Learning / AI candidates.
More Machine Learning / AI Questions
- Intermediate What is a convolutional neural network (CNN)?
- Intermediate What is a Recurrent Neural Network (RNN)?
- Intermediate What is an LSTM and how does it solve the vanishing gradient problem?
- Intermediate What is the attention mechanism in neural networks?
- Intermediate What is the Transformer architecture?