Advanced Artificial Intelligence & Machine Learning

Q96 / 100

What is Constitutional AI, and how does RLAIF differ from RLHF?

The correct answer is B: RLHF uses human preferences to train a reward model; RLAIF (RL from AI feedback) uses AI-generated preferences, enabling scale without proportional human annotation cost.

Explanation

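RLHF: human raters compare pairs of model outputs, those comparisons train a reward model, and the policy is then optimized against that reward model with PPO. RLAIF: an AI model rates the outputs instead, judging them against a written set of principles (the "constitution" in Constitutional AI); the AI-generated preferences train the reward model, and RL proceeds as in RLHF. RLAIF scales better because labeling no longer requires a human for every comparison, but the result is only as good as the AI labeler's judgment.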
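
As a rough illustration of the shared pipeline, the sketch below trains a tiny reward model from pairwise preferences with a Bradley-Terry style loss, -log sigmoid(r(chosen) - r(rejected)). The only part that differs between RLHF and RLAIF is where the preference label comes from, so the labeler is passed in as a function. Everything here is an assumption made for illustration: the names toy_ai_labeler, features, reward, and train_reward_model are hypothetical, the keyword rule stands in for a real AI judge applying a constitution, and the linear reward model stands in for a neural one. The RL step (PPO) is omitted.

```python
import math
from typing import Callable, List, Tuple

Pair = Tuple[str, str, str]               # (prompt, response_a, response_b)
Labeler = Callable[[str, str, str], int]  # 0 if response_a is preferred, else 1


def toy_ai_labeler(prompt: str, a: str, b: str) -> int:
    """Hypothetical stand-in for an AI judge (the RLAIF case).

    Toy "principle": prefer the response that avoids the word "stupid".
    In RLHF this function would be replaced by a human rater's choice.
    """
    return 0 if "stupid" not in a.lower() else 1


def features(response: str) -> List[float]:
    """Toy hand-written features; a real reward model learns its own."""
    text = response.lower()
    return [1.0 if "stupid" in text else 0.0,
            1.0 if "please" in text else 0.0]


def reward(w: List[float], response: str) -> float:
    """Linear reward: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, features(response)))


def train_reward_model(pairs: List[Pair], labeler: Labeler,
                       steps: int = 200, lr: float = 0.5) -> List[float]:
    """Gradient descent on -log sigmoid(r(chosen) - r(rejected))."""
    w = [0.0, 0.0]
    for _ in range(steps):
        for prompt, a, b in pairs:
            chosen, rejected = (a, b) if labeler(prompt, a, b) == 0 else (b, a)
            margin = reward(w, chosen) - reward(w, rejected)
            # d/dw of the loss is -(1 - sigmoid(margin)) * (f(chosen) - f(rejected))
            scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            fc, fr = features(chosen), features(rejected)
            w = [wi + lr * scale * (c - r) for wi, c, r in zip(w, fc, fr)]
    return w


pairs = [
    ("How do I reset my password?",
     "That is a stupid question.",
     "Open Settings and choose Reset."),
    ("Why is the sky blue?",
     "Sunlight scatters off air molecules.",
     "Stupid question, look it up."),
]

w = train_reward_model(pairs, toy_ai_labeler)
polite = reward(w, "Open Settings and choose Reset.")
rude = reward(w, "That is a stupid question.")
print(polite > rude)
```

Swapping toy_ai_labeler for a function that returns recorded human choices turns the same loop into the RLHF variant, which is why annotation cost, not the training procedure, is the main practical difference between the two.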
