What does a CI/CD pipeline for ML models look like (model versioning, data validation, shadow deployment)?

Answer

CI/CD for ML, often called MLOps or CD4ML (Continuous Delivery for Machine Learning), extends software CI/CD with stages that handle the two extra moving parts in ML systems: the data and the trained model. A typical pipeline has the following stages.

Data validation: before training, check the training dataset for schema drift, missing values, and shifts in statistical distribution, using tools like Great Expectations or TensorFlow Data Validation (TFDV, part of TFX). A failed check stops the pipeline before any compute is spent on training.

Model training: the pipeline triggers training when new data arrives or the code changes, and records experiments, hyperparameters, and metrics in a tracker such as MLflow or Weights & Biases.

Model evaluation: automatically evaluate the trained model on a held-out test set and compare its metrics (accuracy, F1, AUC) to those of the current production model. Promote the new model only if it beats production on the metrics you have chosen as the gate.

Model versioning: store model artifacts in a model registry (MLflow Model Registry, Vertex AI Model Registry) together with metadata, provenance (the code and data version that produced the model), and approval workflows.

Shadow deployment: deploy the new model alongside production and send it a mirrored copy of live traffic. Its responses are logged but never returned to users, so you can compare predictions and latency under real production conditions before switching any traffic.

Continuous monitoring: after full deployment, track the prediction distribution, data drift (feature statistics diverging from the training distribution), and model performance degradation, and trigger automatic retraining or rollback when those metrics degrade.
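
The gating logic behind the data validation, model evaluation, and shadow deployment stages can be sketched in plain Python. This is only an illustration under stated assumptions, not how Great Expectations, TFDV, or MLflow implement these checks: the function names (validate_rows, should_promote, shadow_agreement), the thresholds, and the choice of F1 as the primary metric are all hypothetical.

```python
from typing import Any, List, Mapping, Sequence


def validate_rows(
    rows: Sequence[Mapping[str, Any]],
    schema: Mapping[str, type],
    max_missing_fraction: float = 0.05,  # hypothetical threshold
) -> List[str]:
    """Return a list of problems found; an empty list means the data passed."""
    if not rows:
        return ["dataset is empty"]
    problems: List[str] = []
    for column, expected_type in schema.items():
        values = [row.get(column) for row in rows]
        # Missing-value check: None (or an absent key) counts as missing.
        missing = sum(value is None for value in values)
        if missing / len(rows) > max_missing_fraction:
            problems.append(f"{column}: {missing}/{len(rows)} values missing")
        # Schema check: every present value must have the expected type.
        if any(v is not None and not isinstance(v, expected_type) for v in values):
            problems.append(f"{column}: value with unexpected type")
    return problems


def should_promote(
    candidate: Mapping[str, float],
    production: Mapping[str, float],
    primary: str = "f1",      # hypothetical choice of gating metric
    min_gain: float = 0.01,   # required improvement on the primary metric
    tolerance: float = 0.0,   # allowed regression on every other metric
) -> bool:
    """Promote only if the primary metric improves and no other metric regresses."""
    if candidate[primary] < production[primary] + min_gain:
        return False
    # A metric missing from the candidate raises KeyError, which fails the gate loudly.
    return all(
        candidate[name] >= value - tolerance
        for name, value in production.items()
        if name != primary
    )


def shadow_agreement(
    production_preds: Sequence[Any], shadow_preds: Sequence[Any]
) -> float:
    """Fraction of mirrored requests on which both models gave the same prediction."""
    if len(production_preds) != len(shadow_preds) or not production_preds:
        raise ValueError("need two non-empty prediction lists of equal length")
    matches = sum(p == s for p, s in zip(production_preds, shadow_preds))
    return matches / len(production_preds)
```

In a pipeline, each function would back one gate: a non-empty result from validate_rows fails the run before training, should_promote decides whether the candidate is registered for rollout, and a low shadow_agreement value on mirrored traffic is a signal to inspect the disagreements before shifting real traffic.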