MLOps Practice | DistillPrep

Hard: CI/CD for ML

A team's CI/CD pipeline for ML has the following stages: (1) data validation, (2) model training, (3) offline evaluation against a holdout set, (4) model registration if evaluation passes, and (5) deployment to production. A feature engineering bug slips through and introduces training-serving skew: the preprocessing applied at training time differs from the preprocessing applied at serving time. All CI gates pass. Why did the pipeline fail to catch the skew, and what specific type of test closes this gap?

Code

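import numpy as np
import pandas as pd

# Same raw input through both code paths (the preprocessors come from the project under test).
raw_sample = {"x": 100.0, "y": 50.0}
train_features = training_preprocessor.transform(pd.DataFrame([raw_sample]))
serve_features = serving_preprocessor.transform(raw_sample)  # or gRPC/REST call

# `==` on DataFrames/arrays is element-wise, so a bare assert would raise; compare numerically.
# Assumes both outputs are array-like.
np.testing.assert_allclose(
    np.ravel(train_features), np.ravel(serve_features),
    err_msg="Training-serving skew detected",
)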
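
The snippet above depends on the project's own preprocessors, so it cannot be run in isolation. As a rough, self-contained sketch of the same idea: offline evaluation in stage (3) scores the model on features produced by the training-side preprocessing, so the serving-side code path is never executed by any gate. A parity test feeds identical raw records through both paths and compares the outputs. The two transform functions below, the deliberate log/log1p bug, and the find_skew helper are all hypothetical and exist only for illustration.

```python
import math

# Hypothetical training-side preprocessing: scale x, apply log1p to y.
def training_transform(raw: dict) -> list:
    return [raw["x"] / 200.0, math.log1p(raw["y"])]

# Hypothetical serving-side reimplementation with a deliberate bug:
# it uses log instead of log1p, so the second feature drifts.
def serving_transform(raw: dict) -> list:
    return [raw["x"] / 200.0, math.log(raw["y"])]

def find_skew(train_fn, serve_fn, samples, rel_tol=1e-9, abs_tol=1e-12):
    """Return (raw, train_features, serve_features) for each sample
    on which the two code paths disagree beyond the tolerance."""
    mismatches = []
    for raw in samples:
        t, s = train_fn(raw), serve_fn(raw)
        same = len(t) == len(s) and all(
            math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
            for a, b in zip(t, s)
        )
        if not same:
            mismatches.append((raw, t, s))
    return mismatches

samples = [{"x": 100.0, "y": 50.0}, {"x": 0.0, "y": 1.0}]

# The buggy serving path disagrees with training on these samples.
skewed = find_skew(training_transform, serving_transform, samples)

# Comparing the training path with itself reports no mismatches.
consistent = find_skew(training_transform, training_transform, samples)
```

Run as a CI gate before registration, a check like this fails the build whenever find_skew returns a non-empty list. In practice the sample set would likely need to cover edge cases (missing values, unseen categories, boundary values) rather than a single record.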