mlprep
mlprep/ML Breadthhard12 min

You are training a fraud, churn, or conversion model where labels arrive days or weeks later. What can go wrong?

formulate your answer, then —

tldr

Delayed labels create censoring: recent examples that look negative may later become positive. Define prediction time, label window, and maturity window explicitly. Monitor label completeness by segment and avoid training or evaluating on immature labels.

follow-up

  • How would delayed conversion labels bias a recent training window?
  • What is right censoring?
  • How do you evaluate an experiment when the final label takes 30 days to observe?