Explain data augmentation as regularization. How do you decide which augmentations are valid for a task?
formulate your answer, then —
tldr
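Data augmentation acts as a regularizer: it expands the training set with transformed copies whose labels are unchanged, so the model is pushed to learn those label-preserving invariances instead of memorizing incidental detail. An augmentation is valid when it preserves the label and reflects variation the model will actually see at deployment. It hurts when it changes the label (for example, rotating a handwritten 6 by 180 degrees turns it into a 9), removes key evidence, or creates unrealistic training examples.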
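One way to make the "does it change the label?" test concrete is to measure how often a trusted labeler agrees before and after a transformation. The sketch below is only an illustration on a toy task ("is the bright mass on the left or right half of the image?"). The names `label_side` and `label_preserved_rate` and the tiny images are made up for this example; in practice the oracle would be a human reviewer or a strong reference model.

```python
from typing import Callable, List

Image = List[List[int]]  # toy grayscale image as rows of pixel values


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return img[::-1]


def label_side(img: Image) -> str:
    """Hypothetical oracle for the toy task: which half holds more brightness."""
    half = len(img[0]) // 2
    left = sum(sum(row[:half]) for row in img)
    right = sum(sum(row[half:]) for row in img)
    return "left" if left > right else "right"


def label_preserved_rate(
    images: List[Image],
    augment: Callable[[Image], Image],
    oracle: Callable[[Image], str],
) -> float:
    """Fraction of images whose oracle label is unchanged by the augmentation."""
    kept = sum(oracle(augment(img)) == oracle(img) for img in images)
    return kept / len(images)


images = [
    [[1, 0, 0, 0], [1, 0, 0, 0]],  # mass on the left
    [[0, 0, 0, 1], [0, 0, 1, 1]],  # mass on the right
    [[0, 1, 0, 0], [0, 0, 0, 0]],  # mass on the left
]

vflip_rate = label_preserved_rate(images, vflip, label_side)
hflip_rate = label_preserved_rate(images, hflip, label_side)
print(f"vertical flip keeps label:   {vflip_rate:.0%}")
print(f"horizontal flip keeps label: {hflip_rate:.0%}")
```

For this task the vertical flip leaves every label intact while the horizontal flip changes all of them, so only the former is a candidate augmentation. The same flip would be harmless for a task where left and right carry no meaning, which is why validity has to be judged per task.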
follow-up
- Why is augmentation easier in vision than in NLP?
- How would you validate that an augmentation policy is safe?
- What are Mixup and CutMix, and why can they improve robustness?
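As a starting point for the last follow-up: Mixup trains on convex combinations of two examples and of their one-hot labels, with the mixing weight drawn from a Beta(alpha, alpha) distribution. A minimal sketch on plain Python lists is below. The function name `mixup`, the default `alpha=0.2`, and the sample vectors are assumptions chosen for illustration, not a reference implementation.

```python
import random
from typing import List, Optional, Tuple


def mixup(
    x1: List[float],
    y1: List[float],
    x2: List[float],
    y2: List[float],
    alpha: float = 0.2,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float], float]:
    """Blend two (input, one-hot label) pairs with weight lam ~ Beta(alpha, alpha)."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    # The same weight is applied to the inputs and to the labels.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy examples from different classes (values are illustrative).
x_a, y_a = [1.0, 0.0, 0.5], [1.0, 0.0]
x_b, y_b = [0.0, 1.0, 0.5], [0.0, 1.0]

x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b, rng=random.Random(0))
print(lam, x_mix, y_mix)
```

The mixed label stays a valid probability vector, so the loss rewards predictions that move linearly between classes rather than flipping abruptly. CutMix follows the same idea but pastes a rectangular patch from one image into the other and weights the labels by the area of each source.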