Implement softmax and cross-entropy loss in NumPy. Then tell me: what goes wrong with the naive implementation, and how do you fix it?
formulate your answer, then —
Your cross-entropy implementation handles multi-class classification. How would you modify it for binary classification, and is there a numerical stability issue there too?
formulate your answer, then —
tldr
Naive softmax overflows for large logits (np.exp(1000) is inf even in float64). Fix: subtract the row max before exponentiating; softmax is shift-invariant, so the output is unchanged. For cross-entropy, compute log-softmax directly in log-space as (x - m) - log(sum(exp(x - m))) rather than log(softmax(x)), so an underflowed probability never reaches log(0). Binary cross-entropy has the same issue: use the max(x, 0) - x*y + log(1 + exp(-|x|)) form that PyTorch's BCEWithLogitsLoss uses internally.
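A minimal NumPy sketch of all three fixes; the function names and the small demo at the bottom are my own illustration, not any library's API:

```python
import numpy as np

def softmax(logits):
    """Stable softmax: subtract the row max so np.exp sees only values <= 0."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # shift-invariant
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def log_softmax(logits):
    """Log-softmax in log-space: (x - m) - log(sum(exp(x - m))), m = max(x)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, labels):
    """Mean cross-entropy from raw logits and integer class labels."""
    logp = log_softmax(logits)
    return -logp[np.arange(len(labels)), labels].mean()

def bce_with_logits(logits, targets):
    """Stable binary cross-entropy: max(x, 0) - x*y + log(1 + exp(-|x|)).

    Using -|x| keeps np.exp's argument non-positive, so neither sign of
    the logit can overflow; np.log1p stays accurate when exp(-|x|) is tiny.
    """
    return (np.maximum(logits, 0) - logits * targets
            + np.log1p(np.exp(-np.abs(logits)))).mean()

logits = np.array([[1000.0, 0.0, -1000.0]])        # naive exp would overflow
print(softmax(logits))                             # [[1. 0. 0.]], no NaNs
print(cross_entropy(logits, np.array([0])))        # ~0.0, not nan
print(bce_with_logits(np.array([100.0, -100.0]),
                      np.array([1.0, 0.0])))       # ~0.0
```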
follow-up
- How would you implement the backward pass for softmax cross-entropy? What's the surprisingly clean form of the gradient? (A standalone sketch follows this list.)
- What is label smoothing and how do you modify the cross-entropy loss to implement it?
- If you were implementing this in a production training loop, what would you do differently from a pure NumPy implementation?
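For the first follow-up, a hedged standalone sketch (helper names are mine) of the claim it hints at: the gradient of mean softmax cross-entropy with respect to the logits collapses to `(softmax(x) - one_hot(y)) / N`, which a quick finite-difference check confirms.

```python
import numpy as np

def _log_softmax(x):
    shifted = x - x.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def xent(logits, labels):
    """Mean softmax cross-entropy from logits and integer labels."""
    return -_log_softmax(logits)[np.arange(len(labels)), labels].mean()

def xent_grad(logits, labels):
    """The clean form: d(mean CE)/d(logits) = (softmax(x) - one_hot(y)) / N."""
    probs = np.exp(_log_softmax(logits))
    probs[np.arange(len(labels)), labels] -= 1.0  # subtract the one-hot target
    return probs / len(labels)

# Finite-difference spot check of one logit entry.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
y = np.array([0, 2, 1, 1])
g = xent_grad(x, y)
eps = 1e-6
xp = x.copy()
xp[0, 0] += eps
numeric = (xent(xp, y) - xent(x, y)) / eps
assert abs(numeric - g[0, 0]) < 1e-4
print("analytic", g[0, 0], "numeric", numeric)
```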