Crapification as Data-Inoculation


2023-01-14

“Crapification” is Jeremy Howard’s term for a family of techniques that “take a perfectly good image and make it crappy.”

In the course, we added JPEG noise, reduced the resolution, and scrolled[?] text over the top. The approach used today is more rigorous, but in some ways less flexible: sprinkle Gaussian noise all over the image. Basically, add or subtract a random number from every pixel.
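The Gaussian-noise flavor of crapification fits in a few lines of NumPy. This is a minimal sketch, not code from the course; the function name, `sigma` parameter, and the [0, 1] pixel convention are my assumptions.

```python
import numpy as np

def crapify_gaussian(image, sigma=0.1, rng=None):
    """Crapify an image by adding Gaussian noise to every pixel.

    `image` is a float array with values in [0, 1]; `sigma` sets how
    much noise gets sprinkled in. (Illustrative helper, not from the
    original post.)
    """
    rng = rng or np.random.default_rng()
    noise = rng.normal(loc=0.0, scale=sigma, size=image.shape)
    # Clip so the result is still a valid image.
    return np.clip(image + noise, 0.0, 1.0)

# A "perfectly good" flat grey image, made crappy.
clean = np.full((32, 32), 0.5)
noisy = crapify_gaussian(clean, sigma=0.1, rng=np.random.default_rng(0))
```

At training time you would apply this to each batch on the fly, so the model never sees the same crappy version twice.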

This is a powerful idea with a long history. Think of it as a kind of inoculation: just as a vaccine introduces a controlled amount of a pathogen to stimulate an immune response, you introduce a controlled amount of noise to stimulate error-correction.

It probably feels counterintuitive. Like fake news, it cuts against our model of progress: we should be striving for greater accuracy and precision, ever greater optimization. But this is what learning feels like. Introducing friction, or asking stupid questions, forces our cognition engine to shift gears.

Sometimes we see more with less fidelity

We have so much information around us that it’s become boring to talk about why it’s a problem. Too much information prevents us from seeing the big picture, aka a holistic model of the system.

Our bodily senses (like proprioception) don’t seem to have this problem. Our body typically knows which way is up. We have a peripheral sense that can alert us to outliers before our conscious mind knows what’s going on.

But an ML model learns only from what it is fed. If it is fed exclusively high-fidelity data, it may conclude that the details matter more than the overarching concept.
