A tribute to an AI giant and his essay "The Bitter Lesson": http://www.incompleteideas.net/IncIdeas/BitterLesson.html

One of the most important lessons from real-world recommendation systems — a lesson we have only partially learned — is that good content doesn’t always shout. Sometimes it grows quietly, in spurts, in odd rhythms, and in contexts we didn’t fully control. In a world with limited resources and noisy feedback, we are tempted to make snap judgments: "this video looks slow, it must be weak" — but this is a cognitive shortcut, not a strategy.

The harsh truth is that most cold-start recommendation systems are designed to optimize for early performance, not true potential. They rely on strong, immediate feedback — often through boosting — to decide whether to promote a video or drop it. But the signals gathered under boosting are flawed: they are biased, inflated, and context-dependent. They reflect system behavior more than user preference. We know this, and yet we rely on these signals, because they are abundant and fast.

On the other hand, organic signals — the ones that emerge when content competes naturally — are far more truthful. They are what users do when no one is watching. But they are also slow, noisy, and sparse. We don’t like them because they don’t tell us what we want to know quickly. We discard them when they seem weak, and in doing so, we often discard the content that needs just a bit more patience.

This is the signal dilemma in cold-start:

Do we believe the noisy-but-rich signal that comes from intervention?
Or the clean-but-sparse signal that emerges without it?

We propose a different mindset. Instead of forcing a binary decision early — boost or not, keep or kill — we treat cold-start as a sequential information-gathering process. Each round of boosting is not a judgment, but a probe. Each round of organic exposure is not a verdict, but a test. And the sequence of these signals, taken together, tells a deeper story than any single spike ever could.

In our method, we let each piece of content go through multiple rounds of boosting and observation. We do not rush to conclusions. After each phase, we ask: Did this content show resilience? Did it grow without help? Did it retain attention when we stopped pushing it? If so, we boost again. If not, we walk away.
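A minimal sketch of this boost-and-observe loop, in Python. The metric names (`boosted_ctr`, `organic_ctr`, `retention`), the thresholds, and the number of rounds are illustrative assumptions, not values from our system; `observe_phase` stands in for whatever machinery actually runs one boosted-then-organic exposure window and measures the result.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class PhaseResult:
    """Metrics from one boost-then-observe phase (names are illustrative)."""
    boosted_ctr: float   # click-through rate while the item was boosted
    organic_ctr: float   # click-through rate after the boost was removed
    retention: float     # fraction of boosted viewers who came back organically

@dataclass
class ColdStartItem:
    item_id: str
    history: List[PhaseResult] = field(default_factory=list)

def shows_resilience(phase: PhaseResult,
                     min_organic_ctr: float = 0.02,
                     min_retention: float = 0.15) -> bool:
    """Did the item keep growing once we stopped pushing it?

    Thresholds are hypothetical; in practice they would be tuned per surface.
    """
    return phase.organic_ctr >= min_organic_ctr and phase.retention >= min_retention

def run_cold_start(item: ColdStartItem,
                   observe_phase: Callable[[ColdStartItem, int], PhaseResult],
                   max_rounds: int = 4) -> bool:
    """Sequential probe: boost, step back, observe, then decide whether to continue.

    Each round is treated as information gathering, not a final verdict; only a
    failure to show any organic traction ends the process early.
    """
    for round_idx in range(max_rounds):
        phase = observe_phase(item, round_idx)
        item.history.append(phase)
        if not shows_resilience(phase):
            return False   # walk away: no organic traction after this probe
    return True            # survived every round: a candidate for more boost
```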

This is not about learning a better scoring model. It is about building a protocol for trust — where trust is earned through growth under constraint.

Over time, we learn to spot content that grows naturally. We allocate more of our limited boost to those pieces, because they prove themselves. Not because they screamed the loudest under the spotlight, but because they walked steadily under the sun.
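One way to act on this, sketched under the same hypothetical metrics as above: split a fixed boost budget in proportion to demonstrated organic growth rather than boosted performance. The function name and scoring are assumptions for illustration only.

```python
def allocate_boost(budget: float, organic_growth: dict) -> dict:
    """Split a fixed boost budget in proportion to each item's organic growth.

    `organic_growth` maps item_id -> a non-negative score measured while the
    item was NOT being boosted (e.g. organic CTR lift). Purely illustrative.
    """
    total = sum(organic_growth.values())
    if total == 0:
        # No item has proven itself organically yet; spread the budget evenly.
        n = len(organic_growth)
        return {item: budget / n for item in organic_growth} if n else {}
    return {item: budget * score / total for item, score in organic_growth.items()}

# Items that grew without help receive most of the next round of boost.
print(allocate_boost(1000.0, {"a": 0.04, "b": 0.01, "c": 0.0}))
# -> {'a': 800.0, 'b': 200.0, 'c': 0.0}
```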

The lesson is simple but hard to implement:

Don’t just measure what happens under control. Observe what happens when you step back.

Only then can we build systems that find not just what performs, but what endures.