Lessons from the trenches: fake tests, real problems.

Why you should stop using inaccurate data for end-to-end testing

Writing an end-to-end (E2E) test is a lot like, well, writing a regular test - the kind you had in school: not particularly fun. Granted, there are precious few developers who genuinely enjoy writing tests, but the value of a good test is undeniable: verifying that whatever we're building actually behaves the way it ought to, from start to finish, before it goes into production. Even accounting for how challenging they can be to implement, it's strange that reliable E2E tests remain out of reach for so many software development teams.

Why is that? We happened to spot DavidKPiano from Stately.ai on Twitter asking exactly this question.

While several developers mentioned the complexity and time investment required, especially for applications consisting of multiple services, one key issue kept being raised: getting production-realistic data into your E2E tests is really hard.

Getting production or production-like data to test against just once is hard enough. Getting it continuously so it stays up to date, making sure it's safe to use (with any compromising personal information removed), and integrating it automatically into CI/CD is a Herculean ask. The problem is that if your E2E tests aren't using data that looks and behaves like production data, you're setting yourself up for trouble. At best, you have tests that pass but offer no guarantee against problems in production; at worst, critical bugs slip straight through.

Imagine for a moment that you're working on a social networking application, building a feature called 'People You May Know', which helps users make new connections. The algorithm for this feature is intricate, considering and weighing multiple factors such as mutual connections, shared interests, geographical location, and more. When it's finally time to test, you seed your database with a few dozen user profiles with basic, straightforward connections and interests.

Initially, everything seems fine. Your E2E tests are designed around this simple data set, and they confirm that 'People You May Know' suggestions appear and that they're based on mutual connections. All tests pass. Everything looks great - confidence is high, and you deploy to production.
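A seed script like that might look something like the sketch below (TypeScript; the `users` and `connections` tables, the schema, and the data are hypothetical stand-ins for whatever your app actually uses):

```typescript
// seed.ts - a minimal, hand-rolled seed script (hypothetical schema and data).
// Note how small and uniform the data set is compared to production.
import { Client } from "pg";

async function seed() {
  const db = new Client({ connectionString: process.env.DATABASE_URL });
  await db.connect();

  // A few dozen near-identical users, each with a single tidy interest.
  const interests = ["gardening", "rock music", "hiking"];
  for (let i = 0; i < 30; i++) {
    await db.query(
      "INSERT INTO users (id, name, interests) VALUES ($1, $2, $3)",
      [i, `user_${i}`, [interests[i % interests.length]]]
    );
  }

  // Simple, symmetrical connections: each user knows the next two.
  for (let i = 0; i < 28; i++) {
    await db.query("INSERT INTO connections (user_a, user_b) VALUES ($1, $2)", [i, i + 1]);
    await db.query("INSERT INTO connections (user_a, user_b) VALUES ($1, $2)", [i, i + 2]);
  }

  await db.end();
}

seed().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

Every profile is near-identical and every connection is tidy - precisely the kind of data a real social graph never produces.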

But once live, you find that users are complaining about nonsensical or even inappropriate suggestions. On closer inspection, you realize your algorithm didn't account for complex scenarios: What if users have hundreds or thousands of connections? What if they have overlapping but fundamentally different interests, like 'gardening' and 'botany,' or 'rock music' and 'classical music'? Your seed script didn't include such nuanced data, so your E2E tests gave you a false positive: they said everything was fine when it wasn't.
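To make that concrete, here's a sketch of the kind of E2E assertion that only means something when the underlying data is realistic (Playwright syntax; the route, test IDs, suggestion cap, and the heavily-connected power_user account are all hypothetical):

```typescript
import { test, expect } from "@playwright/test";

// Against a toy seed, this test passes trivially: every user has a handful
// of clean, mutual connections. Against production-shaped data it has to
// survive a user with thousands of connections and messy, overlapping interests.
test("suggestions stay relevant for a heavily-connected user", async ({ page }) => {
  await page.goto("/profile/power_user/suggestions");

  const suggestions = page.getByTestId("pymk-suggestion");

  // The list should be ranked and capped (10 is a hypothetical limit),
  // not a dump of every second-degree contact.
  await expect(suggestions).toHaveCount(10);

  // Each suggestion should cite at least one concrete signal (mutual
  // connection, shared interest, location), so nonsensical matches fail loudly.
  for (const item of await suggestions.all()) {
    await expect(item.getByTestId("pymk-reason")).not.toBeEmpty();
  }
});
```

The test itself isn't the hard part - it's that no assertion like this can fail meaningfully unless a power_user with a realistic graph actually exists in the test database.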

This is where Snaplet shines in your CI. Snaplet doesn't give you just any data; it gives you a snapshot of your real production data, obfuscated to protect privacy but complex enough to reflect reality. Had your E2E tests run on a Snaplet-seeded database, they would have had to reckon with the kind of rich, intricate user data that your algorithm will actually encounter in the real world. Any shortcomings in how your feature handles this complex data would be revealed before deployment, not after.

Snaplet doesn't just help you find bugs; it helps you understand how your application will truly perform when it matters most - in the hands of real users. That's a level of assurance that seed scripts or fake data simply can't provide in your E2E tests.

It’s also worth noting that the Snaplet platform is built for scale and speed. With compressed snapshots courtesy of Snaplet’s subset feature, you can run more tests in less time without compromising the accuracy of those tests. That means faster iterations, quicker deployments, and a product that's robust because it's been tested against the closest thing to reality.

The bottom line? Snaplet fills in the gaps that existing E2E testing methods miss by ensuring you test against data that's as real as it gets, while still being safe. Your tests can only be as good as the data they run on - without real data, you’re just running fake tests, which only leads to real problems.

Jian Reis
January 24, 2022