Snappy, the Snaplet cat mascot, easily reproduces data-specific bugs with Snapshots

Reproducibility reimagined: Snaplet's the secret sauce for smarter debugging

How Snaplet helps you reproduce data-specific issues so you can fix bugs faster

We've all been there: staring at some potentially errant code, trying to figure out why an inscrutable bug is happening in production, and unable to reproduce it. That is super frustrating, which is why the first step in debugging is to recreate the conditions under which the bug occurred. It's a concept known as reproducibility, and Snaplet is here to make it easier for you and your team to achieve.

What is reproducibility?

Reproducibility, in the context of software development, refers to the ability to consistently recreate the conditions under which an issue, bug, or unexpected behavior occurs in your application. By achieving reproducibility, you can pinpoint the root cause of the problem, develop a suitable fix, and verify that the issue is resolved, all outside of your production environment.

But why? Why start with reproducing the environment in which the bug happened? Why not just dive into the code in production and start poking around?

Here are a few key reasons not to fiddle with production:

Isolation of factors and accuracy:

Recreating the environment helps isolate the factors that actually contributed to the bug. Several may be at play, such as specific data, configurations, or dependencies. By reproducing the environment, you work with the same data, configurations, and system settings that caused the issue, which narrows down the root cause and leads to a more accurate diagnosis and resolution.

Minimizing unintended side effects and verifying the solution:

Debugging in a live production environment can introduce unintended side effects or even cause additional issues. By reproducing the environment, you can safely investigate and test potential solutions without impacting users or the production system. Once you have identified and fixed the issue, the reproduced environment lets you verify that the solution works under the same conditions that initially caused the bug, and that it doesn't introduce new bugs.

As an example, imagine a stock-trading application that processes stock trades for users. Users can place market orders or limit orders to buy and sell stocks. One day, a user reports that their limit order to sell a specific stock was executed at a lower price than the specified limit price, causing them to lose money. To debug, the developer needs to reproduce the environment accurately, considering data and data dependencies, timing and concurrency, third-party API interactions, application configurations, and system infrastructure.
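To make the example concrete, the reported symptom can be written down as a simple check that runs against order data in a debugging environment. This is only a minimal sketch: the field names (`type`, `side`, `limit_price`, `fill_price`) and the sample rows are made up for illustration and do not come from any real trading system.

```python
from decimal import Decimal


def violates_limit(order: dict) -> bool:
    """Return True if a limit sell order was filled below its limit price."""
    return (
        order["type"] == "limit"
        and order["side"] == "sell"
        and order["fill_price"] is not None
        and order["fill_price"] < order["limit_price"]
    )


# Hypothetical rows, standing in for orders loaded from a debugging database.
orders = [
    {"id": 1, "type": "limit", "side": "sell",
     "limit_price": Decimal("50.00"), "fill_price": Decimal("50.10")},
    {"id": 2, "type": "limit", "side": "sell",
     "limit_price": Decimal("50.00"), "fill_price": Decimal("49.80")},
    {"id": 3, "type": "market", "side": "sell",
     "limit_price": None, "fill_price": Decimal("49.75")},
]

# Collect the ids of orders that show the reported symptom.
bad_ids = [order["id"] for order in orders if violates_limit(order)]
```

A check like this only finds the offending rows if the data it runs against contains the same edge cases as production.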

Having production-like data in their debugging environment is crucial, as it helps identify any differences between the user's real-world experience and the intended behavior of the application. Accurate data helps uncover issues related to data integrity, data types, or edge cases that might only emerge with production data.

Getting that production data into your debugging environment is easier said than done, however. With regulations like GDPR and HIPAA in effect, how do you get production-realistic data to your development team without running afoul of compliance and compromising security? Enter Snaplet!

Snappy, Snaplet's cute cat mascot, tries to catch a data-specific bug

What is Snaplet and how do we help?

Snaplet is a tool that helps developers create anonymized, production-like data for development purposes. By de-identifying your production database and creating a snapshot of your data, Snaplet allows you to work with high-quality, production-like data without compromising security and compliance.

Snaplet helps you achieve reproducibility and speeds up debugging by letting you define the size of your database sample and which data are captured and transformed, right down to the individual tables, all from within the Snaplet Cloud app. This is called subsetting, and it gives you a useful, representative sample of your data without the hassle of pulling an enormous dump of production data.
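The idea behind subsetting can be sketched in a few lines. The code below is not how Snaplet implements it; it is a toy illustration on in-memory tables, assuming a hypothetical `users` table and an `orders` table that references it through a `user_id` column.

```python
import random


def subset(users, orders, fraction, seed=0):
    """Sample a fraction of users and keep only the orders that reference them."""
    rng = random.Random(seed)  # fixed seed so the subset itself is reproducible
    keep_count = max(1, round(len(users) * fraction))
    sampled_users = rng.sample(users, keep_count)
    kept_ids = {user["id"] for user in sampled_users}
    # Following the foreign key keeps the sample consistent: no orphaned orders.
    sampled_orders = [order for order in orders if order["user_id"] in kept_ids]
    return sampled_users, sampled_orders


# Hypothetical data: 100 users with 5 orders each.
users = [{"id": i} for i in range(1, 101)]
orders = [{"id": n, "user_id": (n % 100) + 1} for n in range(1, 501)]

small_users, small_orders = subset(users, orders, fraction=0.1)
```

The important design choice is sampling the parent table first and then following the relationship, so every row in the smaller dataset still points at a row that exists.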

You can also use Snaplet to anonymize sensitive information to maintain compliance with data privacy regulations. Not only does Snaplet use Copycat to anonymize all the data in your snapshot, but it does so in a way that retains the 'shape' of your data. For example, email addresses will be safely transformed into new values that look like email addresses. The same applies to names, credit card numbers, addresses, and other personally identifiable information.
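To illustrate what retaining the 'shape' of a value means, here is a minimal sketch of a deterministic email transform. It is not Copycat's implementation or API; the function name, the salt, and the word list are assumptions made up for this example, and it uses only Python's standard library.

```python
import hashlib

# Hypothetical word list used to build fake local parts; not from Snaplet or Copycat.
FAKE_NAMES = ["alex", "blake", "casey", "devon", "ellis", "finley", "gray", "harper"]


def anonymize_email(email: str, salt: str = "dev-only-salt") -> str:
    """Map a real email to a fake one that still looks like an email.

    The same input (and salt) always produces the same output, so values
    that matched before anonymization still match afterwards.
    """
    digest = hashlib.sha256((salt + email.lower()).encode("utf-8")).digest()
    name = FAKE_NAMES[digest[0] % len(FAKE_NAMES)]
    # Four digest bytes become a numeric suffix, making collisions less likely.
    suffix = int.from_bytes(digest[1:5], "big") % 100000
    # example.com is reserved for documentation, so no real mailbox is targeted.
    return f"{name}{suffix:05d}@example.com"


fake = anonymize_email("jane.doe@acme.io")
```

Because the mapping is deterministic, a join on the email column or a lookup by email behaves the same way on the anonymized data as it did on the original, which matters when the bug you are chasing depends on how rows relate to each other.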

Lastly, you can easily share snapshots with your team, ensuring everyone is working with the same data. Snapshots can also be captured on a schedule, keeping everyone coding against the same, up-to-date data.

Achieving reproducibility is an essential aspect of effective software development, especially when debugging complex issues. By using Snaplet to create production-like data for your development environment, you can ensure that you're working with accurate data and maintain compliance with data privacy regulations. With Snaplet, you can simplify the process of achieving reproducibility and streamline your development efforts, so you can spend more time on the things that matter: building great software and delighting your users. Don't let bad data block you ever again!

Jian Reis
January 24, 2022