Jian Reis
August 1, 2022

Scrambling. It's not just for eggs anymore!

Real data in, scrambled data out. Now with exactly the same number of characters, and the ability to preserve characters.

How is getting production-realistic data into your development environment like the perfect breakfast egg? Everyone has their own preference - scrambled, poached or fried; firm or runny; served on the toast or off to the side. While Snaplet and Copycat already offer a tremendous amount of control and flexibility over how you transform your production data and restore it into whatever database you need it in, we're taking it up a notch with our new feature: copycat.scramble.

The newest addition to the Copycat family, copycat.scramble takes a string value and outputs a string of exactly the same length, transforming each individual character along the way. The transformations follow a few rules:
  • By default, spaces are preserved.
  • Lower case ASCII letters are replaced with lower case ASCII letters.
  • Upper case ASCII letters are replaced with upper case ASCII letters.
  • Digits are replaced with digits.
  • Any other ASCII character in the code point range 32 to 126 (0x20 - 0x7e) is replaced with either an alphanumeric character, or "_", "-", or "+".
  • Any other character is replaced with a Latin-1 character in the range 0x20 - 0x7e or 0xa0 - 0xff.
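
To make the rules above concrete, here is a minimal TypeScript sketch of the per-character mapping. It is not Copycat's implementation: the names sketchScramble, poolFor and makeRandom are hypothetical, and the seeded generator is only an assumption used to keep the sketch repeatable. It also works on UTF-16 code units, which is one possible reading of "same length".

```typescript
// Illustrative sketch only; this is not Copycat's implementation.
const LOWER = "abcdefghijklmnopqrstuvwxyz";
const UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const DIGITS = "0123456789";
const OTHER_ASCII = LOWER + UPPER + DIGITS + "_-+";

// Latin-1 pool: code points 0x20-0x7e and 0xa0-0xff.
function buildLatin1Pool(): string {
  let pool = "";
  for (let cp = 0x20; cp <= 0xff; cp++) {
    if (cp <= 0x7e || cp >= 0xa0) pool += String.fromCharCode(cp);
  }
  return pool;
}
const LATIN1 = buildLatin1Pool();

// Choose the replacement pool for one UTF-16 code unit, following the rules above.
function poolFor(ch: string): string {
  const cp = ch.charCodeAt(0);
  if (cp >= 0x61 && cp <= 0x7a) return LOWER;       // a-z
  if (cp >= 0x41 && cp <= 0x5a) return UPPER;       // A-Z
  if (cp >= 0x30 && cp <= 0x39) return DIGITS;      // 0-9
  if (cp >= 0x20 && cp <= 0x7e) return OTHER_ASCII; // other printable ASCII
  return LATIN1;                                    // everything else
}

// Hypothetical stand-in for the library's randomness: a seeded linear
// congruential generator, so the sketch gives repeatable output.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

function sketchScramble(input: string, seed: number = 1): string {
  const random = makeRandom(seed);
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === " ") {
      out += ch; // spaces are preserved by default
      continue;
    }
    const pool = poolFor(ch);
    out += pool.charAt(Math.floor(random() * pool.length));
  }
  return out;
}

const original = "Order #42 for Ada!";
const scrambled = sketchScramble(original);
console.log(scrambled.length === original.length); // true
```

Every input character contributes exactly one output character drawn from the pool for its class, which is why the length and the rough "shape" of the string survive.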

Why is this useful? Sometimes the exact shape of your data (its length, in this case) matters, and you need to transform a string without changing how long it is. For example, imagine you're trying to debug a problem in a chat-based application. Having real, accurate data in the form of conversation logs would be enormously useful, but you obviously can't use sensitive and protected user chats directly. By transforming that sensitive information via Copycat, you're able to safely populate your database with production-accurate data to debug an issue. With copycat.scramble, the transformed data is exactly the same length as the source data and keeps the same mix of letters, digits and spaces, so it is likely to look and behave similarly too.

As an added bonus, copycat.scramble also supports the preserve option, allowing you to exclude characters from transformation. Great if you want to keep parts of the string recognizable - like @ signs for emails, for example.
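
As a rough illustration of what a preserve list does, the sketch below skips the listed characters and transforms everything else. The helper names transformExcept and standIn are hypothetical, and standIn is a deliberately simple, predictable transform used only so the effect is easy to see; it is not how copycat.scramble picks replacements. In Copycat itself the option is, as far as I know, passed as a second argument, along the lines of copycat.scramble(value, { preserve: ['@', '.'] }); check the Copycat documentation for the exact signature.

```typescript
// Hypothetical helper, not part of Copycat: apply a per-character transform
// to every character except those listed in `preserve`.
function transformExcept(
  input: string,
  preserve: string[],
  transformChar: (ch: string) => string
): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    out += preserve.indexOf(ch) !== -1 ? ch : transformChar(ch);
  }
  return out;
}

// Stand-in transform for illustration only: shift lower case ASCII letters
// by one (a -> b, ..., z -> a) and turn everything else into "#".
function standIn(ch: string): string {
  const code = ch.charCodeAt(0);
  if (code < 97 || code > 122) return "#";
  return String.fromCharCode(97 + ((code - 97 + 1) % 26));
}

// Without a preserve list the email structure is lost...
console.log(transformExcept("foo@bar.org", [], standIn)); // "gpp#cbs#psh"
// ...while preserving "@" and "." keeps it recognizable as an email.
console.log(transformExcept("foo@bar.org", ["@", "."], standIn)); // "gpp@cbs.psh"
```

Preserved characters stay at their original positions, so the output still has the same length and the same separators as the input.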

For more information, check out the Copycat documentation.

PS: Worried about your transformation outputs being inferred? It's highly unlikely (and computationally infeasible), but Copycat does support salt with setSalt - read more about working with sensitive inputs here.
