Here at Snaplet, life is all about PII. In fact, our whole team is obsessed with it. I'm not talking about the lobster pie that Snappy (head of naps and smiles) loves so much. What I'm referring to is a whole different recipe, one that could leave a bad taste in your mouth if not handled with care. So what is PII then? Let's dig in.
PII is short for Personally Identifiable Information, also known as personal data. The definition is broad, but it generally refers to any information that can be used to identify a living person. This could be as straightforward as a name, but it also includes less obvious data such as IP addresses, geolocation, biometric and behavioural data, and even online identifiers such as cookies.
The most common PII fields in datasets include names, addresses, emails and passwords; telephone, passport, driver's license, credit/debit card and social security numbers; as well as race, age, gender, job position, workplace and educational history.
Data has become a permanent fixture in most companies. Large amounts of personal data are stored daily, whether on a local server or somewhere in the cloud. Something that may not be sensitive on its own can turn into PII as soon as a second piece of information becomes available that ties it to a specific individual. Companies collect seemingly unimportant information that, when not handled with care, could end up in the wrong hands. Some companies even sell data for large sums. Criminals can use personal information to set up fake accounts, steal proprietary information, commit fraud and so on. For the individuals whose information is being used, stored or sold, this is worrying.
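To make the "secondary piece of information" point concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the two tables, the field names and the `link` helper are assumptions, not real data or part of any tool. It shows how a table with no names in it can still identify someone once it is matched against a second table on a few shared fields.

```python
# Hypothetical data: neither table on its own ties a name to a diagnosis.
visits = [
    {"zip": "8001", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "8001", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
]
newsletter = [
    {"name": "Alice Example", "zip": "8001", "birth_year": 1985, "gender": "F"},
    {"name": "Bob Example", "zip": "7700", "birth_year": 1990, "gender": "M"},
]

# Fields that are harmless alone but identifying in combination (an assumption
# for this example; real datasets need their own analysis).
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def link(anonymous_rows, named_rows, keys=QUASI_IDENTIFIERS):
    """Attach a name to every 'anonymous' row that matches exactly one named row."""
    linked = []
    for row in anonymous_rows:
        matches = [n for n in named_rows if all(n[k] == row[k] for k in keys)]
        if len(matches) == 1:  # a unique match means the person is re-identified
            linked.append({"name": matches[0]["name"], **row})
    return linked


reidentified = link(visits, newsletter)
```

In this toy case the first visit matches exactly one newsletter subscriber, so the "nameless" diagnosis now belongs to a named person. The second visit finds no match and stays unlinked.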
To protect these individuals, the European Union (EU) has taken measures to define and protect personal data. The General Data Protection Regulation (GDPR) came into effect on May 25, 2018 and impacts anyone who uses the personal data of EU residents. The GDPR is widely regarded as the gold standard, since it protects the data privacy of EU citizens and residents no matter where in the world the company using that data is located. Other jurisdictions have similar legislation, such as the California Consumer Privacy Act (CCPA) in the United States, the Lei Geral de Proteção de Dados Pessoais (LGPD) in Brazil, and the Protection of Personal Information Act (POPIA) in South Africa. These laws impose fines for non-compliance and data breaches.
Why do we care?
As a web developer, app creator or product owner, you're aware that some traces left behind by users of your product could be sensitive in nature. Security and legal compliance become increasingly important if these traces can be used to identify individuals. Since real-world data (or something as close to it as possible) is needed to build and test apps and products, privacy issues come in as early as research and development. Cleaning up sensitive data is tedious work, and it gets harder to maintain as databases grow. Because it is such a schlep, some software developers simply copy their production databases to their personal laptops. This is a slow and painful process that involves downloading large amounts of data. We know, however, that most developers take data privacy seriously, and that is why we created Snaplet.
How we help
Snaplet is a tool for developers that helps them create copies of production databases with de-identified data for use during development or testing. We help them identify sensitive data in a database, make it quick and easy to anonymize PII fields, and reduce restoration time by eliminating unnecessary tables. The snapshots contain a de-identified copy of their database in the cloud, which means they can test safely from their local development machines. We take the schlep out of anonymization. Now that's easy as PII.
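For a feel of what "identify sensitive data" and "anonymize PII fields" can mean in practice, here is a rough Python sketch. It is not Snaplet's implementation or API. The column-name pattern, the `find_pii_columns`, `pseudonymize` and `deidentify_row` helpers, and the salt are all hypothetical, chosen only to illustrate the general idea. Note also that salted hashing is strictly pseudonymization rather than full anonymization, so treat this as a starting point only.

```python
import hashlib
import re

# Assumed heuristic: flag columns whose names hint at personal data.
PII_COLUMN_PATTERN = re.compile(
    r"(name|email|phone|address|passport|ssn|password)", re.IGNORECASE
)


def find_pii_columns(columns):
    """Return the column names that look like they hold PII."""
    return [c for c in columns if PII_COLUMN_PATTERN.search(c)]


def pseudonymize(value, salt="dev-only-salt"):
    """Replace a value with a short, deterministic, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:12]


def deidentify_row(row):
    """Copy a row, replacing the values of PII-looking columns."""
    pii = set(find_pii_columns(row.keys()))
    return {k: (pseudonymize(v) if k in pii else v) for k, v in row.items()}


row = {"id": 42, "full_name": "Alice Example", "email": "alice@example.com", "plan": "pro"}
safe_row = deidentify_row(row)
```

Here `id` and `plan` pass through untouched, while `full_name` and `email` are replaced. Because the replacement is deterministic, the same input always maps to the same token, which keeps joins between tables working in a test database.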