Here at Snaplet, life is all about PII. In fact, our whole team is obsessed with it. I'm not talking about the lobster pie that Snappy (our CEO of naps and smiles) loves so much. What I'm referring to is much less tasty, spelled a bit differently and, if not treated with care, could be a recipe for disaster. So what is PII then? Let's delve in.
PII is short for Personally Identifiable Information, also known as personal data. The definition is quite broad, but it generally refers to any information that can be used to identify a living person. This could be as straightforward as a name, but it also includes less obvious data such as IP addresses, geolocation, biometric and behavioural data, and even online identifiers such as cookies.
The most common PII fields in datasets include names, addresses, emails and passwords; telephone, passport, driver's license, credit/debit card and social security numbers; as well as race, age, gender, job position, workplace and educational history.
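To make that list a little more concrete, here is a minimal sketch of how you might flag column names that look like PII. The keyword list and the `flag_pii_columns` helper are hypothetical, made up for illustration only; this is not how Snaplet detects PII, and a simple substring match like this will both over-flag (for example `filename`) and miss things.

```python
# Hypothetical keyword list, loosely based on the fields named above. Not exhaustive.
PII_KEYWORDS = (
    "name", "address", "email", "password", "phone", "passport",
    "license", "card", "ssn", "birth", "gender", "race",
)


def flag_pii_columns(columns):
    """Return the column names that contain any PII keyword (case-insensitive)."""
    flagged = []
    for column in columns:
        lowered = column.lower()
        # Plain substring match: crude, but enough to produce a first review list.
        if any(keyword in lowered for keyword in PII_KEYWORDS):
            flagged.append(column)
    return flagged


columns = ["id", "full_name", "Email", "created_at", "phone_number"]
print(flag_pii_columns(columns))  # ['full_name', 'Email', 'phone_number']
```

A list like this is only a starting point for a human review, since PII can hide in columns with perfectly innocent names.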
Data has become a permanent fixture in most companies. Large amounts of personal data are stored daily, whether on a local server or somewhere in the cloud. Something that may not be sensitive on its own could turn into PII as soon as a second piece of information becomes available that, combined with it, identifies a specific individual. Companies collect seemingly unimportant information that could be devastating if it ends up in the wrong hands. Some companies even sell data for large sums of money. Criminals can use personal information to set up fake accounts, steal proprietary information, commit fraud and worse. For the individuals whose information is being used, stored or sold, this is worrying.
To protect these individuals, the European Union (EU) has taken measures to define and protect personal data. The General Data Protection Regulation (GDPR) came into effect on May 25, 2018 and impacts anyone who uses the personal data of EU residents. The GDPR is often seen as the gold standard since it protects the data privacy of EU citizens and residents no matter where in the world the company using that data is located. Since then, various jurisdictions have introduced similar legislation, such as the California Consumer Privacy Act (CCPA) in the United States, the Lei Geral de Proteção de Dados Pessoais (LGPD) in Brazil, and the Protection of Personal Information Act (POPIA) in South Africa. These laws can impose heavy fines for non-compliance and data breaches.
Why do we care?
As a web developer, app creator or product owner, you need to be aware that some traces left behind by users of your product could be sensitive in nature. Security and legal compliance become increasingly important if these traces can be used to identify individuals. This means that you or your company could be held accountable for data breaches or privacy violations. Since real-world data (or something as close to it as possible) is needed to build and test apps and products, the potential for violating these laws comes in as early as research and development. Cleaning up sensitive data is a tedious task that requires a lot of work. It is also difficult to maintain as databases grow. Because it is such a schlep, a large number of software developers copy their production databases to their personal laptops. Apart from being illegal, this is a slow and painful process that involves large amounts of data being downloaded. That is why we created Snaplet.
How we help
Snaplet is a tool for developers that helps them create copies of production databases with mock data for use during development or testing. We help them identify sensitive data in a database, make it quick and easy to anonymize PII fields, and reduce restoration time by eliminating unnecessary tables. The snapshots contain a de-identified copy of their database in the cloud, which means they can test safely from their local development machines. We take the schlep out of anonymization. Now that's easy as PII.
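For the curious, here is a rough sketch of the general idea of swapping PII fields for stand-in values before data leaves production. Everything in it (the `mock_value` and `anonymize_rows` helpers, the salt, the sample row) is hypothetical and only for illustration; it is not Snaplet's implementation or API. It assumes rows are plain dictionaries, and it uses a salted hash so the same input always maps to the same stand-in value. Strictly speaking that is closer to pseudonymization than to full anonymization.

```python
import hashlib


def mock_value(column, value, salt):
    """Derive a deterministic stand-in for a PII value (hypothetical helper)."""
    digest = hashlib.sha256(f"{salt}:{column}:{value}".encode("utf-8")).hexdigest()
    # Keep the column name as a prefix so the stand-in is easy to recognise.
    return f"{column}_{digest[:10]}"


def anonymize_rows(rows, pii_columns, salt):
    """Return copies of the rows with the listed PII columns replaced."""
    anonymized = []
    for row in rows:
        clean = {}
        for column, value in row.items():
            if column in pii_columns and value is not None:
                clean[column] = mock_value(column, value, salt)
            else:
                # Non-PII columns and NULLs pass through untouched.
                clean[column] = value
        anonymized.append(clean)
    return anonymized


rows = [{"id": 1, "email": "ada@example.com", "plan": "pro"}]
print(anonymize_rows(rows, {"email"}, salt="local-dev-only"))
```

Because the replacement is deterministic, the same email in two tables still lines up after the swap, which keeps joins working in a test database. The salt should be kept secret and out of the snapshot, otherwise common values can be guessed and re-hashed.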