This blog post was inspired by posts by Mitchell Davis of The Privacy Rights Council.
The General Data Protection Regulation (GDPR) is a European Union (EU) law coming into effect on May 25, 2018. It’s intended to protect the data privacy of EU residents, and it also governs the export of personal data outside the EU. Because it extends to a resident’s professional life, it will affect firms that sell business contact information for European firms and their employees.
Enforcement will start country-by-country in Europe (Germany being aggressive, UK being lax) with regulators first looking at firms that completely ignore the regulation. Being proactive about GDPR is, therefore, key because no one knows exactly how this will play out. Those who are trying their best to comply will likely be spared the EU’s initial regulatory wrath.
So, for purveyors of directory and mailing list data, I suggest the following preemptive approach to the “notice requirements” on the retention time for contact information. Specifically, data content producers can take the “Best if Used By” date approach mandated by food regulators. In other words, contact data for organizations and business executives, which “goes bad” every six months or so on average due to job title changes, job moves, corporate relocations, corporate restructuring, etc., could be stamped with a “production date” from which an “expiration date” could then be assigned.
The idea of adding such provenance metadata to commercial databases:
- would show a primary source where a user could see if the data were still accurate;
- would be relatively easy to comply with;
- would spur providers to update their data more frequently; and,
- would pave the way for the eventual forensic analysis of algorithms within information systems that depend on the underlying data. (The algorithmic implications are not so great for contact data but are substantial for providers of firmographic data on company size, facility size, etc. For instance, if a country regulator sends out a legal notice to the water quality officers at all water treatment plants larger than X then the source of the value “X” is quite important.)
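To make the “production date” idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ContactRecord` class, its field names, the sample record, and the 182-day shelf life (my rough stand-in for “six months or so”). It is not a format the GDPR prescribes.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: roughly six months, per the estimate in this post.
DEFAULT_SHELF_LIFE = timedelta(days=182)


@dataclass(frozen=True)
class ContactRecord:
    """A hypothetical contact record carrying provenance metadata."""
    name: str
    title: str
    organization: str
    source: str             # primary source where accuracy can be re-checked
    production_date: date   # date the record was created or last verified
    shelf_life: timedelta = DEFAULT_SHELF_LIFE

    @property
    def expiration_date(self) -> date:
        # The "best if used by" date is derived, never stored separately.
        return self.production_date + self.shelf_life

    def is_stale(self, today: date) -> bool:
        # A record is stale only after its expiration date has passed.
        return today > self.expiration_date


# Made-up sample record for illustration only.
record = ContactRecord(
    name="Jane Example",
    title="Water Quality Officer",
    organization="Example Waterworks",
    source="https://example.com/staff",
    production_date=date(2018, 5, 25),
)

print(record.expiration_date)               # 2018-11-23
print(record.is_stale(date(2019, 1, 1)))    # True
```

Deriving the expiration date from the production date, rather than storing both, keeps the two from drifting apart when a provider re-verifies a record and resets its production date.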
Data anonymization, along with the related technique the GDPR calls “pseudonymization,” is another fascinating area covered by the regulation. It’s generally agreed that data in its aggregate or otherwise anonymized form (i.e., not directly related to a specific person or firm) is both incredibly valuable to analysts and regulators and worthy of protection. Ideally, an organization would own and store its own data locally and would only release selected (encrypted) data points to authorized requesters who were pre-approved to use them in aggregated or anonymous applications. Presumably, public institutions would be ‘open’ to all requests by default, so European government organizations could themselves spearhead this long-term effort.
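As a rough illustration of what releasing only pseudonymized or aggregated data might look like, here is a short Python sketch. The function names, the sample records, and the minimum group size are all hypothetical. The keyed hash (HMAC-SHA256) is one common pseudonymization technique, not one the regulation mandates, and data pseudonymized this way is generally still treated as personal data because whoever holds the key can re-link it.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    Without the secret key, the token cannot easily be tied back to
    the original identifier. The key holder can still re-link it.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_by(records, field, min_group_size=2):
    """Count records per value of `field`, dropping groups that are
    too small to hide an individual (the threshold is an assumption)."""
    counts = Counter(r[field] for r in records)
    return {value: n for value, n in counts.items() if n >= min_group_size}


# Made-up sample data for illustration only.
contacts = [
    {"email": "a@example.com", "country": "DE"},
    {"email": "b@example.com", "country": "DE"},
    {"email": "c@example.com", "country": "UK"},
]

key = b"kept-locally-by-the-data-owner"  # hypothetical secret, never released
released = [
    {"token": pseudonymize(c["email"], key), "country": c["country"]}
    for c in contacts
]

print(aggregate_by(released, "country"))  # {'DE': 2}
```

The data owner keeps both the raw records and the key locally; a pre-approved requester would see only the tokens or, better still, only the aggregate counts.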
An effective anonymized data-sharing framework also sidesteps the issue of inadvertent data exposure and is consistent with the related EU principle of ‘data portability’: the ability to take one’s data and give it to service providers with a set of permissions related to its use and protection. Portability would allow individuals to easily sign on to new personal services, and organizations could quickly submit their own data to auditors, insurers, etc. without wasting internal staff time.
In conclusion, I’m usually very apprehensive about the EU’s desires to regulate things without fully understanding them, but I see some reasons for hope in the principles espoused by the EU in this case. If the owners of information services take the proactive first steps toward transparency on data provenance (acknowledging, via text footnotes, that this is being done in order to comply with GDPR regs) I am cautiously optimistic that Europe’s regulators will initially err on the side of caution and will then, over time, side with Europe’s active open data communities to allow this effort to move forward in a logical, non-onerous way that will benefit both the owners and the users of data.