It is widely believed that big data constitutes a threat to personal privacy. The vast amounts of poorly managed information available allow privacy breaches to occur. To make privacy possible, the gathered information should be anonymised by removing personally identifiable information.
However, many believe that true de-identification is not always possible, because there are situations in which data can be re-identified. Some go as far as to say that anonymity is a myth. Moreover, with the spread of information online, it is possible to re-identify previously ‘anonymized’ data. It has been calculated that about 4 in every 10,000 individuals who appear in datasets anonymised in accordance with HIPAA standards can be re-identified.
The number of individuals who can be re-identified suggests that anonymity is achievable to a certain extent. To provide individuals with a sense of privacy, organisations are required to carry out a threat analysis on datasets before releasing them to the public, and to check for datasets available online that could be used to re-identify the people in them. However, not many organisations do this, as it is a labour-intensive and time-consuming process that requires serious data management and statistics skills. Organisations should also keep reassessing their de-identification strategies, and the re-identification techniques they guard against, as technology improves, so that public datasets remain usable in the future.
Anonymisation and pseudonymisation of data can be quite effective when carried out properly. They protect the privacy rights of individual data subjects and allow organisations to balance this right to privacy against their legitimate goals. These techniques can form part of a privacy-by-design strategy, which provides advanced protection for data subjects; part of a risk minimisation strategy when sharing data with data processors or other data controllers; or a way to avoid unintended data breaches when staff access personal data. They can also be part of a data minimisation strategy aimed at reducing the risk that a data breach poses to data subjects.
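To make the idea of pseudonymisation more concrete, here is a minimal Python sketch. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation method, which is only one of several possible approaches; the field names, the key, and the function names are hypothetical and chosen purely for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be generated randomly
# and stored separately from the pseudonymised data.
SECRET_KEY = b"replace-with-a-randomly-generated-key"

# Hypothetical names of the fields that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymise(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash of value; the same value and key give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_record(record: dict, id_field: str = "email") -> dict:
    """Drop direct identifiers and add a pseudonym derived from id_field."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["subject_id"] = pseudonymise(record[id_field])
    return cleaned


record = {"name": "Jane Doe", "email": "jane@example.com", "purchases": 3}
safe_record = pseudonymise_record(record)
```

Because the token is derived with a key, anyone holding the key can recompute the link between a person and their records. This is why data treated this way is pseudonymised rather than anonymised.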
To ensure your data is anonymous for data protection purposes, you should examine the means and the available datasets that could be used to re-identify a data subject. Anonymisation strategies should be decided on a case-by-case basis, taking into consideration the most relevant risk factors as well as the purpose of the anonymisation. Nevertheless, even with a detailed risk assessment, it is not certain that an individual will never be re-identified from an anonymised dataset.
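One simple way to start such an examination is to count how many records share the same combination of indirectly identifying attributes. This is often described as a k-anonymity check. The sketch below is an illustration only, not a full risk assessment; the column names and the threshold are assumptions made for the example.

```python
from collections import Counter

# Hypothetical quasi-identifiers: attributes that do not name a person
# but could be linked with other datasets to single someone out.
QUASI_IDENTIFIERS = ("age_band", "postcode_area")


def group_sizes(rows, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group(rows, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the size of the smallest group (the k in k-anonymity), or 0 if empty."""
    return min(group_sizes(rows, quasi_identifiers).values(), default=0)


def rows_at_risk(rows, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the rows whose quasi-identifier combination appears fewer than k times."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] < k
    ]


rows = [
    {"age_band": "30-39", "postcode_area": "AB1", "spend": 120},
    {"age_band": "30-39", "postcode_area": "AB1", "spend": 80},
    {"age_band": "40-49", "postcode_area": "AB1", "spend": 200},
]
k = smallest_group(rows)
at_risk = rows_at_risk(rows, k=2)
```

Here the third row is the only one in its group, so it is the easiest to single out. Coarsening or suppressing its attributes would be one option to consider before release.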
Organisations should keep in mind that the less information they provide on individuals, the less likely re-identification becomes, as connecting the data is more difficult. It is safe to say that the risk of re-identification grows with data availability: the more data is made readily available, the higher the chances of re-identification. Moreover, data breaches increase the chances of re-identification, regardless of location.
Having a well-organised data management strategy is essential, as it ensures efficient data movement with minimal risk. This ties in with real-time data analytics, which can keep data on a ‘need-only’ basis and discard any that is no longer relevant, thus minimising the data available and decreasing the chances of re-identification.
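As a rough illustration of keeping data on a ‘need-only’ basis, the Python sketch below drops records older than a retention window. The 30-day window, the field names, and the function name are hypothetical; the right retention period depends on the purpose the data was collected for.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window, used only for this example.
RETENTION = timedelta(days=30)


def keep_needed(records, now, retention=RETENTION):
    """Return only the records collected within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
records = [
    {"subject_id": "a1", "collected_at": datetime(2024, 6, 25, tzinfo=timezone.utc)},
    {"subject_id": "b2", "collected_at": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]
current = keep_needed(records, now)
```

Running a filter like this on a schedule means that a breach or a public release exposes less data, which in turn leaves less material for re-identification.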
All in all, big data is inescapable. It is an important aspect of any marketing strategy, and one that every marketer should utilise. Although the risks associated with big data are great, the results when it is handled correctly are just as great. So, as a marketer, it is important to set the parameters that your needs and purposes require in order to keep risks as low as possible.