Because data has value, and many research actors, such as universities and other research teams, are interested in access to high-quality, well-controlled data, it is often worthwhile to make data available to other parties, even when it contains personal data. One simple and logical way to classify data is by whether it contains personal data, i.e. whether or not you need to be concerned about matters covered by the European Union's General Data Protection Regulation. Many datasets that are valuable both to the company itself and to research contain personal information, and regulation restricts how such data can be used. What follows is therefore a brief lesson on what the GDPR means and how data can still be shared, for example by pseudonymizing or anonymizing it.
The General Data Protection Regulation (GDPR) is an EU data protection regulation that has applied since May 2018 and gives individuals several rights over the data collected about them. If, for example, information about a person is collected on a website (e.g. during registration), the person has the right to inspect, correct and delete his or her personal data and to object to and restrict its processing. In addition, the person has the right to request the transfer of his or her personal data from one controller to another. In practice, this means that unless you meet the obligations set out in the regulation, you cannot collect personal data even from the staff of your own company. Therefore, you must ask everyone who deals with your company for permission to collect their data, tell them what the data will be used for and also allow the data to be deleted.
From this point of view, making valuable data that contains personal data available to other stakeholders is challenging on a practical level. For example, when collecting personal data, you can ask individuals for permission to share their data with a particular research institute. Once you have that permission you can share the data, but how do you proceed when requests to inspect or delete data come in, how do you ensure that all parties act correctly in these situations, and how do you avoid tens of hours of manual work for each request? One option is, of course, to build technical implementations and interfaces through which these tasks are automated, and to agree on them contractually between the companies for each use case. While this is possible, the data must be of tremendous value for the costs of these solutions and contracts to be financially viable.
Then there are much lighter solutions, in which personal data is removed from the dataset while the relevant information is preserved. One option is to pseudonymize the data and the other is to anonymize it. Pseudonymization means replacing the information that identifies a person with substitute information, so that the data can be linked back to the person only by means of a separate connecting factor. If you store personal information as part of a dataset in this way, you must take care that the connecting factor is not available to other parties. This ensures that nobody other than you knows whom the data concerns, so GDPR-related requests are handled by you alone. A common practical example of pseudonymization is the use of an encryption key: personal data is encrypted into an unrecognizable form, and without the connecting factor, i.e. the encryption key, the data cannot be restored to its original form.
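As a rough illustration of the idea, here is a minimal Python sketch. Note that it swaps in a different technique from the encryption described above: it uses a keyed hash (HMAC-SHA256) from the standard library, and the way back to the original identifier is a lookup table that only the data owner keeps. The field names, the sample records and the function names are hypothetical and chosen only for this example.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key; it plays the role of the connecting factor
# and would be stored separately from the data that is shared.
SECRET_KEY = secrets.token_bytes(32)


def pseudonym(identifier: str, key: bytes) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; truncation raises collision risk


def pseudonymize(records, id_field, key):
    """Replace id_field with a pseudonym.

    Returns the shareable records and a private lookup table
    (pseudonym -> original identifier) that must never be shared.
    """
    shared = []
    lookup = {}
    for record in records:
        token = pseudonym(record[id_field], key)
        lookup[token] = record[id_field]
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["pseudonym"] = token
        shared.append(cleaned)
    return shared, lookup


# Made-up sample data: the same person appears twice.
records = [
    {"email": "anna@example.com", "visits": 12},
    {"email": "ben@example.com", "visits": 3},
    {"email": "anna@example.com", "visits": 7},
]
shared, lookup = pseudonymize(records, "email", SECRET_KEY)
```

The same person always receives the same pseudonym, so the recipient can still analyze the data per individual, while only the holder of the key and the lookup table can tell who is behind a pseudonym or find the rows to delete when a deletion request arrives.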
The second option, anonymization, is very similar to pseudonymization but without the possibility of restoring the data. In other words, personal data is either removed entirely from the dataset to be shared or changed into another form (for example, aggregated) with no possibility of retrieval, so that all the shared data is in a form the regulation allows without requiring permission from users. Both pseudonymization and anonymization require great care, and the person doing the task must clearly understand what is being done. If you intend to share a dataset that contains, for example, both a personal identity number and a street address, and you delete only the personal identity number, can someone still work out, by combining the street address with other information, whom the data was collected from? Yes, they can, and in that case you have shared personal data under the GDPR even though you deleted part of it.
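The street address example can be made concrete with a small Python sketch. It counts how many rows share the same values in the fields that remain after identifiers have been removed; a group of size one means that a row still points to a single person. All field names and records here are made up for illustration, and the check is loosely in the spirit of k-anonymity: it is a sanity check under these assumptions, not proof that a dataset is anonymous.

```python
from collections import Counter

# Hypothetical names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "identity_number"}


def anonymize(records):
    """Drop direct identifiers and the street address, keeping the postal code."""
    result = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        row.pop("street_address", None)  # too precise to share
        result.append(row)
    return result


def smallest_group(records, fields):
    """Size of the smallest group of rows that share the same values in fields."""
    groups = Counter(tuple(r[f] for f in fields) for r in records)
    return min(groups.values())


# Made-up sample data.
records = [
    {"name": "A", "identity_number": "ID-001", "street_address": "Main Street 1",
     "postal_code": "00100", "purchases": 4},
    {"name": "B", "identity_number": "ID-002", "street_address": "Main Street 2",
     "postal_code": "00100", "purchases": 9},
    {"name": "C", "identity_number": "ID-003", "street_address": "Lake Road 5",
     "postal_code": "00200", "purchases": 2},
    {"name": "D", "identity_number": "ID-004", "street_address": "Lake Road 7",
     "postal_code": "00200", "purchases": 5},
]

# Deleting only the identity number leaves every address unique.
only_id_removed = [
    {k: v for k, v in r.items() if k != "identity_number"} for r in records
]
risky = smallest_group(only_id_removed, ["street_address"])

# After coarsening, each postal code covers more than one person.
anonymized = anonymize(records)
safer = smallest_group(anonymized, ["postal_code"])
```

In this sample, `risky` is 1 because every street address occurs only once, while `safer` is 2 because each postal code covers two people. Two is still a very small group, so a real dataset would need far larger groups and a careful review of which other information could be combined with the remaining fields.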
So there are some possibilities for sharing GDPR-related information, but it needs to be done correctly and very carefully. This is by no means intended to intimidate you but to encourage you to do it properly, as data is of immense value!