In recent years, a number of data privacy acts around the globe have come into effect to fight data exploitation. While not explicitly targeting AI, these regulations protect data, automatically bringing AI development within their scope. Each privacy act comes with its own set of requirements and financial penalties; find out how they differ and what you can do to ensure regulatory compliance when developing your next AI project.
To make intelligent, well-informed decisions, AI models must be trained on vast volumes of diverse data from myriad sources. At the same time, dealing with large unstructured datasets from multiple sources carries privacy risks and possible legal repercussions.
More and more data privacy acts around the globe are being introduced to safeguard people from data breaches and re-identification. These regulations protect individuals, help reduce cybercrime, and hold companies that use training data for machine learning to higher standards.
Data protection laws should be more than an obstacle businesses must overcome to avoid fines. By following the guidelines outlined in privacy regulations, companies not only ensure compliance but also maintain a more streamlined data management framework, because data privacy acts are based on best practices for data processing and security.
Data privacy regulations vary across continents, countries, and even states. Still, all typically address the same issues: what data needs protection, how to protect it, and what happens if you don’t. Below we look at some of the most significant data privacy acts worldwide and explore how they relate to AI development.
The GDPR is probably the best-known data privacy act globally. The regulation came into effect in 2018. Since then, many of the subsequent privacy regulations in other parts of the world were, in fact, heavily influenced by the GDPR.
According to the regulation, companies are obligated to secure user consent to use their data. This also applies to historical data, which caused a headache for many companies hoping to use their internally collected data in AI development.
Who it applies to: this regulation protects the data privacy of EU residents and applies to any organization that processes that data, regardless of where the organization is located.
How it connects to AI: the GDPR sets out principles that must be adhered to when handling personal data, including purpose limitation, data minimization, transparency, fairness, and accountability, along with specific rules on automated decision-making and profiling.
While most of these requirements apply to AI development, the rules on automated decision-making and profiling are the ones you need to be especially aware of. Under these rules, companies are required to provide information about the logic, significance, and consequences of automated decisions to the people whose data is being collected.
It is also prohibited to subject someone to a decision based solely on automated processing where the decision has legal or similarly significant consequences. A decision is not considered solely automated if a human evaluates the result of the automated processing before applying it to the affected person.
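One way to operationalize this requirement is a human-in-the-loop gate that refuses to apply a significant automated decision until a reviewer has signed off. The sketch below is a minimal illustration under assumed names (`Decision`, `finalize`, `"loan_denied"`), not legal or implementation guidance:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch only: all names are assumptions, not a prescribed
# GDPR implementation. The idea is that a decision with legal or
# similarly significant effects is never applied on automated
# processing alone; it waits for a human reviewer.

@dataclass
class Decision:
    subject_id: str
    outcome: str               # e.g. a hypothetical "loan_denied"
    significant_effect: bool   # legal or similarly significant consequences?

def finalize(decision: Decision, human_approved: Optional[bool] = None) -> str:
    """Return the status of a decision under a human-in-the-loop policy."""
    if not decision.significant_effect:
        return "applied_automatically"
    if human_approved is None:
        # Solely automated decisions with significant effects are held back.
        return "pending_human_review"
    return "applied_after_review" if human_approved else "overturned_by_reviewer"
```

The key design choice is that the default path for significant decisions is "pending", so an automated pipeline cannot accidentally apply one without an explicit human sign-off.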
Failing to comply: organizations can be fined up to 4% of their global annual revenue or €20 million, whichever is higher.
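The "whichever is higher" rule can be expressed in a couple of lines, using the figures from the text (the function name is illustrative):

```python
# Upper-tier GDPR fine: the greater of 4% of global annual revenue
# or EUR 20 million, per the figures stated above.

def gdpr_max_fine(global_annual_revenue_eur: float) -> float:
    """Maximum upper-tier administrative fine for a given revenue."""
    return max(0.04 * global_annual_revenue_eur, 20_000_000.0)

print(gdpr_max_fine(1_000_000_000))  # 40000000.0: 4% of EUR 1bn exceeds EUR 20m
print(gdpr_max_fine(100_000_000))    # 20000000.0: the EUR 20m floor applies
```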
Read more: How to develop GDPR-compliant AI
The CCPA came into effect in 2020. Just like the GDPR, the Californian data privacy act was created to prohibit the exposure of personal data without users’ knowledge and to protect consumer rights regarding their data. The regulation differs from the GDPR in that you don’t need to obtain prior consent simply to collect and process personal data.
Who it applies to: organizations that collect and use personal data about residents of California. While it is “only” a state law, California is the most populous state in the United States, so, in reality, the CCPA applies to any U.S. or international company that does not want to exclude the data of a large part of the American population. Plus, there is the fact that if California were a sovereign country, it would rank as the world’s fifth-largest economy.
How it connects to AI: the CCPA doesn’t specifically address AI or its use. Yet, the regulation defines user rights, such as the right to know, the right to deletion, the right to opt-out, and the right to non-discrimination. These rights can be interpreted as directly applicable to data management in AI. Companies need to clearly and transparently disclose AI usage in their purposes for collecting and processing data and remove user data when requested.
Failing to comply: a fine of up to $7,500 for each intentional violation and $2,500 for each unintentional violation. The penalties are defined per affected user, so, for example, if your violation is deemed intentional and affects 100 users, you could end up paying $750,000. An affected user can also bring a personal claim for up to $750 per violation.
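The penalty arithmetic above is easy to sanity-check with the figures from the text (the function is purely illustrative, not legal guidance):

```python
# CCPA statutory penalties per violation, applied per affected consumer,
# using the figures stated in the text above.

INTENTIONAL_PENALTY = 7_500
UNINTENTIONAL_PENALTY = 2_500

def ccpa_penalty(affected_users, intentional):
    """Maximum civil penalty: the per-violation amount times affected consumers."""
    rate = INTENTIONAL_PENALTY if intentional else UNINTENTIONAL_PENALTY
    return rate * affected_users

print(ccpa_penalty(100, intentional=True))   # 750000, matching the example above
```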
The UK GDPR is the United Kingdom’s data privacy law. It is essentially the same as the GDPR, with changes made to accommodate domestic UK law areas such as national security, intelligence services, and immigration. The regulation was drafted as a result of the UK leaving the EU, relates to the UK’s Data Protection Act 2018, and came into effect in 2021.
Who it applies to: the data privacy law governs the processing of personal data from individuals inside the UK. Any entity inside or outside the UK has to be compliant if it collects or uses data from individuals in the UK.
How it connects to AI: the UK Information Commissioner’s Office, which oversees the application of the UK GDPR, has released its guidance on AI. Similar to the European GDPR, the document presents a compliance framework by introducing a few principles.
The principles are grouped into four parts: accountability and governance implications; lawfulness, fairness, and transparency; security and data minimization; and ensuring data subject rights. Based on these principles, companies are advised to conduct data protection impact assessments (DPIAs), maintain transparency and clarity when documenting their data processes, implement effective risk management practices, and set up systems to respond to and comply with data subject rights requests.
Failing to comply: a maximum fine of £17.5 million or 4% of annual global turnover, whichever is greater.
The PIPL came into effect in 2021. It regulates the collection, use, and disclosure of personal data and is partly based on the European GDPR. The regulation mandates companies to get consent from individuals before collecting their data. According to the PIPL, individuals have the right to know what data is being collected about them and how it will be used.
Who it applies to: all organizations that process any personal data originating in China.
How it connects to AI: like other privacy acts, the PIPL addresses automated decision-making. It regulates the use of algorithms and other automated systems that could discriminate against certain individuals. Consent is required in most circumstances. Among other guidelines, the PIPL requires companies to conduct audits to assess data security risks and to implement safeguards.
Failing to comply: penalties of up to RMB 50 million or 5% of a company’s annual revenue, plus seizure of all illegal gains.
Aside from those discussed, data privacy laws similar to the GDPR have been introduced elsewhere in the world: Canada’s PIPEDA, Brazil's LGPD, Australia's Privacy Act and Consumer Data Right (CDR), and South Africa’s POPIA, among others.
Since the GDPR is the basis for many other data privacy regulations, the natural question is whether it is enough to follow the principles of the GDPR alone. However, while most data privacy acts that have come into effect in recent years share many similarities, you should continually assess all regulations applicable to your specific case.
New data protection regulations are introduced every year. As you look into your options to source training data for your AI project, keeping track of everything you need to comply with might be challenging. We recommend beginning with the following:
Identify the type of data you store or need. This will determine which data protection regulations you are required to comply with.
Have a detailed plan for achieving regulatory compliance and addressing any unexpected risks. We recommend a regulatory compliance checklist. As new data privacy acts emerge or current ones change, regularly reassess your compliance status. Perform regular data assessments to identify areas for improvement.
When in doubt, seek legal advice. Reach out to lawyers who specialize in regulatory compliance.
Compliance with data privacy regulations starts with quality datasets. And with the right data vendor, this part is less of a burden.
We specialize in sourcing diverse, high-quality datasets of various types and in multiple languages. The right datasets lead to the right results and make meeting legal requirements easier.
We are here to help you satisfy your machine learning data needs and share our tips regarding regulatory compliance. Simply reach out and tell us about your AI project.