PERSPECTIVE: Is Algorithmic Transparency the Next Regulatory Frontier in Data Privacy?
by William J. Roberts, Catherine F. Intravia and Benjamin FrazziniKendrick

The U.S. House of Representatives Energy and Commerce Committee’s Subcommittee on Digital Commerce and Consumer Protection held a hearing last month on the use of computer algorithms and their impact on consumers.[1] This was the latest in a series of recent efforts by a variety of organizations to explore and understand the ways in which computer algorithms drive businesses’ and public agencies’ decision-making and shape the digital content we see online.[2]
In its simplest form, an algorithm is a set of step-by-step instructions for performing a computation or solving a problem. The witness testimony and questions from the members of the Subcommittee highlighted a number of issues that businesses and government regulators are facing.
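By way of illustration only, the short sketch below (in Python) shows what an algorithm can look like in practice: a fixed series of steps that turns a few inputs into a decision. The criteria and numbers are entirely hypothetical and are not drawn from any real lender’s model.

```python
# A purely hypothetical illustration of an "algorithm": a fixed series of
# steps that turns inputs into a decision. None of the criteria below are
# drawn from any real lender's model.

def quote_interest_rate(credit_score: int, annual_income: float) -> float:
    """Return an annual interest rate (as a percent) from two inputs."""
    rate = 20.0                      # Step 1: start from a base rate
    if credit_score >= 700:
        rate -= 5.0                  # Step 2: reward a strong credit score
    if annual_income >= 75_000:
        rate -= 2.5                  # Step 3: reward higher income
    return max(rate, 5.0)            # Step 4: never go below a floor

print(quote_interest_rate(720, 80_000))  # 12.5
print(quote_interest_rate(640, 40_000))  # 20.0
```

Even in a toy example like this, the choice of inputs and thresholds is itself a policy decision, which is where the questions of bias discussed below arise.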
Bias and Discrimination
A variety of businesses use algorithms to make decisions, from social media platforms determining what content to show users to credit card companies deciding what interest rates to charge consumers. However, these algorithms may treat otherwise similarly situated consumers differently based upon irrelevant or inappropriate criteria.[3] Examples of bias in these algorithms abound.
For example, research shows that credit card algorithms have driven up interest rates for individuals who have entered marriage counseling, and advertising algorithms have shown engineering job advertisements to men more frequently than to women.
Exploitation of Consumer Data – Hidden Databases and Machine Learning
One way in which businesses and other entities can exploit consumer information is by creating databases of consumers who exhibit certain online behaviors. For example, they can identify users who search for terms such as “sick” or “crying” as possibly being depressed and drive medication ads to them. Companies have been able to develop databases of impulse buyers or people susceptible to “vulnerability-based marketing” based on their online behavior.[5]
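The following is a rough, purely hypothetical sketch of how such a behavioral database might be assembled, assuming only that a company holds a log of its users’ search queries; the users, queries, and flagged terms are invented for illustration and do not describe any real company’s practice.

```python
# Hypothetical sketch: flagging users for a "vulnerability" segment based on
# the search terms they have entered. The log and the term list are invented
# for illustration only.

FLAG_TERMS = {"sick", "crying", "can't sleep"}

search_log = [
    {"user_id": "u1", "query": "crying all the time"},
    {"user_id": "u2", "query": "best hiking boots"},
    {"user_id": "u3", "query": "why am I always sick"},
]

def build_segment(log, terms):
    """Return the set of user IDs whose queries contain any flagged term."""
    segment = set()
    for entry in log:
        if any(term in entry["query"].lower() for term in terms):
            segment.add(entry["user_id"])
    return segment

print(build_segment(search_log, FLAG_TERMS))  # {'u1', 'u3'}
```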
Further, the past few years have seen enormous growth in the use of “machine learning” algorithms.[6] The cutting edge of machine learning is the use of artificial neural networks, which are powering emerging technologies like self-driving cars and translation software. These algorithms, once set up, can function automatically. To work properly, however, they depend on the input of massive amounts of data, typically mined from consumers, to “train” the algorithms.[7]
These algorithms allow companies to “draw predictions and inferences about our personal lives” from consumer data far beyond the face value of such data.[8] For example, a machine learning algorithm successfully identified the romantic partners of 55% of a group of social media users.[9] Others have successfully identified consumers’ political beliefs using data on their social media, search history, and online shopping activity.[10] In other words, online users supply the data that allows machine learning algorithms to function, and businesses can use those same algorithms to gain disturbingly accurate insights into individuals’ private lives and drive content to users “to generate (or incite) certain emotional responses.”[11] Additionally, companies like Amazon use machine learning algorithms “to push customers to higher-priced products that come from preferred partners.”[12]
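As a purely illustrative sketch of this “train, then infer” pattern, the example below fits a simple logistic regression model (using the scikit-learn library) on fabricated browsing features and then predicts a label for a new user. Every feature, label, and data point here is an assumption made for illustration, not a description of any company’s system.

```python
# Minimal sketch of the "train, then infer" pattern: fit a model on labeled
# consumer data, then predict a sensitive attribute for a new user. All
# features and labels are fabricated for illustration.

from sklearn.linear_model import LogisticRegression

# Each row: [news-site visits per week, shopping-site visits per week]
X_train = [[10, 1], [8, 2], [1, 9], [2, 12], [9, 3], [0, 10]]
y_train = [1, 1, 0, 0, 1, 0]   # invented binary label, e.g. an inferred trait

model = LogisticRegression().fit(X_train, y_train)

new_user = [[7, 2]]            # behavior of a user the model has never seen
print(model.predict(new_user))        # the inferred label
print(model.predict_proba(new_user))  # the model's confidence in that inference
```

The point is not the particular model but the pattern: the consumer supplies the training data, and the resulting model is then turned back on consumers to infer things they never disclosed.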
Concerns in Education
In the education context, the use of algorithms to drive decision-making about students raises concerns.[13] How algorithms will affect and drive student learning remains an open question. For example, will algorithms used to identify struggling pre-med students be used to develop interventions that assist those students, or as a tool to divert them into other programs so that the institution can boost the share of its applicants who are accepted to medical school?
Additionally, how will a teacher’s perception of a student’s ability to succeed be affected by algorithms that label students as “at-risk” before they even set foot in class?[14] Bias in algorithms could also affect students’ ability to access a wide variety of learning material. University librarians, for example, have noted that the search algorithms they use to assist students with research suffer from inherent bias: searches on topics such as the LGBTQ community and Islam have returned results about mental illness.[15]
Transparency is also at issue. Should students and families be aware that educational institutions are basing decisions about students’ education and academic futures on algorithmic predictions? And, if students have a right to know about the use of algorithms, should they also be privy to how the specific institution’s algorithmic models work?
Finally, concern has grown over the extent to which algorithms owned and operated by for-profit entities may drive educational decisions better left to actual teachers.[16] Teachers, presumably, make decisions based on students’ best interests, whereas algorithms owned by corporations may make decisions designed to enhance corporate profits.
Future Issues for Consideration
Regulation in this area may be forthcoming. The European Union’s General Data Protection Regulation (GDPR), for example, already gives EU residents the ability to challenge decisions made by algorithms, such as an institution’s decision to deny a credit application.[17] New York City is considering a measure that would require public agencies to publish the algorithms they use to allocate public resources, such as determining how many police officers should be stationed in each of the City’s precincts.[18]
In the meantime, educational institutions in particular should carefully consider issues such as:
- Are companies using software to collect student data and build databases of their information?
- Which educational software or mobile applications in use by an institution are using machine learning algorithms to decide which content to show students?
- Should institutions obtain assurances from software vendors that their applications will not discriminate against students based on students’ inclusion in a protected class, such as race or gender?
- How will the educational institution address a bias or discrimination claim based on the use of a piece of educational software or mobile application?
- Is technology usurping or improperly influencing decision-making functions better left to teachers or other staff?
While no U.S. regulatory framework currently exists, educational institutions may find they are best able to proactively address algorithmic transparency when negotiating contracts for the use of educational technology.
In negotiating contracts with educational technology vendors, for example, educational institutions may want to determine what algorithms the technology uses and whether the student data the vendor gathers will be used to train other machine learning models. Further, educational institutions may want to consider issues of bias in the algorithms and negotiate protections against future discrimination lawsuits if the algorithms consistently treat similarly situated students differently.
Ultimately, educational institutions will need to evaluate each piece of educational technology to understand how its built-in algorithms are influencing the data it collects and the information it presents to users.
_________________________________
William Roberts is a partner in Shipman & Goodwin LLP’s Health Law Practice Group and is the Chair of the firm’s Privacy and Data Protection team. Catherine Intravia focuses her practice at the firm on intellectual property, technology and information governance matters. Benjamin FrazziniKendrick is an associate in the firm’s School Law Practice Group, providing legal advice to public schools and other institutions in civil litigation, special education, and civil rights compliance.