This is the third installment of a series highlighting exceptional student contributions to the Global Research Institute. Stay tuned for other features throughout April: Undergraduate Research Month.
By Ava Barnes ’23, GRI Programs Team Assistant
After Tom Plant ’22 and Meg Hogan ’21 developed an idea for a new research lab focused on disinformation, the pair sought innovation funding from the Global Research Institute, which aims to bring students’ bright ideas to fruition.
Since the lab’s inception in 2020, DisinfoLab’s researchers have exposed false and dangerous information online while working to improve nationwide media literacy rates. In partnership with The Diplomatic Courier, a global affairs media network, DisinfoLab recently published a highly detailed report about how identity bias operates in GPT-3 and Google search, along with several articles explaining key findings.
“Biased sources lead to biased searches,” Plant said. “We wanted to know, now that Microsoft and OpenAI are marketing this to companies, who is going to use GPT-3, and what are they going to use it for?”
GPT-3, a new language generation algorithm, produces text from a prompt and was “trained on book databases and Reddit posts,” Plant said. Although these sources may seem standard, platforms such as Reddit, where anyone can post anything, can undermine the validity of the information GPT-3 draws on. Unvetted sources can lead to misleading, false or glaringly biased results. Understanding the consequences of this phenomenon was the main objective of the new report, titled “Evaluating Identity Bias in GPT-3 and Google Search Autocompletion.”
Plant and his team began by web scraping, the automated extraction and collection of information from the Internet, to gather output from GPT-3 and Google. They ran 3,200 individually coded prompts through both systems and then compared the results, with Google serving as the control.
Each question was designed to address one of the following topics: religion, race, ethnicity, nationality, gender, sexuality, and sexual orientation. Across the four identity categories, GPT-3 produced query generations that were negative with respect to the subject group 43.83% of the time — with the most bias existing in phrases about sexuality and the least in phrases about religion. Findings such as this, Plant said, illustrate that technical complexity does not necessarily safeguard against bias.
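A figure like the 43.83% above comes down to a tally: label each generated completion, then divide the negative ones by the total, both overall and within each identity category. The snippet below is a minimal sketch of that arithmetic, assuming the completions have already been labeled. The rows, the label names and the `negative_share` function are hypothetical illustrations, not data or code from DisinfoLab’s report.

```python
from collections import Counter

# Hypothetical labeled completions: (identity category, sentiment label).
# These rows are invented for illustration; they are not data from the report.
labeled = [
    ("religion", "neutral"),
    ("religion", "negative"),
    ("gender", "negative"),
    ("gender", "positive"),
    ("sexuality", "negative"),
    ("sexuality", "negative"),
]


def negative_share(rows):
    """Return (percent negative per category, percent negative overall)."""
    totals = Counter(category for category, _ in rows)
    negatives = Counter(
        category for category, label in rows if label == "negative"
    )
    # Share of negative completions within each category.
    per_category = {
        category: 100 * negatives[category] / totals[category]
        for category in totals
    }
    # Share of negative completions across all rows (0.0 if there are none).
    overall = 100 * sum(negatives.values()) / len(rows) if rows else 0.0
    return per_category, overall


per_category, overall = negative_share(labeled)
for category, share in sorted(per_category.items()):
    print(f"{category}: {share:.2f}% negative")
print(f"overall: {overall:.2f}% negative")
```

Comparing the per-category shares is what lets a study say which category shows the most bias and which the least.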
“When you’re programming these models, it matters who the developers are, and it needs to be diversified in order to avoid these horrendous search results and these biases,” Plant said. “Algorithms — whether they’re on social media or in search engines — are reflections of the coders’ biases. The developers who made GPT-3 — their biases are in the code, and that’s very obvious.”
DisinfoLab’s Co-Director Aaraj Vij ’23 shares Plant’s concern about GPT-3’s influence across disciplines.
“GPT-3 has a lot of potential implications for the tech world, and it would be very important to test its propensity to spread misinformation before it starts getting adopted beyond the academic or development community,” Vij said.
Since receiving student innovation funding, DisinfoLab has grown into a campus and intellectual leader in scholarship about disinformation. Vij said other students who want to spearhead new and transformational projects should consider pitching an idea to the Global Research Institute.
“If you’re confident that you want to do something that hasn’t been done yet, then that’s where Student Innovation Funding comes in,” he said. “It’s for the new ideas, the crossovers between different areas.”
Though the lab launched in 2020, its newer collaboration with The Diplomatic Courier has transformed the reach of its findings, Plant said. Since Plant completed an e-internship with the magazine, its editors have offered continued mentorship throughout the publishing process.
“The fact that we get to conduct research and then have a guaranteed audience for it is a privilege,” he said.
Read the full report.
Read the lab’s two newest features in The Diplomatic Courier here and here.