The Analyst’s Dilemma: Artificial Precision or Content Specialization
This is the first installment of a series highlighting exceptional student contributions to the Global Research Institute. Stay tuned for other features throughout April: Undergraduate Research Month. NukeLab is an undergraduate research lab at GRI that applies cutting-edge social science theory and methods to pressing policy questions in nuclear security, proliferation, and deterrence.
By Lucas Arnett ’22, NukeLab Research Assistant
When NukeLab first launched, there were three projects working side by side: two based on data science research and one based on historiography. My background is in history and political science, so I started on the historiography team and figured I would learn from my data-driven colleagues about how the two approaches complement each other in a threat assessment environment. I did eventually get to try both, but the lessons I learned were not the ones I had anticipated.
Writing a historical case study is similar to writing a background memo, so at first the process felt familiar: find relevant sources, re-create the timeline, identify continuities and changes over time, and summarize proliferation incentives for each period. In a lab, however, there are multiple people writing the same case study based on the same set of sources, so the footnotes and writing have to be in a format others can understand, and each research assistant needs to know the material well enough to reconcile differences of interpretation between analysts. We soon found that presenting our findings verbally was inefficient, especially over Zoom, so we decided that each of us would write up our own interpretation as a paper and then compare, engaging in a sort of “peer review” process.
Our team soon realized that we all identified similar proliferation incentives, but that we arrived at different conclusions about which reasons were the most salient and tended to emphasize different evidence. Though these debates were frustrating at first, I loved that they reflected the ongoing debates between historians and political scientists we had studied in class, and that now I had questions of my own to contribute. Is it dishonest to code a country as “wanting” a program for a specific reason when the primary actors in that country are divided? Should we be trying to incorporate confidence levels into our coding judgments? If so, how? I saw for myself how difficult it is to reach consensus even on a seemingly narrow historical point, a necessary part of coding for a dataset, and came to recognize the artificial certainty we must impose on our data to produce results. Still, there are far more cases of attempted nuclear proliferation than any analyst could feasibly study in depth, so we cannot simply throw up our hands.
Now extra eager to try a quantitative approach, I reviewed the quantitative proliferation literature for myself and discovered an unanswered puzzle. Political scientists in the proliferation literature typically treat variables ‘monadically,’ that is, as qualities of a given country, because the goal is to predict a specific country’s behavior. I found that monadic setups can only track the predictive value of whether a country is in a rivalry, not with whom a country is in a rivalry, because tracking specific partners would require thousands of these attributes (one for each country pair). I thought it made more sense to introduce a dyadic setup that measures whether specific country pairs become more likely to proliferate due to specific elements of their relationship, such as power differential or proximity. NukeLab Director Dr. Jeff Kaplow pointed me to a few studies that use a dyadic setup in the conflict literature, and I set about studying it for myself.
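To make the monadic/dyadic distinction concrete, here is a minimal sketch in Python with pandas. It is not NukeLab’s actual dataset or pipeline; the country names, capability scores, and column names are invented purely for illustration.

```python
import itertools

import pandas as pd

# Monadic setup: each row is a single country, and rivalry is a yes/no
# attribute of that country alone. (Countries and values are invented.)
monadic = pd.DataFrame({
    "country": ["A", "B", "C"],
    "in_rivalry": [1, 1, 0],          # does this country have *any* rival?
    "pursued_program": [1, 0, 0],
})

# Dyadic setup: each row is a country pair, so a model can ask which
# features of the relationship itself (power differential, proximity) matter.
countries = {
    "A": {"capability": 10, "region": "west"},
    "B": {"capability": 3, "region": "west"},
    "C": {"capability": 7, "region": "east"},
}

rows = []
for a, b in itertools.combinations(countries, 2):
    rows.append({
        "pair": f"{a}-{b}",
        "power_differential": abs(
            countries[a]["capability"] - countries[b]["capability"]
        ),
        "same_region": int(countries[a]["region"] == countries[b]["region"]),
    })
dyadic = pd.DataFrame(rows)

print(monadic)
print(dyadic)
```

The monadic table can only say that country A has a rival somewhere; the dyadic table keeps track of which pairings exist and what each relationship looks like.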
Over the next few semesters, some colleagues and I went through the full research process: writing a literature review, manipulating data, developing a theory, building a regression, interpreting results, and drafting a paper. We enjoyed the rush of running code, the morbid anxiety of receiving an error message, and the satisfying relief of mixed positive results, but we were surprised by how much harder it was to draw lessons from our models than from the case studies. Yes, our results show that pairs with large power differentials are more likely than other pairs to experience proliferation, but what do those conclusions mean for the monadic setup, and how can we explain those results theoretically? The results were more abstract than the case studies because we had considered every country pair at once, and it was much more difficult to point to one piece of evidence or another to justify our interpretations.
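For readers curious what “building a regression” on dyadic data can look like in practice, here is a simplified sketch, assuming a logistic regression fit with Python’s statsmodels. The variable names and the simulated data are placeholders; this is not our actual model or our results.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic dyadic data: one row per country pair. The values below are
# placeholders, not the real dataset or the real estimates.
rng = np.random.default_rng(0)
n_pairs = 500
df = pd.DataFrame({
    "power_differential": rng.uniform(0, 10, n_pairs),
    "contiguous": rng.integers(0, 2, n_pairs),
})

# Simulate an outcome so the example runs end to end: whether proliferation
# occurred within the pair.
log_odds = -3 + 0.3 * df["power_differential"] + 0.5 * df["contiguous"]
df["proliferation"] = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

# A simple dyadic logistic regression: does the probability of proliferation
# rise with the pair's power differential, holding contiguity constant?
model = smf.logit("proliferation ~ power_differential + contiguous", data=df)
result = model.fit()
print(result.summary())
```

In a real analysis the outcome would come from a proliferation dataset rather than a simulation, and the standard errors would need to account for the same country appearing in many pairs.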
Personally, I concluded that data science is great for helping analysts decide which subject areas to focus their attention on, especially given the large number of cases, but it will be a long time before machines replace analysts; quantitative findings are just not specific enough. Time spent building nuclear fuel cycle expertise, familiarizing ourselves with assessment processes, practicing open-source analysis, and learning country-specific histories for a case study is not time wasted in an analytic career. At the very least, analysts need enough content knowledge to apply the models’ findings to the policy environment, to identify when the models need correction, and to appropriately incorporate their findings into the assessed confidence level, all of which can only be gained from reading history. Still, the two are not separate; silo thinking is not making anyone safer, and any opportunity students have to learn new skills in either field is an opportunity well spent.