SPAI proposal development support

Academic Year 2023-2024

Nicole Yadon.
Project Title: “Structured approach to monitoring online discourse about race”
Abstract: Online platforms play important roles in shaping the discourse on race in America. In this project we will develop a structured ‘sentinel’ approach that leverages the community structure of social media networks together with natural language processing techniques to monitor online discourse about race on multiple social media platforms. Given increasing rates and high-profile incidents of attacks against various groups in recent years – including Jewish, Asian, and Black Americans – we are primarily interested in examining how racist content spreads via online platforms. The project will produce outcomes of interest to both academics and lay audiences. These include a structured dataset allowing contextualization of content relating to race, with particular focus on anti-Semitic, anti-Asian, and anti-Black content. Analyses of these data will include (i) identifying what types of racist content find widespread engagement (as opposed to being confined to, e.g., extremist communities), (ii) studying cross-platform pathways for dissemination of racist content, (iii) documenting influence campaigns to drive wedges between different demographic subgroups, and (iv) identifying the spread of racist content earlier in information flows with an eye towards mitigating its effect and spread. This interdisciplinary project seeks to combine cutting-edge methods with substantive expertise to understand how racialized content spreads online and potentially diffuses from extremist communities into mainstream media and/or among political elites. This is particularly important for understanding sociopolitical dynamics around misinformation, inter-group relations, and racial stereotypes.

Jan Pierskalla.
Project Title: “Democratization and Authoritarian Legacies: The Case of Denazification”
Abstract: In this project, we study the politics of denazification in post-WWII Germany. Drawing on individual case files for denazification tribunals, we explore what factors determined the decision to remove (or retain) former Nazi party members from public office.
The project is supported through the Good-to-Great initiative, and a key element is the use of new and innovative approaches in machine learning. First, at the data collection step, we plan to make use of AI-powered optical character recognition technology to turn millions of digitized case files that combine tabular, typewritten, and handwritten information into usable data for analysis. Second, we plan to integrate text-as-data approaches in the form of structural topic models to study tribunal outcomes.

Skyler Cranmer.
Project Title: “Developing Data Infrastructure for Computer Vision Analysis of Political Speech”
Abstract: Political science is in the midst of a revolution in the availability and usability of unstructured data. Tremendous advances have already been achieved with the use of natural language processing (NLP). However, a large proportion of political information is presented in video format. In video, the viewer sees the presenter's expressions and hears the presenter's vocal intonations, and the information is often accompanied by other video clips (e.g., of events). One of the best-established findings in (political) psychology is that emotional reactions, retention, and persuasion are driven more by visuals than by the actual text of what is communicated. Thus, using computer vision (CV), a form of AI based on deep learning, to explore the vast landscape of political communication represents the next great frontier in the unstructured data revolution.
The proposals I will develop primarily build data infrastructure, along with tools and training, for the application of CV to an unprecedented corpus of political speech. This proposal aims to collect, process, and store in a database the universe of televised public speech by heads of state around the world. This will create a revolutionary and unprecedented data repository accessible to any scholar interested in using CV technology, or even just making use of the text transcripts. The proposed research fits directly into the goals of the Good-to-Great (GTG) proposal, as it is based on (a) the automatic gathering and databasing of very large volumes of political speech video and (b) the use of CV to analyze said data. CV is a form of AI that uses deep learning to understand images, video, and audio. Even within the realm of AI, CV stands out in terms of doing something “different, visible, and more effective,” as the GTG proposal highlights the need to do. Moreover, advancing computer vision research in political science supports OSU’s developing “brand” in this area.

Marcus Kurtz.
Project Title: “The South African Expropriation and Restitution Database”
Abstract: The project seeks to develop a comprehensive, searchable database of all properties in urban South Africa owned by non-whites at the time of the imposition of apartheid. There are three principal goals in doing so. In academic terms, this database will permit the direct evaluation of a fundamental assumption in development economics: that property rights insecurity reduces growth and development. The key tool here is the post-1994 legal enactment in South Africa giving victims of apartheid the legal right to reclaim properties expropriated from them by the apartheid government. The fact that some urban non-white properties were expropriated and others were not allows us to compare similar properties (in economic potential, size, location, etc.) with respect to a series of markers of economic utilization (property value, levels of investment, taxes paid, land utilization). This comparison directly addresses the alleged costs of property-rights insecurity (as all expropriated properties held by whites are by definition insecure under the legal regime in force since the 1990s). Secondarily, such a comprehensive, searchable, geo-located database (which will include ownership, property transactions, subdivisions, etc.) will be a fundamental tool in the hands of South African citizens, who do not as yet have an effective way to easily document expropriations of familial properties as much as 70 years in the past.
The project intersects with the ML/AI and data analytics components of the Political Science department’s GTG proposal in three main ways: (1) it will require the use of ML tools to properly trace ownership records forward from the apartheid era through property subdivisions, transfers, and renaming of geographic reference points, a task covering millions of individual property registries; (2) ML tools will also be used to assess the racial composition of property ownership using such strong markers as name, historical addresses, and, where possible, matching to national-ID databases (which to this day code using apartheid-era categorizations); and (3) it will directly engage OSU graduate students in developing and operating this database alongside post-docs at the University of Stellenbosch’s Economics Department. The proposal is interdisciplinary, international, and has both academic and societal goals.