OpenAI's Gabriel Uses GPT as Social Measurement Tool

February 16, 2026 03:15 AM IST | Written by Rutvik Sappadla | Edited by Vaibhav Jha

Social science subjects such as economics, anthropology, sociology and political science, once studied with a normative approach, are now increasingly researched for their quantitative aspects. While it’s difficult to measure social science concepts on a scale, OpenAI’s GABRIEL toolkit might act as a bridge between the two worlds.

The Economic Research Team at OpenAI has released an open-source toolkit, GABRIEL (the Generalized Attribute-Based Ratings Information Extraction Library), that uses GPT to turn unstructured text and images into quantitative measurements. The qualitative material these subjects deal with is rich but unstructured, which makes it unsuitable for testing hypotheses statistically. Purely quantitative data, though rigorous, does not capture the nuances.

What is the GABRIEL toolkit by OpenAI?

GABRIEL is a Python library that leverages the comprehension capabilities of LLMs. Their ability to understand qualitative data in depth is used to label that data and to measure quantitative attributes of it. It is important to note that GABRIEL is not a new machine learning algorithm but rather a prompt wrapper around an LLM (ChatGPT). Any LLM can be used to label and quantitatively measure attributes, but GABRIEL makes ChatGPT’s intelligence usable at scale by packaging the code into a straightforward toolkit. Scalability, validity, consistency and accessibility are what distinguish GABRIEL.

In the research paper that introduces GABRIEL and presents GPT as a ‘measurement tool’, the researchers found that GPT was “accurate across domains and generally indistinguishable from human evaluators”. They demonstrate how GABRIEL works through examples, one of which uses it to evaluate “How toxic is the conversation online?”. GABRIEL assesses every statement in the corpus and rates each one on a scale of 0-100, where 0 denotes the absence of toxicity and 100 denotes its full expression.
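To make the idea of a "prompt wrapper" that rates text on a 0-100 scale more concrete, here is a minimal Python sketch. It is not GABRIEL's actual API: the function names (`build_prompt`, `parse_ratings`, `rate_corpus`) and the prompt wording are assumptions made for illustration, and `fake_model` is a hypothetical stand-in for a real LLM call so that the example runs offline.

```python
import json
from typing import Callable, Dict, List


def build_prompt(text: str, attributes: List[str]) -> str:
    """Ask for a 0-100 rating per attribute, returned as a JSON object."""
    attr_list = ", ".join(attributes)
    return (
        "Rate the passage below on each attribute from 0 (absent) "
        "to 100 (fully expressed). Reply with a JSON object only.\n"
        f"Attributes: {attr_list}\n"
        f"Text: {text}"
    )


def parse_ratings(reply: str, attributes: List[str]) -> Dict[str, float]:
    """Parse the model's JSON reply and clamp each score to [0, 100]."""
    raw = json.loads(reply)
    return {a: max(0.0, min(100.0, float(raw[a]))) for a in attributes}


def rate_corpus(
    texts: List[str],
    attributes: List[str],
    ask_model: Callable[[str], str],
) -> List[Dict[str, float]]:
    """Rate every text in the corpus with the same prompt template."""
    return [
        parse_ratings(ask_model(build_prompt(t, attributes)), attributes)
        for t in texts
    ]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: a crude keyword rule, not a real model."""
    text = prompt.split("Text: ", 1)[1].lower()
    score = 90 if "idiot" in text else 5
    return json.dumps({"toxicity": score})


corpus = ["Thanks, that was helpful.", "You are an idiot."]
ratings = rate_corpus(corpus, ["toxicity"], fake_model)
print(ratings)
```

In a real pipeline, `fake_model` would be replaced by a call to an actual LLM; the point of the sketch is only that the same template and parsing step are applied uniformly to every item, which is what makes the ratings comparable across a corpus.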

In later sections, the paper also provides evidence for the stability of GPT’s measurements by validating them against hundreds of human-labeled datasets. The researchers found the tool to be consistently good enough for credible measurement and very accessible, even to non-programmers. It is also highly scalable: corpora with tens of thousands of items are finished in minutes.
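One common way to check model ratings against human labels is a correlation coefficient. The sketch below computes a Pearson correlation in plain Python; the article does not say which agreement metric the paper uses, so this choice is an assumption, and the scores are made-up illustrative values, not results from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Illustrative values only (not data from the paper).
model_scores = [5, 20, 60, 90]
human_scores = [0, 25, 55, 95]
r = pearson(model_scores, human_scores)
print(f"agreement (Pearson r): {r:.3f}")
```

A value close to 1 means the model ranks and spaces items much as the human raters do; it does not by itself show that the two sets of scores are on the same absolute scale.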

GABRIEL optimizes GPT for a very specific purpose and turns it into a powerful research instrument, enabling students and researchers in the social sciences to analyze data at an unprecedented scale and possibly answer questions never attempted before.

Authors

Rutvik Sappadla

Rutvik is a freelance technology writer with a background in computer science. He graduated in 2022, after which he spent time working in the IT industry—an experience that informs his approach to writing on technology. He has a keen interest in AI and emerging technologies, particularly how they translate into real-world use and their broader social impact. Through his work, he aims to break down complex ideas, making technology more accessible to a general audience.

Vaibhav Jha

Vaibhav Jha is an Editor and Co-founder of AI FrontPage. In his decade-long career in journalism, Vaibhav has reported for publications including The Indian Express, Hindustan Times, and The New York Times, covering the intersection of technology, policy, and society. Outside work, he’s usually trying to persuade people to watch Anurag Kashyap films.