Generating Predictions For Grants Pass v. Johnson Using Artificial Intelligence

Optimized Legal Audio is an artificial intelligence engine in its infancy that tries to hear what judges say, read the language they use, and through this to infer their relative preference for one attorney’s argument over another’s.


The ancient Greek mathematician Archimedes once wrote, “Give me a lever that is long enough and a fulcrum to place it on, and I will move the world.” Is there a metaphor for the Court within this quote, and if so, do certain justices control a lever? In fact, we may find that the center of control of the Court shifts this term from where it sat in the three previous terms since Justice Barrett joined the Court. While there is no way to be certain of such shifts in advance of the opinions, oral arguments provide useful hints. Along with a collaborator, I developed a novel prediction engine, currently known as Optimized Legal Audio (OLA), by synthesizing and then implementing specific improvements to a variety of existing text- and audio-based features.

OLA then generates a vote prediction for each justice (or judge, as it is not designed solely for Supreme Court oral arguments).

The idea that a computer could complete sentences, write poetry, take the bar exam, or create art was unimaginable until the advent of GPTs. The idea that a judge’s words, captured in the transcript and the oral argument audio, can yield a reliable estimate of outcomes now appears not only possible but achievable with a high degree of accuracy.

Argument of Interest

While this Supreme Court term did not begin with the same myriad of high-profile cases as the 2021 term, which brought Dobbs and Bruen, it will in all likelihood be no less momentous. With cases ranging from speech rights to gun rights, the future of administrative deference, and executive immunity, several of this term’s disputes will more likely than not be taught in constitutional law courses for years to come. Just last week, the Court heard arguments in three cases with potentially immense repercussions.

In the first, City of Grants Pass, Oregon v. Johnson, the Court considered whether laws limiting camping on public property impose a form of “cruel and unusual punishment.” In the second, Moyle v. U.S., the Court examined the enforceability of Idaho’s Defense of Life Act, which prohibits abortions except when necessary to save the mother’s life, in light of the federal Emergency Medical Treatment and Labor Act. Finally, the justices heard arguments in Trump v. U.S., examining the doctrine of presidential immunity from criminal prosecution for official acts taken while in office.

These cases are not likely to be resolved much before the Court ends its term in the last days of June. Between now and then, legal pundits, academics, and lawyers (among others) will speculate about the potential outcomes in these cases with far-reaching ramifications.

While there is no tried-and-true method for predicting case outcomes from oral arguments, techniques are improving by the day. Though not directly related to oral arguments, a 2004 paper by several leading political scientists compared the predictions of legal experts with those of a statistical model, finding that the model predicted 75% of outcomes correctly compared with 59.1% for the experts. Things have improved since then, although the marginal gains are slight. Additional insights have been derived from oral arguments, ranging from how frequently the justices speak (Epstein, Landes and Posner (2009)) to the pitch of their voices (Sen and Dietrich (2018)).

In the words of baseball great Yogi Berra, “[i]t’s tough to make predictions, especially about the future.” The best time to predict an outcome is when the maximum amount of information is available on which to base the prediction. After oral arguments conclude, all of the information at the public’s disposal is at hand. Still, the 70-75% threshold for predictive accuracy is a high bar to reach and exceed. This article applies some readily available methods to generate predictions for Grants Pass, and concludes by comparing the inferences from those past methods with the novel OLA method mentioned above.

What do the arguments tell us?

Oral arguments occur at a specific point in each Supreme Court case. Since they are held after the justices receive the case briefs, the justices have time before oral arguments to form a strong sense of how they may vote. There is no clear consensus about the extent to which oral arguments affect Supreme Court decision-making, with some papers presenting evidence that they may play a large role and others showing that the justices may have made up their minds before arguments begin.

The justices’ votes are also often at least somewhat predictable notwithstanding oral arguments, especially in how they rule on recurrent issues over time. Still, some justices’ measurable preferences shift longitudinally more than others’.

We are still left with the question of what oral arguments can tell us about how the justices may decide Grants Pass. While there are limitless ways to measure the justices’ oral argument behavior, this article incorporates four measurable dimensions: the quantity of speech, when the justices choose to speak, the sentiment of their speech, and the complexity of the language they use. OLA, our novel method, is then introduced at the end.

QDAP Word Counts

The QDAP library in R is helpful in breaking transcripts down into these dimensions and others.  A heat map of some of the word statistics from the Grants Pass arguments is below.

Petitioner’s Argument

Respondent’s Argument

The terms used in the heatmaps are defined here.

Since the shading reflects relative amounts of speech, the absolute measures (number of words, sentences, etc.) show more speech from the attorneys, Theane Evangelis and Kelsi Corkran, than from the justices. Focusing on the justices, Sotomayor, Kagan, Jackson, Kavanaugh, and Barrett all spoke more during the petitioner’s attorney’s argument, which supports the inference that they will vote to affirm the 9th Circuit’s decision below (holding that enforcing the camping ordinances against homeless individuals amounts to cruel and unusual punishment). The four potential votes in the opposite direction (dissent) based on this measure alone come from Roberts, Gorsuch, Alito, and Thomas. In a large-scale predictive model, more than just word counts would be applied to generate an understanding of how the justices may vote.
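To make the word-count dimension concrete, here is a minimal Python sketch of the kind of per-speaker statistics QDAP computes from a transcript. QDAP itself is an R package, so this is only an illustrative analogue, and the two-turn snippet below is hypothetical rather than quoted from the actual Grants Pass transcript.

```python
import re
from collections import defaultdict

def word_stats(turns):
    """Aggregate simple per-speaker word statistics from a transcript.

    `turns` is a list of (speaker, utterance) pairs in argument order.
    Returns {speaker: {"words": n, "sentences": n, "turns": n}}.
    """
    stats = defaultdict(lambda: {"words": 0, "sentences": 0, "turns": 0})
    for speaker, text in turns:
        # Words are runs of letters/apostrophes; sentences split on . ! ?
        words = re.findall(r"[A-Za-z']+", text)
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        stats[speaker]["words"] += len(words)
        stats[speaker]["sentences"] += len(sentences)
        stats[speaker]["turns"] += 1
    return dict(stats)

# Hypothetical exchange, for illustration only.
turns = [
    ("JUSTICE KAGAN", "Counsel, how is sleeping outside conduct? Isn't it status?"),
    ("MS. EVANGELIS", "It is conduct. The ordinance regulates acts, not status."),
]
stats = word_stats(turns)
```

Feeding a full argument transcript through a function like this yields the raw counts behind a heat map such as the one above.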

QDAP Speech ordering

Along with speech counts, we can also look at the order and extent of each justice’s speech chronologically across the arguments. In Grants Pass, the chronological order of speech looks like the following.

Petitioner’s Argument

Respondent’s Argument

Here we can see that most justices controlled a single segment of the respondent’s argument, while during the petitioner’s turn the justices tended to speak at several intermittent points. These graphs add nuance to the earlier ones. They also show the ordering of speakers, so we can see when certain justices potentially follow up on points from other justices. They show how Sotomayor, then Kagan, followed by Jackson managed the first part of the petitioner’s argument, while Barrett, then Gorsuch, and last Alito were the main justices speaking during the middle of the respondent’s argument.
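The run-length idea behind these chronological plots can be sketched in a few lines of Python: consecutive turns by the same speaker merge into one segment, so a justice who "controls" a stretch of argument shows up as a single long run. The turn sequence below is hypothetical, not the real Grants Pass data.

```python
def speech_segments(turns):
    """Collapse (speaker, n_words) turns into ordered runs.

    Consecutive turns by the same speaker merge into one segment with a
    cumulative word count, mirroring a chronological speech plot.
    """
    segments = []
    for speaker, n_words in turns:
        if segments and segments[-1][0] == speaker:
            # Same speaker continues: extend the current run.
            segments[-1] = (speaker, segments[-1][1] + n_words)
        else:
            segments.append((speaker, n_words))
    return segments

# Hypothetical sequence of turns (speaker, word count).
turns = [("SOTOMAYOR", 40), ("SOTOMAYOR", 25), ("EVANGELIS", 80),
         ("KAGAN", 30), ("EVANGELIS", 60), ("KAGAN", 15)]
segments = speech_segments(turns)
```

In this toy sequence, Sotomayor controls one 65-word opening run, while Kagan returns twice with shorter interjections, the same single-segment versus intermittent pattern described above.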

QDAP Speech Polarity

Another way to think about the justices’ speech is through the sentiment or valence of their speech. In QDAP this is referred to as speech polarity. The polarity of each speaker’s contribution during the argument is seen below.

Petitioner’s Argument

Respondent’s Argument

These data allow for two main comparisons. The first is within each justice and between the two arguments, so we can tell when a justice uses more positive language. The second is between justices and within each argument, so we get a relative sense of the justices’ positive and negative linguistic tone. The differences in polarity show that the liberal justices, along with Justices Barrett and Kavanaugh, tended to be more positive during the respondent’s argument in Grants Pass. Justices Alito, Thomas, and Gorsuch were on the negative end. Thomas (who spoke minimally) is on the low end for the petitioner’s argument as well, but Alito and Gorsuch came across as more positive toward the petitioner. These general brushstrokes accord with the inferences from the word counts.
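QDAP’s polarity measure scores a turn by summing dictionary values for polarized words, flipping the sign near negators, and scaling by the square root of the turn’s word count. Below is a heavily simplified Python analogue: the toy lexicon and the one-word negation window are stand-ins for QDAP’s much larger dictionary and context rules, so treat this as a sketch of the scoring idea rather than QDAP’s actual algorithm.

```python
import math

# Toy polarity lexicon; the real dictionary is far larger.
LEXICON = {"good": 1, "helpful": 1, "right": 1,
           "cruel": -1, "unusual": -1, "wrong": -1}
NEGATORS = {"not", "no", "never"}

def polarity(text):
    """Simplified polarity: sum signed word scores, flipping the sign of
    a polarized word that immediately follows a negator, then scale by
    the square root of the word count."""
    words = text.lower().replace(".", "").replace(",", "").split()
    score, negate = 0.0, False
    for w in words:
        if w in NEGATORS:
            negate = True
            continue
        if w in LEXICON:
            score += -LEXICON[w] if negate else LEXICON[w]
        negate = False
    return score / math.sqrt(len(words)) if words else 0.0
```

For example, `polarity("that is not right")` comes out negative because the negator flips the positive word, while a phrase like "cruel and unusual punishment" scores negative outright.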

QDAP Automated Readability Index (ARI)

Another way to think about the justices’ speech relates to the complexity of the language they use. While there are multiple ways to measure linguistic complexity, QDAP has a function for the Automated Readability Index (ARI), which assigns an academic grade level to the difficulty of a text (here, each speaker’s words). The ARI algorithm is based on the ratios [characters / words] and [words / sentences] and is used in the graphs below.

Petitioner’s Argument

Respondent’s Argument

While Chief Justice Roberts is on the high end of the ARI scores for both sides’ arguments, Justice Jackson is at the high end for the respondent’s argument and the low end for the petitioner’s. Kagan was at the low end for the respondent’s argument and the high end for the petitioner’s. Compared with the other measures, each speaker’s ARI scores are less consistent across the two arguments (high for one and low for the other, or high or low for both). This may indicate other elements of the justices’ approach to oral arguments, whether strategic or subconscious, that are not directly correlated with their potential votes.
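For reference, the standard ARI formula combines the two ratios mentioned above with fixed weights. QDAP computes this in R; the Python version below implements the same published formula, with simple regex-based tokenization standing in for whatever tokenizer QDAP uses internally.

```python
import re

def ari(text):
    """Automated Readability Index: an approximate U.S. school grade
    level derived from characters-per-word and words-per-sentence.

    Formula: 4.71 * (chars/words) + 0.5 * (words/sentences) - 21.43
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\S+", text)
    # Count only letters and digits, ignoring punctuation.
    chars = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    if not words or not sentences:
        return 0.0
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43
```

Short, simple sentences can even produce negative grade levels, while long words and long sentences push the score upward, which is why dense legal phrasing registers as "higher grade" speech in the plots.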

OLA: Where do we go from here?

Putting this all together, existing methodologies should give us at best around a 70% chance of predicting the ultimate direction of the cases above. In vote terms, we should expect to get approximately two of the nine votes wrong (a 22% error rate). That could mean the difference between a 5-4 and a 4-5 decision, and so the outcome may surprise us completely. The obvious goal is to move beyond this hurdle.

With OLA we sought to build on existing methods by combining several of the measures above (and others) and testing them on past cases. We then applied the model to federal appeals courts sitting in three-judge panels, where the new algorithm continued to outperform the 70% threshold. Next we tested it on several trial court cases with a single judge and found that it continued to provide accurate predictions. We analyzed a variety of parameters and then brought them all together. This is still very much an evolving process.

OLA and Grants Pass

Based on this prediction engine, Grants Pass seems to fracture somewhat on ideological lines, but not in the typical fashion. We find that the likely majority for Grants Pass voting to affirm the 9th Circuit’s decision in the case is made up of Justices Sotomayor, Jackson, Kagan, Barrett, and Kavanaugh with the Chief Justice, and Justices Thomas, Alito, and Gorsuch in dissent.

How do we generate these predictions? An example might help. Below is a visualization from OLA highlighting a specific interaction between Justice Jackson and petitioner’s attorney Theane Evangelis.

This point on the graph reflects an interaction where Justice Jackson says: “But punishment is happening. In my hypothetical, people are going to jail because they’re eating in public…Why is the Eighth Amendment not implicated?”

This is an instance where Jackson appears frustrated with Evangelis’s response, and the high point on the vertical axis at this point in the argument corresponds with Jackson’s intonation and language use. By aggregating this and other justice-attorney interactions, using multiple methods, and adjusting for some of the shortcomings of the previous approaches, OLA aims to improve the accuracy of predictions for trial court, appellate court, and Supreme Court outcomes.

Concluding Thoughts

Oral arguments do not mark the end of each case. The justices form initial coalitions after the arguments, but historically justices still shift their votes up to around 10% of the time between the initial vote after oral arguments and the final vote on the merits. This is due, at least in part, to the justices’ agreement or disagreement with positions in the drafts of the majority or dissenting opinions. With this knowledge in hand, no prediction engine is likely to get every vote right in every case, and as we get closer to that point, marginal gains are harder and harder to come by. We’ll see how these predictions hold up when Grants Pass and the other decisions in cases argued this term are finally released. Stay tuned. More predictions are likely to follow.


Adam Feldman runs the litigation consulting company Optimized Legal Solutions LLC. For more information write Adam at adam@feldmannet.com. Find him on X/Twitter and LinkedIn. He’s also on Threads @dradamfeldman and on Bluesky @dradamfeldman.bsky.social.
