Dueling Dictionaries and Clashing Corpora

  • Date: July 06, 2021

Textualism is more popular than ever. Not “popular” in the sense of being liked or approved; modern commentators lambast textualism as flawed and even “bogus.” Textualism is “popular” in the sense of focusing on the public. Judges increasingly commit to interpret statutory and constitutional language empirically, in line with its “ordinary” and “public” meaning. To commit to the original public meaning of the Second Amendment is to hold that interpretation is constrained by original Americans’ understanding of that language.

How does one find public meaning? Many textualists look to dictionary definitions. Some are skeptical of that choice. Differing definitions allow judges to go “dictionary-shopping,” and empirical studies suggest that judges’ dictionary use is often “ad hoc and subjective.”

Textualism aims to constrain legal interpretation, limiting judicial discretion. But textualist judges can pick and choose text to analyze, pick and choose dictionaries, and pick and choose definitions. Like Llewellyn’s dueling canons, “dueling dictionaries” show that a mere commitment to “text” does not guarantee constraint.

Enter “legal corpus linguistics.” This approach (“LCL,” for short) offers an exciting new tool for textualists and other theorists committed to “public meaning.” Simply put, corpora are samples of actual language-use. To learn about the “public meaning” of a term like “bear arms,” interpreters might look not just to a few dictionaries, but to hundreds of uses of the phrase in a corpus. For example, a search of the Corpus of Founding Era American English revealed 281 instances of the phrase “bear arms.” “[O]nly a handful don’t refer to war, soldiering or organized armed action,” suggesting to some that “the natural meaning of ‘bear arms’ in the framers’ day was military.”

The LCL approach has faced some criticism. Yet, over the past decade, scholars and judges have adopted corpus linguistic tools to address questions about “public meaning.” This trend has sharpened over the past five years, with citations to corpus linguistics by several state and federal courts. Scholars have advanced new corpus linguistics arguments about constitutional language including “commerce,” and in 2018 Justice Thomas cited corpus linguistics evidence about the meaning of “search” at the U.S. Supreme Court.

Thus far, LCL has been discussed more frequently (and favorably) by Republican-appointed judges than by Democratic-appointed ones. This makes the topic of “Corpus Linguistics and the Second Amendment” particularly intriguing. Legal corpus linguistics scholarship challenges the conclusions of Heller, finding that “the Supreme Court’s reasoning may be flawed.” The Second Amendment is an alluring test case: Can legal corpus linguistics attain textualism’s promise of objectivity, and will commentators persuaded by corpus linguistics evidence of “commerce” and “search” be similarly moved by evidence about “bear arms”?

If not, this would fuel textualism’s critics, further demonstrating that a mere commitment to “text” does not guarantee constraint, and that a mere commitment to “legal corpus linguistics” does not either.

In Jones v. Bonta (concerning a Second Amendment challenge to California’s ban on firearm purchases by those between age 18 and 21), the Ninth Circuit recently ordered supplemental briefing concerning legal corpus linguistics and the Second Amendment. The plaintiff-appellants and defendant-appellees each conducted corpus linguistic analyses, coming to radically different conclusions. Jones v. Bonta’s dueling corpus linguistics portends a new era of “clashing corpora.” Like judicial use of dictionaries, judicial use of corpus linguistics admits of interpretive choice and flexibility. Judges and advocates have flexibility in terms of which selection from the legal text to analyze, which corpus or corpora to search, which search(es) to conduct, and what conclusions to draw from the results returned from the corpus.
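To make the point about search choice concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the six sentences are invented (they are not drawn from the Corpus of Founding Era American English or any other real corpus), the “military” and “other” labels are hand-assigned, and the function names are hypothetical. It shows only that two defensible searches over the same text can return different hit counts and different shares of “military” uses.

```python
import re

# Hypothetical toy corpus: invented sentences with hand-assigned sense labels.
# Nothing here comes from a real historical corpus.
TOY_CORPUS = [
    ("The militia were ordered to bear arms against the invading army.", "military"),
    ("Every able man shall bear arms in defence of the province.", "military"),
    ("He would bear arms for the king in the late war.", "military"),
    ("The hunter chose to bear arms while crossing the forest.", "other"),
    ("They bore arms in the service of the state.", "military"),
    ("She could not bear the weight of the arms she carried home.", "other"),
]

BEAR_FORMS = {"bear", "bears", "bore", "borne", "bearing"}


def exact_phrase_hits(corpus, phrase):
    """Narrow search: keep sentences containing the exact phrase."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return [(text, sense) for text, sense in corpus if pattern.search(text)]


def proximity_hits(corpus, verb_forms, noun, window):
    """Broad search: keep sentences where any verb form is followed by
    the noun within `window` words."""
    hits = []
    for text, sense in corpus:
        tokens = re.findall(r"[a-z']+", text.lower())
        if any(
            tok in verb_forms and noun in tokens[i + 1 : i + 1 + window]
            for i, tok in enumerate(tokens)
        ):
            hits.append((text, sense))
    return hits


def military_share(hits):
    """Fraction of hits that were hand-coded as 'military'."""
    if not hits:
        return 0.0
    return sum(1 for _, sense in hits if sense == "military") / len(hits)


narrow = exact_phrase_hits(TOY_CORPUS, "bear arms")
broad = proximity_hits(TOY_CORPUS, BEAR_FORMS, "arms", window=6)

print(len(narrow), round(military_share(narrow), 2))  # 4 0.75
print(len(broad), round(military_share(broad), 2))    # 6 0.67
```

On this toy data the exact-phrase search returns four hits, three of them coded military, while the proximity search returns six hits, four of them coded military. Neither search is obviously the “right” one, and the sense labels are themselves a judgment call.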

The phenomenon of “dueling dictionaries” is well-known. But let me conclude with some of the emerging “moves” of legal corpus linguistic argumentation (in the style of Llewellyn’s dueling canons).

1. Thrust: The corpus data supports that the term ordinarily reflects this meaning; so this is its public meaning.
   Parry: The term is a legal term and should be given its legal meaning, that meaning.

2. Thrust: The corpus reveals that the term is always used in this sense; this is its public meaning.
   Parry: A corpus is not exhaustive of ordinary understanding; the meaning might not be this sense.

3. Thrust: The corpus reveals that the term was never used in that sense; that cannot possibly be its meaning.
   Parry: See Parry 2. Absence of evidence is not evidence of absence. The term’s meaning might still be that sense.

4. Thrust: The corpus reveals that, generally, the term is (most) frequently used in this sense; this is its meaning.
   Parry: Given the full context of the legal text, the term takes that sense.

5. Thrust: The corpus reveals that, in the relevant context, the term is (most) frequently used in this sense; this is its meaning.
   Parry: The “context” shared by the examples of language-use in the corpus is not adequately similar to that of the statutory context.

6. Thrust: The corpus shows that this is at least a possible sense of the term, a candidate for its ordinary meaning.
   Parry: Some language-use is metaphorical, sarcastic, or otherwise inapt as evidence of public meaning; this may not be a possible meaning in the legal text.

7. Thrust: The corpus shows that a term often appears with “this” and rarely with “that”; thus, this is more informative than that of the term’s public meaning.
   Parry: Co-location frequency of “this” over “that” does not always imply that this is more central to the term’s meaning (e.g. “black” appears more frequently than “white” before “sheep”).

8. Thrust: The corpus provides evidence about the meaning of multi-word expressions by providing evidence about the meaning of each individual word.
   Parry: Meanings of expressions are not always the simple sum of their parts.

9. Thrust: The corpus provides evidence about the meaning of sentences by providing evidence about the meaning of each word and expression in that sentence.
   Parry: Meanings of sentences are not always the simple sum of their parts.

10. Thrust: Corpus evidence about “this” is not evidence of public meaning, where the corpus over-represents elite writers, and thus elite meaning.
    Parry: Without good reason to think elite writings diverge relevantly from non-elite ones with respect to “this,” corpus evidence from the former provides evidence about public meaning of “this.”

To enumerate these clashing arguments is not to endorse them. It is simply to question the LCL (and textualist) claim that the introduction of objective empirical methods will easily constrain legal interpretation. Legal corpus linguistics is not necessarily worse than dictionaries, canons, intuition, or other textualist tools. But if textualists can freely leverage any of these arguments, it is unclear how constraining legal corpus linguistics may be.

* * *

I began by noting that textualism is more popular than ever. Legal corpus linguistics is also increasingly popular. Not “popular” in the sense of receiving broad approval. And not “popular” in the sense of relating to the ordinary public; the historical corpora relied upon vastly overrepresent elites’ language. But legal corpus linguistics is increasingly encountered—cited by textualists as relevant to the public meaning of “commerce,” “search,” and maybe even “bear arms.” In this third sense, at least, it is undeniably popular.

Will the fact that legal corpus linguistics may admit of “clashing” arguments sap its popularity? To the contrary, perhaps “clashing corpora” will share the fate of the “dueling canons” and “dueling dictionaries.” It’s been seventy years since Llewellyn noted the “dueling canons” and over twenty years since the observation of “dueling dictionaries.” Today’s Supreme Court regularly relies on both tools.