AI in Legal Interpretation: Challenges and Limitations

The prospect of AI evaluating and interpreting legal terms is a tantalizing one for the legal sector, offering the potential to streamline complex procedures and lighten the load on the industry’s human workforce. However, recent research by Professor Jonathan Choi raises serious questions about the reliability of these AI models, exposing significant vulnerabilities that could undermine their effectiveness. His findings indicate that the value of out-of-the-box AI solutions in legal interpretation may be compromised by their sensitivity to prompt phrasing, by inconsistencies in their output, and by post-training procedures that can skew interpretations away from commonly accepted meanings.

As AI’s influence continues to permeate every sector, from healthcare to finance, the legal profession is no exception. The industry’s reliance on artificial intelligence is increasingly evident, with a suite of large language models (LLMs), such as OpenAI’s ChatGPT, Thomson Reuters’ CoCounsel, Google’s Gemini, Microsoft’s Copilot, and Anthropic’s Claude, being adopted by legal departments, law firms, and tech companies. These AI models operate by predicting the next word in a sequence based on the preceding words, a procedure honed through exposure to massive data sets often encompassing trillions of words.
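
To make that procedure concrete, here is a minimal sketch of next-word prediction using a small open model (GPT-2, loaded through the Hugging Face transformers library). The model choice and the legal-sounding prompt are illustrative assumptions, not part of Choi’s study, but the underlying step of scoring every candidate next word given the preceding words is the same one the commercial systems above perform at far larger scale.

    # Minimal sketch: next-word prediction with a small open model.
    # GPT-2 and the prompt are illustrative choices, not anyone's production setup.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "A 'vehicle' within the meaning of the statute includes a"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # The model assigns a probability to every possible next token given the
    # preceding words; text generation repeatedly picks or samples from these.
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_token_probs, k=5)
    for prob, token_id in zip(top.values, top.indices):
        print(f"{tokenizer.decode(token_id)!r}: {prob:.3f}")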

However, the Achilles’ heel of these AI models, as revealed by Choi’s research, is their hypersensitivity to prompt phrasing. A slight alteration in the way a question is posed can result in dramatically different responses, undermining their reliability. Moreover, there is significant variation in the outputs of these models even when they are presented with identical input. Such inconsistencies can lead to considerable confusion and disagreement, particularly when these tools are used to interpret complex legal terms.
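
One rough way to observe both failure modes is to pose the same substantive question under slightly different phrasings and repeat each call. The sketch below assumes the OpenAI Python SDK and an API key; the model name, the two phrasings, and the drone example are hypothetical illustrations, not Choi’s actual experimental design.

    # Rough sketch: probing prompt sensitivity and run-to-run variance.
    # Model name and prompts are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI()  # expects an OPENAI_API_KEY environment variable

    phrasings = [
        "Is a drone a 'vehicle' under the ordinary meaning of the word?",
        "Under its ordinary meaning, does the word 'vehicle' include a drone?",
    ]

    for prompt in phrasings:
        # Repeat each phrasing: with non-zero temperature, even an identical
        # prompt can yield different answers on successive calls.
        for _ in range(3):
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # illustrative model choice
                messages=[{"role": "user", "content": prompt}],
                temperature=1.0,
            )
            print(prompt[:40], "->", resp.choices[0].message.content[:80])

Comparing answers across the two phrasings illustrates sensitivity to wording; comparing answers within a single phrasing illustrates inconsistency on identical input.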

Furthermore, Choi’s research illuminates a disturbing trend: deviations by AI models from what we consider ‘ordinary meaning’. These deviations are introduced during post-training procedures, where the models are further refined to generate “suitable” outputs. In a bid to control for language perceived as inaccurate or prejudicial, these procedures may inadvertently pull the model’s empirical language predictions away from how words are actually used, resulting in skewed interpretations of legal terms and, potentially, misconstrued legal advice or decision-making.

The implications of Choi’s findings are far-reaching and profound, particularly for the Am Law 100, the definitive ranking of the 100 largest law firms in the United States. If the AI models currently in use can deliver materially different interpretations based on incidental variables such as how a question is phrased, then their value in the legal sector may be significantly limited. As such, these revelations could shift the industry’s perception and use of AI, necessitating more rigorous scrutiny and validation of these tools before they are fully integrated into legal processes.

In conclusion, while AI holds immense potential for improving efficiency and productivity in the legal sector, Choi’s research serves as a stark reminder of the technology’s limitations. It’s clear that while these AI models can process and predict language at an impressive scale, their utility in the interpretation of legal terms is markedly less impressive. As AI continues to evolve, it’s crucial that the industry remains vigilant, continually assessing the reliability and accuracy of these tools to ensure they are fit for purpose.
