A New York judge recently called out an expert witness for using Microsoft's Copilot chatbot to produce an inaccurate damages estimate in a real estate dispute whose outcome partly depended on an accurate assessment of damages.
In an order Thursday, judge Jonathan Schopf warned that, "due to the nature of the rapid evolution of artificial intelligence and its inherent reliability issues," any use of AI should be disclosed before testimony or evidence is admitted in court. Admitting that the court "has no objective understanding as to how Copilot works," Schopf suggested that the legal system could be disrupted if experts started overly relying on chatbots en masse.
His warning came after an expert witness, Charles Ranson, dubiously used Copilot to cross-check calculations in a dispute over a $485,000 rental property in the Bahamas that had been included in a trust for a deceased man's son. The court was being asked to assess whether the executrix and trustee, the deceased man's sister, breached her fiduciary duties by delaying the sale of the property while admittedly using it for personal vacations.
To win, the surviving son had to prove that his aunt breached her duties by retaining the property, that her vacations there were a form of self-dealing, and that he suffered damages from her alleged misuse of the property.
It was up to Ranson to figure out how much the son would be owed if the aunt had sold the property in 2008, compared with the actual sale price in 2022. But Ranson, an expert in trust and estate litigation, "had no relevant real estate expertise," Schopf said, finding that Ranson's testimony was "entirely speculative" and failed to consider obvious facts, such as the pandemic's impact on rental prices or trust expenses like real estate taxes.
Seemingly because Ranson didn't have the relevant experience in real estate, he turned to Copilot to fill in the blanks and crunch the numbers. The move surprised Internet law expert Eric Goldman, who told Ars that "lawyers retain expert witnesses for their specialized expertise, and it doesn't make any sense for an expert witness to essentially outsource that expertise to generative AI."
"If the expert witness is simply asking a chatbot for a computation, then the lawyers could make that same request directly without relying on the expert witness (and paying the expert's substantial fees)," Goldman suggested.
Perhaps the son's legal team wasn't aware of how big a role Copilot played. Schopf noted that Ranson couldn't recall what prompts he used to arrive at his damages estimate. The expert witness also couldn't recall any sources for the information he took from the chatbot and admitted that he lacked a basic understanding of how Copilot "works or how it arrives at a given output."
Ars could not immediately reach Ranson for comment. But in Schopf's order, the judge wrote that Ranson defended using Copilot as a common practice for expert witnesses like him today.
"Ranson was adamant in his testimony that the use of Copilot or other artificial intelligence tools, for drafting expert reports is generally accepted in the field of fiduciary services and represents the future of analysis of fiduciary decisions; however, he could not name any publications regarding its use or any other sources to confirm that it is a generally accepted methodology," Schopf wrote.
Goldman noted that Ranson relying on Copilot for "what was essentially a numerical computation was especially puzzling because of generative AI's known hallucinatory tendencies, which makes numerical computations untrustworthy."
Because Ranson was so bad at explaining how Copilot works, Schopf took the extra time to try using Copilot himself to reproduce the estimates that Ranson got, and he could not.
Each time, the court entered the same query into Copilot—"Can you calculate the value of $250,000 invested in the Vanguard Balanced Index Fund from December 31, 2004 through January 31, 2021?"—and each time Copilot generated a slightly different answer.
This "calls into question the reliability and accuracy of Copilot to generate evidence to be relied upon in a court proceeding," Schopf wrote.
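For contrast, a conventional calculation of this kind is deterministic: given the same inputs, it returns the same figure every time. The Python sketch below is only an illustration of that property, not the method used by Ranson, Copilot, or the court. The function name `grow` and the three monthly returns are hypothetical placeholders; a real answer to the court's query would need the fund's actual total-return history for December 31, 2004 through January 31, 2021, which is not given here.

```python
from decimal import Decimal


def grow(principal, monthly_returns):
    """Compound a starting amount through a sequence of monthly returns.

    Both arguments are passed as strings so Decimal can represent them exactly.
    """
    value = Decimal(principal)
    for r in monthly_returns:
        value *= Decimal(1) + Decimal(r)
    # Round to whole cents.
    return value.quantize(Decimal("0.01"))


# HYPOTHETICAL monthly returns for illustration only, NOT real fund data.
hypothetical_returns = ["0.010", "-0.020", "0.015"]

# Run the same calculation five times and collect the distinct answers.
results = {grow("250000", hypothetical_returns) for _ in range(5)}
print(results)  # a single value: every run agrees
```

Because the arithmetic is fixed and the inputs are explicit, anyone can rerun it and check both the data and the result, which is the kind of reproducibility the court did not find when it repeated the same Copilot query.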