Chambers Guidance: Using AI Large Language Models (LLMs) Wisely and Ethically


Tomorrow, I’m delivering a talk to the Texas Second Court of Appeals (Fort Worth), joined by my friend, Lynne Liberato of Houston. We will address LLM use in chambers and in support of appellate practice, where Lynne is a noted authority. I’ll distribute my 2025 primer on Practical Uses for AI and LLMs in Trial Practice, but will also offer something bespoke to the needs of appellate judges and their legal staff–something to-the-point but with cautions crafted to avoid the high-profile pitfalls of lawyers who trust but don’t verify.

Courts must develop practical internal standards for the use of LLMs in chambers. These AI applications are too powerful to ignore and too powerful to use without attention given to safe use.

Chambers Guidance: Using AI Large Language Models (LLMs) Wisely and Ethically

Prepared for Second District Court of Appeals (Fort Worth)


Purpose
This document outlines recommended practices for the safe, productive, and ethical use of large language models (LLMs) like ChatGPT-4o in chambers by justices and their legal staff.


I. Core Principles

  1. Human Oversight is Essential
    LLMs may assist with writing, summarization, and idea generation, but should never replace legal reasoning, human editing, or authoritative research.
  2. Confidentiality Must Be Preserved
    Use only secure platforms. Turn off model training/sharing features (“model improvement”) in public platforms or use private/local deployments.
  3. Verification is Non-Negotiable
    Never rely on an LLM for case citations, procedural rules, or holdings without confirming them via Westlaw, Lexis, or court databases.  Every citation is suspect until verified.
  4. Transparency Within Chambers
    Staff should disclose when LLMs were used in a draft or summary, especially if content was heavily generated.  Prompt/output history should be preserved in chambers files.
  5. Judicial Independence and Public Trust
    While internal LLM use may be efficient, it must never undermine public confidence in the independence or impartiality of judicial decision-making. The use of LLMs must not give rise to a perception that core judicial functions have been outsourced to AI.

II. Suitable Uses of LLMs in Chambers

  • Drafting initial outlines of bench memos or summaries of briefs
  • Rewriting judicial prose for clarity, tone, or readability
  • Summarizing long records or extracting procedural chronologies
  • Brainstorming counterarguments or exploring alternative framings
  • Comparing argumentative strength and inconsistencies of and between parties’ briefs

Note: Use of AI output that may materially influence a decision must be identified and reviewed by the judge or supervising attorney.


III. Prohibited or Cautioned Uses

  • Do not insert any LLM-generated citation into a judicial order, opinion, or memo without independent confirmation
  • Do not input sealed or sensitive documents into unsecured platforms
  • Do not use LLMs to weigh legal precedent, assess credibility, or determine binding authority
  • Do not delegate other critical judgment or reasoning tasks to the model
  • Do not rely on LLMs to generate summaries of legal holdings without human review of the supporting authority

IV. Suggested Prompts for Effective Use

These prompts may be useful when paired with careful human oversight and verification:

  • “Summarize this 40-page brief into 5 bullet points, focusing on procedural history.”
  • “Summarize the uploaded transcript respecting the following points….”
  • “Summarize the key holdings and the law in this area.”
  • “Rewrite this paragraph for clarity, suitable for a published opinion.”
  • “List potential counterarguments to this position in a Texas appellate context.”
  • “Explain this concept as if to a first-year law student.”

Caution: Prompts seeking legal summaries (e.g., “What is the holding of X?” or “Summarize the law on Y”) are particularly prone to error and must be treated with suspicion. Always verify output against primary legal sources.


V. Public Disclosure and Transparency

Although internal use of LLMs may not require disclosure to parties, courts must be sensitive to the risk that judicial reliance on AI—even as a drafting aid—may be scrutinized. Consider whether and what disclosure may be warranted in rare cases when LLM-generated language substantively shapes a judicial decision.

VI. Final Note

Used wisely, LLMs can save time, increase clarity, and prompt critical thought. Used blindly, they risk error, overreliance, or breach of confidentiality. The justice system demands precision; LLMs can support it—but only under a lawyer’s and judge’s careful eye and hand.


Prepared by Craig Ball and Lynne Liberato, advocating thoughtful AI use in appellate practice.

Of course, the proper arbiters of standards and practices in chambers are the justices themselves; I don’t presume to know better, save to say that any approach that bans LLMs or presupposes AI won’t be used is naive. I hope the modest suggestions above help courts develop sound practical guidance for use of LLMs by judges and staff in ways that promote justice, efficiency and public confidence.

Tailor FRE 502(d) Orders to the Case


Having taught Federal Rule of Evidence 502 (FRE 502) in my law classes for over a decade, I felt I had a firm grasp of its nuances. Yet recent litigation where I serve as Special Master prompted me to revisit the rule with Proustian ‘fresh eyes,’ uncovering insights I hope to share here.

I’ve long run with the herd in urging lawyers to “always get a 502 order,” never underscoring important safeguards against unintended outcomes; but lately, I had the opportunity to hear from experienced trial counsel on both sides of a FRE 502 order negotiation and have gained a more nuanced view.

Enacted in 2008, FRE 502 was a means to use the federal rules (and Congress’ adoption of the same) to harmonize widely divergent outcomes vis-à-vis subject matter waiver flowing from the inadvertent disclosure of privileged information. 

That’s a mouthful, and I know many readers aren’t litigators, so let’s lay a little foundation.

Confidential communications shared in the context of special relationships are largely shielded from compulsory disclosure by what is termed “privilege.”  You certainly know of the Fifth Amendment privilege against self-incrimination, and no doubt you’ve heard (if only in crime dramas) that confidential communications between a lawyer and client for the purpose of securing legal advice are privileged.  That’s the “attorney-client privilege.” Other privileges extend to, inter alia, spousal communications, confidences shared between doctor and patient and confidences between clergy and parishioner for spiritual guidance.  None of these privileges are absolute, but that’s a topic for another day. 

Yet another privilege, called “work-product protection,” shields from disclosure an attorney’s mental impressions, conclusions, opinions, or legal theories contained in materials prepared in anticipation of litigation or for trial.  Here, we need only consider the attorney-client privilege and work-product protection because FRE 502 applies exclusively to those two privileges.

Clearly, lawyers enjoy extraordinary and expansive rights to withhold privileged information, and lawyers really, REALLY hate to mess up in ways that impair those rights. I’d venture that as much effort and money is expended seeking to guard against the disclosure of privileged material as is spent trying to isolate relevant evidence. A whole lot, at any rate.

One of the quickest ways to lose a privilege is by sharing the privileged material with someone who isn’t entitled to claim the privilege. Did the lawyer let the friend who drove the client to the law office sit in when confidences were exchanged? Such actions waive the privilege. Another way to lose a privilege is by accidentally letting an opponent get a look at privileged material. That can happen in a host of prosaic ways, even just by the wrong CC on an email. More often, it’s a consequence of a failed e-discovery process, say, a reviewer or production error. Inadvertently producing privileged information in discovery is every litigator’s nightmare. It happens often enough that the various states and federal circuits developed different ways of balancing protection from waiver against findings that the waiver opened the door to further disclosure in a disaster scenario called “Subject Matter Waiver.”


Leery Lawyer’s Guide to AI


Next month, I’m privileged to be presenting on two topics with United States District Judge Xavier Rodriguez, a dear friend who sits in the Western District of Texas (San Antonio). One of those topics is “Practical Applications for AI.” The longstanding custom for continuing legal education in Texas is that a presenter must offer “high quality written materials” to go with a talk. I’m indebted to this obligation because writing is hard work and without the need to supply original scholarship, I’d probably have produced a fraction of what I’ve published over forty years. A new topic meant a new paper, especially as I was the proponent of the topic in the planning stage–an ask born of frustration. After two years of AI pushing everything else aside, I was frustrated by the dearth of practical guidance available to trial lawyers–particularly seasoned elders–who want to use AI but fear looking foolish…or worse. So, I took a shot at a practical primer for litigators and am reasonably pleased with the result. Download it here. For some it will be too advanced and for others too basic; but I’m hopeful it hits the sweet spot for many non-technical trial lawyers who don’t want to be left behind.

Despite high-profile instances of lawyers getting into trouble by failing to use LLMs responsibly, there’s a compelling case for using AI in your trial practice now, even if only as a timesaver in document generation and summarization—tasks where AI’s abilities are uncanny and undeniable. But HOW to get started?

The Litigation Section of the State Bar of Texas devoted the Winter 2024 issue of The Advocate magazine to Artificial Intelligence.  Every article was well-written and well-informed—several penned by close friends—but no article, not one, was practical in the sense of helping lawyers use AI in their work. That struck me as an unmet need.

As I looked around, I found no articles geared to guiding trial lawyers who want to use LLMs safely and strategically. I wanted to call the article “The Leery Lawyer’s Guide to AI,” but I knew it would be insufficiently comprehensive. Instead, I’ve sought to help readers get started by highlighting important considerations and illustrating a few applications that they can try now with minimal skill, anxiety or expense. LLMs won’t replace professional judgment, but they can frame issues, suggest language, and break down complex doctrines into plain English explanations. In truth, they can do just about anything that a mastery of facts and language can achieve.

But Know This…

LLMs are unlike any tech tool you’ve used before. Most of the digital technology in our lives is characterized by consistency: you put the same things in, and the same things come out in a rigid and replicable fashion. Not so with LLMs. Ask ChatGPT the same question multiple times, and you’ll get a somewhat different answer each time. That takes getting used to.

Additionally, there’s no single “right” way to interrogate ChatGPT to be assured of an optimal result. That is, there is no strict programming language or set of keywords calculated to achieve a goal. There are myriad ways to successfully elicit information from ChatGPT, and in stark contrast to the inflexible and unforgiving tech tools of the past, the easiest way to get the results you want is to interact with ChatGPT in a natural, conversational fashion.


Safety First: A Fun Day at the “Office”


As a forensic examiner, I’ve gathered data in locales ranging from vast, freezing data centers to the world’s largest classic car collection. Yet, wherever work has taken me, I’ve not needed special equipment or certifications beyond my forensic skills and tools.  That is, until I was engaged to inspect and acquire a Voyage Data Recorder aboard a drilling vessel operating in the Gulf of Mexico.

A Voyage Data Recorder (VDR) is the marine counterpart of the Black Box event recorder in an airliner.  It’s a computer like any other, but hardened and specialized.  Components are designed to survive a catastrophic event and tell the story of what transpired.

Going offshore by helicopter to a rig or vessel demands more than a willingness to go.  The vessel operator required that I have a BOSIET with CAEBS certification to come aboard.  That stands for Basic Offshore Safety Induction Emergency Training with Compressed Air Emergency Breathing System.  It’s sixteen hours of training, half online and half onsite and hands on.  I suppose I was expected to balk, but I completed the course in Houston on Thursday.  Now, I’m the only BOSIET with CAEBS-certified lawyer forensic examiner I know (for all the good that’s likely to do me beyond this one engagement).  Still, it was a blast to train in a different discipline.

A BOSIET with CAEBS certification encompasses four units:

  1. Safety Induction
  2. Helicopter Safety and Escape Training (with CA-EBS) using a Modular Egress Training Simulator (METS)
  3. Sea Survival including Evacuation, TEMPSC, and Emergency First Aid
  4. Firefighting and Self Rescue Techniques

“There’s No Better Rule”

“Take nothing on its looks; take everything on evidence. There’s no better rule.”  I quote this line from Charles Dickens’ Great Expectations at the end of all my emails.  It’s my guiding light.  Sure, how things look matters, but how things truly are matters more.  At least it should be that way. 

I reflect on all this while listening to a webinar presented by my friends, Doug Austin and Kelly Twigger, and moderated by Brett Burney.  They discussed so-called Modern Attachments or, as they prefer to call them, Hyperlinked Files.  In a nutshell, Modern Attachments (as Microsoft calls them) are files that are stored in the Cloud and accessed by links transmitted within an email as distinguished from documents embedded within the transmitting e-mail and thus traveling with the e-mail rather than being retrieved by the recipient clicking on a hyperlink.  The debate about the extent of the duty to preserve, collect and produce these modern attachments rages on, and I don’t post here to rehash that back-and-forth.  My purpose is to tackle some misinformation advanced as a basis to exclude modern attachments from the reach of discovery.

Many who paint dealing with Modern Attachments as infeasible or fraught with risk posit that Modern Attachments tend to be collaborative documents or documents that have gone through edits after transmittal.  They argue that they shouldn’t have to produce Modern Attachments due to uncertainty over whether the document collected during discovery differs significantly from how it existed at the time of transmittal. I don’t think that a good argument against collection and review, but once more, not my point here.

My point is that we need to stop asserting that these Modern Attachments are routinely altered after transmittal without evidence of the incidence of alteration. We should never guess at what we can readily measure.

Based on my experience, most modern attachments (e.g., 85-95%) are not altered after transmittal. Nevertheless, my personal observations mean little in the face of solid data revealing the percentage of Modern Attachments altered after transmittal.  We can measure this.  The last modified dates of Modern Attachments can be compared to their transmittal dates, either en masse or through appropriate sampling.  This will allow us to know the incidence of post-transmittal alteration based on hard evidence rather than assumptions or intuition.  I expect the incidence will vary between disciplines and corporate cultures, but that, too, is worth measuring.
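The comparison described above is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the sample records are invented, the function name alteration_rate is hypothetical, and the timestamps are assumed to be already normalized to one time zone. A later last-modified date shows only that something changed after transmittal, not that the change was substantive.

```python
from datetime import datetime, timedelta


def alteration_rate(records, tolerance=timedelta(0)):
    """Return the fraction of (sent, modified) pairs in which the linked
    file was last modified more than `tolerance` after the email was sent."""
    if not records:
        raise ValueError("no records to measure")
    altered = sum(1 for sent, modified in records if modified - sent > tolerance)
    return altered / len(records)


# Invented sample: (email transmittal time, linked file last-modified time)
sample = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 2, 28, 16, 30)),  # last edited before sending
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 9, 0)),     # identical timestamps
    (datetime(2024, 3, 2, 14, 0), datetime(2024, 3, 5, 11, 15)),  # edited after sending
    (datetime(2024, 3, 3, 8, 0), datetime(2024, 3, 1, 10, 0)),    # last edited before sending
]

rate = alteration_rate(sample)
print(f"{rate:.0%} of sampled attachments were modified after transmittal")
```

The tolerance parameter is there because a save or sync that lands seconds after the email goes out probably should not count as an alteration; how generous to make it is a judgment call for the parties.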

Why hasn’t this been done?  A suspicious mind would conclude that those holding the data–who also happen to be the ones resisting the obligation to produce Modern Attachments–don’t want to know the metrics. Less archly, maybe they simply haven’t taken the time to measure, as guessing is easier. As they say in The Man Who Shot Liberty Valance, “This is the West, sir. When the legend becomes fact, print the legend.”  Facts are inconvenient. They’re sticking with the legend.

But it’s time to quit that.  It’s time to take everything on evidence.

Adapting Requests for Production for AI GLLM Assessment

The integration of Generative Large Language Models (GLLMs) into the discovery process is transforming how documents are reviewed for relevance and responsiveness. These AI models, which excel at processing large document collections, offer significant efficiency improvements. However, to harness their full potential, requests for production (RFPs) must evolve to reflect the unique capabilities and limitations of AI systems. Traditional language in RFPs, which relies heavily on human intuition, needs to be adjusted to accommodate AI’s reliance on clear instructions, context, and precision. This post explores how to adapt requests for production to optimize GLLM usage, addressing both business disputes and tort claims. I’ll provide examples to illustrate how common RFPs can be refined for AI-assisted document review.

Effective AI Prompts in Discovery

AI systems like GLLMs function best with well-structured prompts. In the context of discovery, this means adjusting RFPs to emphasize clarity, specificity, and relevance. The key elements for constructing effective AI prompts in legal discovery are:

  1. Clarity and Specificity: Ambiguity can cause AI systems to miss important documents or misclassify irrelevant ones. Specific requests guide the AI more effectively.
  2. Contextual Guidance: AI relies on context to assess relevance. Providing additional background or specifying the purpose of certain requests helps refine the search.
  3. Keyword Precision: GLLMs rely on keywords to understand and evaluate document content. Choosing precise terms helps reduce the retrieval of irrelevant documents.
  4. Examples: AI systems can better identify relevant documents if examples are provided within the RFP, as they offer patterns for the system to follow.

Incorporating these principles into RFPs ensures that AI models can make the most accurate assessments during document review.
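As a rough illustration of how the four elements above might be assembled into a single review prompt, here is a minimal Python sketch. The function name build_review_prompt, the wording of the instructions, and the sample request are all hypothetical; nothing here reflects how any particular review platform or model expects its input.

```python
def build_review_prompt(request, context, key_terms, examples):
    """Assemble a document-review prompt from one request for production.

    Each argument maps to one element: the request itself (clarity and
    specificity), background (context), precise terms, and examples.
    """
    lines = [
        "You are assisting with document review for responsiveness.",
        f"Request: {request}",
        f"Context: {context}",
        "Key terms: " + ", ".join(key_terms),
        "Examples of responsive documents:",
    ]
    lines += [f"- {example}" for example in examples]
    lines.append("Answer RESPONSIVE or NOT RESPONSIVE, then give a one-sentence reason.")
    return "\n".join(lines)


# Invented example values, for illustration only
prompt = build_review_prompt(
    request="All communications between the parties about termination of the supply agreement in 2023.",
    context="Business dispute over whether the agreement was validly terminated.",
    key_terms=["termination", "notice of default", "supply agreement"],
    examples=["An email giving notice of termination", "A letter disputing a default notice"],
)
print(prompt)
```

Keeping each element on its own labeled line makes it easy for a human reviewer to see exactly what the model was told, which matters if the prompt is later scrutinized.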

Adapting Requests for Production: Business Dispute and Tort Claim Examples

Both business disputes and tort claims involve unique types of documents and keywords. Adapting RFPs to suit GLLMs’ capabilities involves providing detailed instructions for both.


AI Drawing Programs: ChatGPT Versus PlaygroundAI

Of late, I’ve come to use AI generated imagery in lieu of my own work or open-source and licensed works as a source for digital storytelling.  My friend, blogger Doug Austin (perhaps the hardest working man in e-discovery) has been illustrating his daily blog posts with AI-generated art for quite some time and I kid him now-and-then about the abundance of robots in his illustrations.  I feel besieged by robot imagery.

Last week in San Antonio, I co-presented on AI evidence at a huge annual conclave of family law practitioners. I’ve been using ChatGPT, Dall-E and Midjourney for image generation, but preparing the new presentation, I kicked the tires of several tools I’d not used before. One of them impressed me so I thought I’d post to share it. I think it blows the doors off ChatGPT’s images. It’s called PlaygroundAI. It lets users create up to 50 images per day at no charge, and up to 1,000 images a day on its Pro plan ($15/month paid monthly, cancel any time).


Doveryai, No Proveryai!

I recently published an AI prompt to run against search terms then get the AI to propose improvements.  Among the pitfalls I’d hoped to expose was the presence of “stop” or “noise” words; terms routinely excluded from search indices.  Searches incorporating stop words fail because terms not in the index won’t be found.  Ensuring your searches don’t include stop words is an essential step in framing effective queries.
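A stop word check of the kind described above can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: the STOP_WORDS set is a short illustrative list and is not the default list of any real eDiscovery tool, the function name flag_stop_words is hypothetical, and uppercase AND, OR and NOT are assumed to be Boolean operators rather than search terms. Any real list should be taken from the vendor's own documentation.

```python
import re

# Illustrative only: NOT the default stop word list of any real tool.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def flag_stop_words(query, stop_words=STOP_WORDS, operators=("AND", "OR", "NOT")):
    """Return the stop words found among a query's search terms,
    in order of first appearance."""
    tokens = re.findall(r"[A-Za-z']+", query)
    flagged = []
    for token in tokens:
        if token in operators:
            # Uppercase Boolean operators are treated as syntax, not terms
            continue
        word = token.lower()
        if word in stop_words and word not in flagged:
            flagged.append(word)
    return flagged


print(flag_stop_words('"breach of the contract" AND termination'))
# prints ['of', 'the']
```

Whether a flagged word actually breaks a search depends on the platform's index and syntax, so the output is a list of terms to investigate, not a verdict.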

To help the AI recognize stop words, the prompt included a list of default stop words for well-known eDiscovery tools.  That is, I thought I’d done that, but what I included in error (and have now replaced) was ChatGPT’s rendition of stop words for the major tools.  I’d made a mental note to check the lists supplied but—DOH!—I plugged it into the prompt and then forgot to do my due diligence.

I was feeling pretty good about the post and getting some nice feedback.  Last night, my dear friend and e-discovery Empress Mary Mack commented on the novelty of seeing the various stop word lists broken out in a ready reference.  I think echoes of Mary’s kind comment woke me at 4:00am, my subconscious screaming, “HEY DUMMY!  Did you verify those stop words?  Tell me you didn’t blindly trust an AI?!?”

So, long before sunrise, I was manually checking each stop word list against product websites and—lo and behold—every list was off: some merely incomplete but others not even close. ChatGPT hallucinated the lists, and I failed to do the crucial thing lawyers must do when using AI as a research assistant: Trust but verify.

No harm done, but I share my chagrin here to underscore that you just cannot trust an AI generative large language model to do your research without careful human assessment of the output.  I know this and let it slip my mind.  Last time for that.  I’ve corrected the prompt on my blog and hope I’ve gotten it right.  I post this to remind my readers that AI LLMs are great—USE THEM–but they are no substitute for you.  Doveryai, no proveryai!

AI Prompt to Improve Keyword Search

Twenty years ago, I dreamed up a website where you would submit a list of eDiscovery keywords and queries and the site would critique the searches and suggest improvements to make them more efficient and effective. It would flag stop words, propose alternate spellings, and alert the user to pitfalls making searches less effective or noisy. I even envisioned it testing queries against a benign dataset to identify overly broad terms and false hits.

I believed this tool would be invaluable for helping lawyers enhance their search skills and achieve greater efficiency. Over the years, I tried to bring this idea to life, seeking proposals from offshore developers and pitching it to e-discovery software publishers as a value-add. In the end, a pipe dream. Even now, nothing like it exists.

The emergence of AI-powered Large Language Models like ChatGPT made me think what I’d hoped to bring to life years ago might finally be feasible. I wondered if I could create a prompt for ChatGPT that would achieve much of what I envisioned. So, I dedicated a sunny Sunday morning to playing “prompt engineer,” a whole cloth term for those who craft AI prompts to achieve desired outcomes.

The result was promising, a significant step forward for lawyers who struggle with search queries without understanding why some fail. Most search errors I encounter aren’t subtle. I’ve written about ways to improve lexical search, and the techniques aren’t rocket science, though they require some familiarity with how electronically stored information is indexed and how search syntaxes differ across platforms. Okay, maybe a little rocket science. But if you’re using a tool for critical tasks, shouldn’t you know what it can and cannot do?

Some believe refining keywords and queries is a waste of time, casting keyword search as obsolete. Perhaps on your planet, Klaatu, but here on Earth, lawyers continue using keywords with reckless abandon. I’m not defending that but neither will I ignore lawyers’ penchant for lexical search. Until the cost, reliability, and replicability of AI-enabled discovery improve, keywords will remain a tool for sifting through large datasets. However, we can use AI LLMs right now to enhance the performance and efficiency of shopworn approaches.


Yes, AI is Here. No, You’re Not Gone.

Yesterday, I sought to defend the value of my law school course on E-Discovery & Digital Evidence to a law Dean who readily conceded that she didn’t know what e-discovery was or why it would be an important thing for lawyers to understand.  It was a bracing experience.

My métier has always been litigation, to the point that everyone I work with sits in and around trial practice.  My close colleagues recognize that 90% of what trial lawyers do is geared to discovery and motion practice, and much of that motion practice is prompted by discovery disputes. So, hearing how a tax lawyer and academic viewed litigation was eye-opening, and troubling to the extent it impacts what’s taught to new lawyers.

Do you agree about the centrality of discovery to litigation, Dear Reader?

The Dean shared her sense that discovery is being replaced by AI and that “soon AI will handle the production of relevant information instead of lawyers.”  I replied that I expected the review phase to be abetted or supplanted by AI in the near term—that’s here—but it would be some time before all the tasks that come before review would be fully AI-enabled.

The idea that there are crucial tasks requiring lawyer intervention before review was surprising to her.  For those who don’t manage electronic discovery day-to-day, electronically stored information seems to magically appear in review tools.  But for e-discovery folks, the march through identification, preservation, collection and processing is our path, and we know that no one, and no AI, can undertake an assessment of the evidence without facing the data.

You’ve got to face the evidence to assess the evidence.

That’s axiomatic; but it’s downplayed by those shouting “AI! AI!”  As they say in these parts, “you’ve got to put the hay down where the goats can get it.”  Until AI is embedded in everything, until AI faces the data in every phone, cloud repository, storage medium and database in ways that support discovery, the goats can’t get to the hay.

The evidence in our cases is not a “collection” until it’s collected.  That doesn’t necessarily mean a copy must be made to isolate data of interest, but that remains the prevailing way that a discrete assemblage of potentially responsive ESI is marshaled before it is processed for search and review.  Not until that occurs does the evidence face human or AI review.
