Reasons to Be Wary of Using Large Language Model AI Systems
Software doesn’t think, and the newest versions of AI can dangerously convince people otherwise.
Technology can do wonders, but only when people know its limitations. As new types of artificial intelligence have seen waves of hype, there has been a rush to adopt them, particularly large language models, more generally called generative AI.
Given the promises being made, the interest is understandable. If you could have an essentially unpaid assistant to do research, write a summary, prepare a document draft, develop ideas, prepare an outline, or otherwise take on work to ease your day, why wouldn't you want that?
But you need to understand the limitations. The more complex the technology, the more complicated its capabilities and the human interactions with it. Even something as common as spellcheck and grammar check can only go so far. A system might pick the wrong spelling among homonyms or insist on a particular grammatical treatment that is inappropriate in context.
And that is simple technology. LLMs are anything but. They are complex statistical systems that take in immense amounts of input, whether text or images, and map sequences of components such as words or portions of pictures. Given a prompt, they assemble spans of these atomic units of information into a response. The result may seem as though there is intelligence behind it, but there is no thought involved. The software doesn't know when it has left its lane and is careening off the road.
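To see why fluent output doesn't imply understanding, consider a deliberately crude sketch: a toy "next word" generator built from word-pair counts. This is nothing like the neural networks behind real LLMs, and every detail here (the tiny corpus, the continue_text function) is invented purely for illustration, but it shows how text can be assembled from statistics with no comprehension anywhere in the process.

```python
# Toy illustration only (not how a real LLM works): a bigram model that
# "writes" by repeatedly picking a word that has followed the current word before.
import random
from collections import defaultdict

# A tiny made-up corpus standing in for the vast training data a real model uses.
corpus = ("the patient was stable the patient was discharged "
          "the surgeon reviewed the scan the scan was clear").split()

# Count which words tend to follow which.
next_words = defaultdict(list)
for current, following in zip(corpus, corpus[1:]):
    next_words[current].append(following)

def continue_text(start: str, length: int = 8) -> str:
    """Extend `start` by sampling statistically likely next words; no understanding involved."""
    words = [start]
    for _ in range(length):
        options = next_words.get(words[-1])
        if not options:  # nothing ever followed this word in the data
            break
        words.append(random.choice(options))
    return " ".join(words)

print(continue_text("the"))  # fluent-looking output with no idea what it means
```

The output reads like plausible prose because the word sequences are plausible, not because anything in the program knows what a patient or a scan is. Real LLMs do this on an enormously larger scale and far more capably, which is exactly what makes the illusion of thought so convincing.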
Some of the problems should be well known. Hallucination is one: the software makes things up. Why? Because it is mindlessly putting components together. Assuming it can do more than that leads to problems.
Gary Marcus, an actual AI expert who has critical views of LLMs and how they're promoted, wrote about some outrageous examples in one post and a second one. Here are some passages that appeared in peer-reviewed scientific papers that clearly made use of generative AI:
The following came from a medical publication in a paper supposedly written by seven medical doctors: “In summary, the management of bilateral iatrogenic I’m very sorry, but I don’t have access to real-time information or patient-specific data, as I am an AI language model. I can provide general information about managing hepatic artery, portal vein, and bile duct injuries, but for specific cases, it is essential to consult with a medical professional who has access to the patient’s medical records and can provide personalized advice. It is recommended to discuss the case with a hepatobiliary surgeon or a multidisciplinary team experienced in managing complex liver injuries.”
“Certainly, here is an expanded list of generic references that you can use as a starting point. If you have specific sources in mind, please provide the details for a more accurate reference list.”
“Certainly, here is a possible introduction for your topic:”
There have also been cases of lawyers who used software like ChatGPT to prepare court filings. One such situation in New York ended with a judge saying, "six of the submitted cases appear to be bogus judicial decisions with bogus quotes and bogus internal citations."
You cannot safely assume that generative AI is cognizant. It isn't, and the amount of checking you might have to do could eat up much of the time you thought you had saved.