ChatGPT Doesn’t Know How to Say You Are Incorrect
Ora Kassner had doubts about whether her computer was as intelligent as the hype suggested. In 2018, Google introduced BERT, a language-modeling algorithm, and Kassner, a researcher in the same field, quickly installed it on her laptop. BERT was Google's first self-taught language model, trained on a massive corpus of online text. Like her colleagues, Kassner was impressed by BERT's ability to complete sentences and answer simple questions; the large language model (LLM) seemed able to read text much like a human, if not better.
However, Kassner, then pursuing her postgraduate studies at Ludwig Maximilian University of Munich, retained her skepticism. She believed that LLMs should possess a genuine understanding of the meaning behind their responses, including what they imply and what they do not. It was one thing for a model to comprehend that a bird could fly; Kassner argued that the model should inherently recognize the falsehood in a negated statement like “a bird cannot fly.” In 2019, together with her advisor Hinrich Schütze, Kassner decided to put BERT and two other LLMs to the test. To their dismay, they discovered that these models treated words such as “not” as if they were invisible.
Since then, LLMs have undergone substantial advancements in both size and capability.
Although chatbots have made strides in mimicking human language, they still struggle with negation. They can state what it means for a bird to fly, yet they falter when the logic hinges on words like "not" — a task that poses no challenge to a human.
Teaching a computer to read and write like a human proves to be a formidable challenge. Machines excel at storing vast amounts of data and performing complex calculations, prompting developers to construct LLMs as neural networks: statistical models that gauge the relationships between objects, in this case, words. Each linguistic association carries a weight, which, during training, is finely adjusted to represent the strength of that relationship. For instance, the connection between “rat” and “rodent” carries greater weight than “rat” and “pizza,” even if there have been instances of rats enjoying a slice of the latter.
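The weighted-association idea can be sketched with toy word vectors. The vectors and numbers below are invented for illustration; a real model learns hundreds of dimensions from data during training.

```python
import math

# Toy word vectors (invented for illustration; real models learn
# these values from vast amounts of text during training).
vectors = {
    "rat":    [0.90, 0.80, 0.10],
    "rodent": [0.85, 0.75, 0.15],
    "pizza":  [0.10, 0.20, 0.90],
}

def association(word_a, word_b):
    """Cosine similarity between two word vectors: a stand-in for
    the learned 'weight' of the relationship between the words."""
    a, b = vectors[word_a], vectors[word_b]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# The "rat"–"rodent" link outweighs "rat"–"pizza", as in the text.
print(association("rat", "rodent") > association("rat", "pizza"))  # True
```

Nothing here says anything about meaning directly: the model only sees that some pairs of vectors point in similar directions, which is exactly why statistically correlated words end up strongly linked.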
In contrast to humans, LLMs process language by converting it into mathematical representations. This approach facilitates their text generation capabilities, as they predict likely combinations of text. However, it comes with a drawback.
It is an understandable mistake, as computational linguist Allyson Ettinger explains: "robin" and "bird" are strongly correlated in numerous contexts because they so often co-occur. Yet any human can readily spot the fallacy in treating that correlation as understanding.
By 2023, significant progress was made by OpenAI’s ChatGPT and Google’s bot, Bard, allowing them to accurately predict that Albert’s father had handed him a shovel, not a gun. This progress can be attributed to the availability of enhanced and expanded datasets, enabling better mathematical predictions.
This prompts the question: Why don’t phrases like “do not” or “is not” simply prompt the machine to disregard the top predictions from “do” and “is”?
This failure is not a coincidence. Negations such as “not,” “never,” and “none” fall into a category known as stop words, which serve a functional rather than descriptive purpose. In contrast to words like “bird” and “rat,” which possess clear meanings, stop words do not contribute content on their own. Other examples of stop words include “a,” “the,” and “with.”
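The trouble this causes is easy to demonstrate. The miniature stop-word list below is a hypothetical subset for illustration (classic text-processing pipelines used much longer lists): once functional words are filtered out, a statement and its negation collapse into the same content words.

```python
# A miniature stop-word list (an illustrative subset; real NLP
# pipelines have historically used much longer lists).
STOP_WORDS = {"a", "the", "with", "do", "is", "not", "never", "none"}

def content_words(sentence):
    """Keep only the 'content' words, discarding functional stop words."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

# Two sentences with opposite meanings...
print(content_words("a bird can fly"))      # ['bird', 'can', 'fly']
print(content_words("a bird can not fly"))  # ['bird', 'can', 'fly']
# ...reduce to identical content words once "not" is filtered away.
```

A system that leans on content words alone literally cannot tell these two sentences apart, which is one way to see why negation is so easy for statistical models to gloss over.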
Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot