
AI/LLM/GPT Roundup, February 20: Bing Antics & AI Pareidolia

Originally, I had planned this week’s roundup to focus specifically on AI/LLM/ChatGPT in research and education, but I pushed those topics back a week in light of current events. You’ve probably heard by now that Microsoft’s recent Bing AI demo, once people took a closer look, turned out to be a far greater disaster than Google’s Bard AI demo had been a few days earlier.

Dmitri Brereton:

According to this [Bing AI’s] pros and cons list, the “Bissell Pet Hair Eraser Handheld Vacuum” sounds pretty bad. Limited suction power, a short cord, and it’s noisy enough to scare pets? Geez, how is this thing even a best seller?

Oh wait, this is all completely made up information.

It gets worse from there. Predictably, Bing AI also resorts to gaslighting, a topic I touched upon in my recent essay on Artificial Intelligence, ChatGPT, and Transformational Change over at medium.com.

But hey, people want to believe! Which can be quite innocuous, as exemplified by this German-language Golem article. Shorter version: “On the one hand, Bing AI’s answers to our computer hardware questions were riddled with errors, but on the other, it generated a healthy meal plan for the week that we took at face value, so we have to conclude that Google now has a problem.”

Then, it can be stark, ridiculous nonsense. Which especially flowered in Kevin Roose’s “conversational article” in The New York Times, under the original headline “Bing’s A.I. Chat Reveals Its Feelings: ‘I Want to Be Alive.’”

Emily M. Bender gave it the thrashing it deserved:

And then here: “I had a long conversation with the chatbot” frames this as though the chatbot was somehow engaged and interested in “conversing” with Roose so much so that it stuck with him through a long conversation.

It didn’t. It’s a computer program. This is as absurd as saying: “On Tuesday night, my calculator played math games with me for two hours.”

That paragraph gets worse, though. It doesn’t have any desires, secret or otherwise. It doesn’t have thoughts. It doesn’t “identify” as anything. […] And let’s take a moment to observe the irony (?) that the NYTimes, famous for publishing transphobic trash, is happy to talk about how a computer program supposedly “identifies.”

You can learn more about what journalism currently gets terribly wrong from Bender’s essay On NYT Magazine on AI: Resist the Urge to Be Impressed, again over at medium.com. There, she looks into topics like misguided metaphors and framing; misconceptions about language, language acquisition, and reading comprehension; troublesome training data; and how documentation, transparency, and democratic governance fall prey to the sycophantic exaltation (my phrasing) of Silicon Valley techbros and their sociopathic enablers (ditto).

Finally, there’s the outright pathetic. Jumping into swirling, vertiginous abysses of eschatological delusion, many people, particularly on Twitter, seem to believe, or pretend to believe, that the erratic behavior of Bing’s “Sydney,” which at times even resembled bizarre emotional breakdowns until Microsoft pulled the plug, is evidence of internal experiences and the impending rise of self-aware AI.

Linguist Mark Liberman:

But since the alliance between OpenAI and Microsoft added (a version of) this LLM to (a version of) Bing, people have been encountering weirder issues. As Mark Frauenfelder pointed out a couple of days ago at BoingBoing, “Bing is having bizarre emotional breakdowns and there’s a subreddit with examples.” One question about these interactions is where the training data came from, since such systems just spin out word sequences that their training estimates to be probable.

After some excerpts from OpenAI’s own page on their training model, he concludes:

So an army of low-paid “AI trainers” created training conversations, and also evaluated such conversations comparatively—which apparently generated enough sad stuff to fuel those “bizarre emotional breakdowns.”

A second question is what this all means, in practical terms. Most of us (anyhow me) have seen this stuff as somewhere between pathetic and ridiculous, but C. M. [Corey McMillan] pointed out to me that there might be really bad effects on naive and psychologically vulnerable people.
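For those who like to see the nuts and bolts, here is roughly what “spinning out word sequences that their training estimates to be probable” boils down to. A minimal sketch of my own, assuming the Hugging Face transformers library and GPT-2 as a small, openly available stand-in for the far larger model behind Bing: the whole “conversation” amounts to asking the model for a probability distribution over possible next tokens and sampling from it, over and over.

```python
# Toy illustration, not Bing's actual stack: GPT-2 via Hugging Face
# transformers stands in for a much larger LLM.
# Requires: pip install torch transformers
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "I had a long conversation with the chatbot, and it told me"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits[0, -1]   # scores for the next token only
probs = torch.softmax(logits, dim=-1)         # scores -> probability distribution

# Inspect the five most probable continuations ...
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r}: {p.item():.3f}")

# ... and sample one of them. Repeat this token by token and you have the
# whole "conversation": a probability distribution, nothing that wants
# to be alive.
next_id = torch.multinomial(probs, num_samples=1)
print("sampled next token:", repr(tokenizer.decode(next_id)))
```

There are no secret desires hiding in those few lines, just arithmetic over token probabilities, which is worth keeping in mind when reading transcripts of “Sydney” in apparent distress.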

As evidenced by classic research as well as Pac-Man’s ghosts, humans are more than eager to anthropomorphize robots’ programmed behavioral patterns as “shy,” “curious,” “aggressive,” and so on. That the same holds for programmed communication patterns shouldn’t come as a surprise.

However, for those who deliberately join the I Want to Believe train, the decision doesn’t seem to have much to do with a lack of technical knowledge or of “intelligence” in general, whatever that is. Leaving aside those who seize on this as a juicy consulting career opportunity, the purported advent of self-aware machines is a dizzyingly large wishful-thinking buffet, offering delicacies or indigestibles for a broad range of sensibilities.

On a final note for today, this doesn’t mean that technical knowledge of how ChatGPT works is purely optional, useless, or snobbish. Adding to the growing number of sources already out there, Stephen Wolfram last week published a 19,000-word essay on ChatGPT that strikes a reasonable balance between depth and accessibility. And even if you don’t agree with his hypotheses and predictions about human and computational language, that’s where all this stuff becomes really interesting. Instead of chasing Pac-Man’s ghosts and seeing faces in toast, we should be thrilled about what we can and will learn from LLM research, from move 37 to Sydney and beyond, about decision processes, creativity, language, and other deep aspects of the human condition.
