
Computers have been beating the best human Go players since 2016. The Go world champion retired in part because AI is “an entity that cannot be defeated.”

But a human just trounced one of the world’s best Go AIs 14 games to 1: arstechnica.com/information-te

I think this news story is more interesting than it might first appear (without knowing details, so grain of salt). It isn’t just a gaming curiosity; it points to a fundamental flaw with “deep learning” approaches in general.
1/

Ars Technica: “Man beats machine at Go in human victory over AI.” Amateur exploited weakness in systems that have otherwise dominated grandmasters.
Paul Cantrell

The Go AI was trained by feeding it a huge number of Go games. It built a model based on w̶h̶a̶t̶ ̶h̶u̶m̶a̶n̶s̶ ̶d̶o̶. CORRECTION: KataGo is trained by playing against itself; the model input is “past AI games.”

The human beat it by doing something so obvious a human or sensible AI would never do it — so it wasn’t in the training data, so the AI didn't counter it.

(Basically, the human forms a conspicuous giant capture ring while distracting the AI with tactical battles the AI knows how to counter.)


2/

Why is this interesting?

The recent eye-popping advances in AI have come from models that scan huge datasets, either generated by humans (e.g. “all competitive Go games” or “all the text we could find on the web”) or by computer (“the AI plays itself a billion times”), and imitate the patterns in that dataset: no underlying model of meaning, no experience to check against, no underlying theory formation, just parroting.
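To make “parroting” concrete, here's a deliberately tiny, hypothetical sketch. This is not how Go engines or LLMs actually work; it just shows the flavor of pure pattern imitation: a bigram model that records which word follows which in its training data, then emits statistically plausible sequences with no notion of meaning at all.

```python
import random
from collections import defaultdict

# Toy "training data": the model only ever sees word-follows-word patterns.
corpus = "the cat sat on the mat and the cat saw the dog".split()

# Record, for each word, every word observed to follow it.
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

def parrot(start, n, rng=random.Random(0)):
    # Emit up to n words by imitating observed patterns only.
    # Every transition is "plausible" because it occurred in training,
    # but nothing here models what any word means.
    words = [start]
    for _ in range(n - 1):
        options = follows.get(words[-1])
        if not options:
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(parrot("the", 8))
```

Every adjacent pair in the output appeared somewhere in the corpus, which is exactly why the output can look fluent while being hollow; scale this idea up by many orders of magnitude and you get the uncanny competence (and the blind spots) the thread describes.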
3/

These systems produce such striking results that it raises the question of whether our brains are any different. Are we also just fancy pattern mimics?

Would an LLM gain human-like intelligence if only we had more processing power?

This result suggests not. The tactic the AI missed would be painfully obvious to a human player. There’s still something our brains do — theorizing, generalizing, reasoning through the unfamiliar — that these AIs don’t.
4/

As usual on Ars, there are actually some good comments.

This person wisely reminds us that the history of AI is littered with bold predictions that human-like AI is just around the corner, and that every time, we realized we’d failed to understand what the hard part even was. The Marvin Minsky quote here is eye-popping:
arstechnica.com/information-te
5/

But does this matter? Sure, the AI did a faceplant on some bizarro strategy that would never fool a competent human. So what? It’s just a board game.

OK, what if it’s a self-driving car?

Have you ever encountered a traffic situation that was just totally bizarre, but had a common-sense solution like “wait” or “just go around?” What would an AI do in that situation?

Think of the reports of self-driving Teslas suddenly swerving or accelerating straight into an obvious crash.
6/

Or as this comment remarks: what if it’s an autonomous killbot? (arguably a superset of the previous item, I know) arstechnica.com/information-te

Several comments point out the parallels to the recent story about Marines defeating an AI with Jim-Carrey-style nonsense antics that bore no relationship to the training data: arstechnica.com/information-te

The linked article: taskandpurpose.com/news/marine
7/

This is probably what’s going on with the hilarious ChatGPT faceplants making the rounds on social media.

People try to fool GPT with esoteric questions, but those are easy for it: if anybody anywhere on the web already answered the question, no problem — and making it esoteric just narrows the search space.

But give it a three-digit addition problem, and there’s no single specific example to match. And GPT can’t turn all those examples into a generalized theory of how to do addition.
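The memorize-vs-generalize distinction can be sketched in a toy (and entirely hypothetical — this is not GPT’s actual mechanism): a lookup table of seen examples versus a genuine addition algorithm built from the very same single-digit facts.

```python
# "Training data": every single-digit sum, memorized as a lookup table.
seen = {(a, b): a + b for a in range(10) for b in range(10)}

def memorized_sum(a, b):
    # Pure pattern matching: works only on inputs it has literally seen.
    return seen.get((a, b))

def digitwise_sum(a, b):
    # A general *algorithm*: add column by column with a carry, using
    # only the same single-digit facts the memorizer has. It generalizes
    # to numbers of any length because it encodes a theory of addition.
    xs, ys = str(a)[::-1], str(b)[::-1]
    carry, digits = 0, []
    for i in range(max(len(xs), len(ys))):
        x = int(xs[i]) if i < len(xs) else 0
        y = int(ys[i]) if i < len(ys) else 0
        carry, digit = divmod(seen[(x, y)] + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))

memorized_sum(7, 5)      # 12: this pair was in the "training data"
memorized_sum(123, 456)  # None: never seen, and no theory to fall back on
digitwise_sum(123, 456)  # 579: the algorithm generalizes
```

The gap between those two functions is the gap the thread is pointing at: more memorized examples never turn the first function into the second.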
8/

What’s the moral here?

1. Beware the AI hype.

2. Know your AI history, at least a little.

3. If AI ever does become intelligent (whatever that means), it’s going to be •weird• for a long time first.

4. Beware your own heuristics about what “intelligence” looks like, because •that• is a place where our brains are as easily fooled as the Go AI was. Our brains didn’t evolve around things like GPT, and we’re easily bamboozled by them.
/end

Correction from some replies:

Commenters correctly pointed out that the specific AIs in question (KataGo and Leela Zero) did not use human play as training data, only AI play. Thanks! Edited posts to fix.

Correction of some replies:

None of the variants of AlphaGo are the AI in question.

From @Iguanadelmar: Yes, interesting indeed: kolektiva.social/@Iguanadelmar

The point of my thread was that deep learning has weird blind spots that a human would be unlikely to develop, and that these blind spots are intrinsic to the approach.

The additional insight I take from the fact that it was computer analysis that found the AI’s weakness is that our human metacognition makes it hard for us to find deep learning’s blind spots. Again, we’re too easily bamboozled.

@Iguanadelmar (kolektiva.social): “Interesting to note that the tactic the human used to defeat the AI was suggested by a different AI that tried out millions of tactics to find the AI Go champion's weakness.”

From @psu_13: This paper gets at the underlying problems of that now-embarrassing Minsky quote. It gets at many of the things I was talking about upthread, and much more beside. Thanks for the recommendation!
pgh.social/@psu_13/10989278439

@psu_13 (pgh.social): “Why AI is Harder Than We Think” https://arxiv.org/abs/2104.12871

From @Globaltom, a very good example of a human handling a novel problem not in the training data:
mstdn.ca/@Globaltom/1098928805

I tend to think that Roomba had the right idea with respect to AI: build stupid-ish machines that assist humans by handling the easy 80% of dreary tasks, and complement humans instead of trying to mimic humans.

@Globaltom (mstdn.ca): “I think about the first time I encountered a pilot vehicle (roadworks in rural Alberta). A sign explained how it worked; I was fine. You can't use words to tell a self-driving car how to deal with something novel...”

From @mykl: infosec.exchange/@mykl/1098930

The mistake in that line of reasoning is imagining that “intelligence” (whatever that means) is a linear continuum. Computers are already far “smarter” than humans in specific ways. (Can you sum a billion numbers in one second?)

Part of my point is that what we call AI won’t progress toward human intelligence like going up a staircase. It will be •weird• the whole way, and will thwart our intuitions.
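The parenthetical aside about summing numbers is easy to make concrete. A rough, machine-dependent sketch: naive Python won’t literally hit a billion per second, but even it handles tens of millions, and optimized native code does reach billions; either way, the gap with a human is many orders of magnitude.

```python
import time

# Time a narrow task computers are already "superhuman" at: summing
# ten million consecutive integers. Throughput varies by machine; the
# point is the enormous gap with any human.
start = time.perf_counter()
total = sum(range(10_000_000))
elapsed = time.perf_counter() - start

# Sanity check against the closed form n(n-1)/2.
assert total == 10_000_000 * 9_999_999 // 2
print(f"summed 10,000,000 numbers in {elapsed:.3f}s")
```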

@mykl (infosec.exchange): “Ok, hear me out... what if these systems are, at this moment in time, actually as smart as the *average* human. We may be in for trouble if they get smarter, and do not gain some things like compassion, tolerance, and other attributes that humans were supposed to have.”

This is a good clarification from @ekwessel: mstdn.party/@ekwessel/10989344

Please don’t read my thread to mean “Machines can’t handle novel situations,” or “Machines can’t generalize,” or “Machines can’t identify underlying patterns.” All those statements are false! There is no such bright line to be found here.

My argument is a fuzzier one: software is surprising; it’s hard to know what the hard part is; we should expect to see ML impress us but then faceplant on the seemingly obvious.

@ekwessel (mstdn.party): “Great thread! I wanted to point out a small caveat. It is often said ML methods *can’t* generalize outside of their training domain. While this is often the case, it *isn’t always true*: this paper shows simple examples where strong generalization happens from sparse addition examples. It is true big ML models struggle to generalize, but people should know that strong generalization isn’t *impossible*, it just might be hard in practice. https://arxiv.org/abs/2201.02177”

Many replies take one of these two viewpoints:

“Ha! Humans win after all!”

“Whatever, brains are just neurons and so is this, therefore human-like AI is still forthcoming.”

Both rather miss the point: “AI systems harbor surprising failure modes.” Yes, humans will patch the problem, and AIs will keep winning at Go.

But those failure modes that •would have been easily detectable by humans• indicate intrinsic gaps in the •structure• of these “deep learning” systems, not just their scale.

The lesson here isn’t about who wins at Go. It’s about the danger of extrapolating from narrow domains (like winning at Go) to open-ended problems (like driving a car) or general human-like intelligence.

Software is surprising.

AI is surprising.

“Intelligence” — whatever that means — is surprising.

We keep realizing we’ve failed to understand what the hard part even is. We are fools if we do not expect that from the outset.

@inthehands I think the difference between cognition and calculation is like the difference between integration and differentiation.

@suzannealdrich "Calculating a derivative is like squeezing a tube of toothpaste, calculating an integral is like putting the toothpaste back into the tube"

@inthehands
Wow, I think this is a close to perfect example for the Chinese Room experiment.
As long as the player does what the AI expects it to do, it’s able to respond to it accordingly. But as soon as it gets strange inputs all of its training is close to worthless because it didn’t understand the basics.

I think this is really big and shows very well where the limitations of AI are.

Here’s a link to the page about the Chinese Room for those who are interested:
en.wikipedia.org/wiki/Chinese_


@Cyb3rVix3n @inthehands I think the Chinese room was trying to make a different philosophical point--that even an artificial system *that could act in a flawlessly human way* would lack subjectivity or intentionality. Which I have doubts about, for reasons that have been argued about for decades as described on that Wikipedia page.

But these systems aren't actually acting completely like humans; they're just getting parts of it uncannily right, and stumbling elsewhere.

@inthehands interesting to note that the tactic the human used to defeat the AI was suggested by a different AI that tried out millions of tactics to find the AI Go champion's weakness..

@Iguanadelmar @inthehands Exactly. I think a lot of the arguments in the thread don't hold specifically because the human followed a recipe from another AI.

@lana @Iguanadelmar I specifically addressed this interesting point in the postludes.

And no, 80% of the thread doesn't hinge at all on whether it is a human or a computer that identified the weakness.

@inthehands #4 is an important point. We are easily dazzled by a song and dance or a cool trick, and there is a long history of wanting to believe in AI. It's not all fraudulent chess machines with a guy under the table, and some of it is really interesting or impressive stuff, but so far it's mostly Michigan J. Frog—impressive performances, but only under certain very controlled circumstances.

@inthehands
> (Can you sum a billion numbers in one second?)
Depends which numbers.

@inthehands @leftpaddotpy “surprising” is doing a lot of work in this sentence. 😂

@steve @leftpaddotpy And it’s good work, imo! Expecting to be surprised is a surprisingly hard lesson for people to come to about software.

@inthehands also, corollary: we shouldn’t be surprised by the oddball failure modes that humans have, either.

@steve Exactly so. There are two difficult layers, both at the heart of engineering: (1) prepare for the unexpected; seek it out relentlessly; mitigate after and mitigate before; build in redundancy; and (2) intuitions about human failure modes create blind spots for us when reasoning about AI; no failure mode is too weird, too stupid, or too obvious to consider.

@inthehands i dunno. Humans stepping on the rake of easily detectable failure modes seems to be the story of this century

@inthehands Fun to speculate what bizarre failure modes our human intelligence might have too. We’re lucky adversarial training would be too slow to be practical.

@inthehands unrelated, I’m totally stealing your use of •bullets• for emphasis. Much more clear than stars or underscores if Markdown isn’t being used.

@inthehands @leftpaddotpy This shouldn't surprise us. Humans also have surprising failure modes. (We're used to them, so we mostly work around them as best we can, but they are absolutely similar failure modes.)

@adredish I would hope that people would have learned not to be surprised when software fails in bizarre-to-humans ways, but as it turns out, that’s a hard lesson to learn!

I reserve my greatest disdain for the engineers who should know better, but keep pumping the GAI hype juice anyway.

@inthehands @mykl I think the "intuition" part is the key, but I'm not convinced ours (though clearly superior at present) isn't just more layers of pattern recognition. Intuition isn't deduction, but isn't just induction, either. It's knowing (or calculating) which probability is most fitting/likely, and choosing where to sort of "jump the fence" and start running inductive tests. I think we do this incredibly well as humans, but not sure this can't be programmed.

@inthehands @Globaltom Funny that. My partner bought one of those devices and we haven't used it in months because 90% of the work is picking up the kids' toys, scraps of paper, couch cushion forts, etc. The actual act of sweeping hardly takes time, is quiet, and mildly cathartic in comparison.

@jedbrown @Globaltom Yeah, I’m convinced there is no working approach to cleaning — not automated, not hired, not systemic, not expert, not anything — for people with kids. We live in the chaos. We breathe the chaos.

@inthehands @jedbrown you learn magic, levitate everything off the floor, then sweep/vacuum underneath. Easy.

@inthehands @psu_13 glad to see AI winter mentioned, it's important context for the latest hype cycle.

AI winter - Wikipedia
en.m.wikipedia.org/wiki/AI_win


@inthehands @Iguanadelmar on the “bamboozling” you highlight, it’s generally helpful to remember that “blind spot” is a term first used for humans!

We have plenty of them. Eg, All those optical illusions you might have seen (of which there are plenty) are “hallucinations” coming out of cognitive or perceptual blind spots.

It may well be a common trait for any form of intelligence with lossy learning.

@inthehands @Iguanadelmar

« The additional insight I take from the fact that it was computer analysis that found the AI’s weakness is that our human metacognition makes it hard for us to find deep learning’s blind spots. Again, we’re too easily bamboozled. »

Adversarial approaches to the rescue?

@SciencesPoulet @Iguanadelmar Adversarial approaches help. The point is that deep learning for critical systems will leave us playing endless whack-a-mole with failures we have trouble anticipating. It’s a tough model on which to build a good engineering practice.

@inthehands my friend and I once defeated a Turing test with a single pun.

@inthehands Re. the weirdness, I’ve thought for some time that our present timeline might look a lot like it does if a childish, malevolent AI was exploring what it might be able to make the built environment do by twiddling various networked knobs and levers.

Then I remember that the childish, malevolent AIs are called Homo sapiens.

@inthehands what! Formalized definitions of intelligence, whatever could or has ever gone wrong!!

@inthehands A lot of the challenge of AI specialists trying to achieve humanlike intelligence is the classic problem of trying to solve the questions of philosophy with engineering. Some of the greatest minds of history have wrestled with what understanding is and how consciousness operates. Many modern engineers respond with "lol philosophy doesn't answer anything" before crashing into the most basic philosophical problems and completely failing to solve or even understand them.

@glenatron @inthehands Engineers that dismiss philosophy as worthless should not be allowed to work on anything related to humans or the environment (i.e. anything). Dangerous people susceptible to the dumbest scams and capable of inhuman horrors without a shred of conscience or second thoughts.

@arclight @inthehands i don't disagree but it is very common. Every couple of years there's a new book from a physicist claiming to have "solved" philosophy and it's always misbegotten from the start because they haven't grasped that the scientific method and logic and the nature of numbers are all philosophy. If you can solve it with physics it's not metaphysics, that's right there in the name! Other fields do the same and as a philosophy grad it is infuriating.

@glenatron @arclight @inthehands
"It's simple" are often the first words said by someone that doesn't understand a subject 😅

@kingannoy @arclight @inthehands Meanwhile people with deep understandings rarely say anything until asked and their answers often begin with "it depends..."

@inthehands AIs were ultimately always just programs that vomit out highly obfuscated derivatives, much like a human might in theory. They’ll always be bad at tasks they weren’t trained to do—they’re tools that merely seem like magic to anyone who doesn’t understand computer programs.
I wouldn’t be surprised if they’re presented in a very smoke-and-mirrors way in precedent-setting legal cases in the interest of gaining unearned prestige. Humans are in control of that future, for better or worse.

@inthehands Great thread!
I wanted to point out a small caveat. It is often said ML methods *can’t* generalize outside of their training domain. While this is often the case, it *isn’t always true*: this paper shows simple examples where strong generalization happens from sparse addition examples.
It is true big ML models struggle to generalize, but people should know that strong generalization isn’t *impossible*, it just might be hard in practice.

arxiv.org/abs/2201.02177

arXiv: “Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets.” The paper studies neural networks on small algorithmically generated datasets and shows they can “grok” a pattern, jumping from chance-level to perfect generalization well past the point of overfitting.

@inthehands However, I have also succeeded several times without much effort to elicit nonsense answers to esoteric questions from ChatGPT. This probably works for all specialist topics that you know a lot about (in my case e. g. auditing) and for which Google doesn't provide obvious answers.

@stefanieschulte Same here. The specialized topics are the ones that show the underlying texture of the model best, I think: sometimes repeating an accurate and focused summary, sometimes throwing recognizable fragments together into nonsense, and sometimes just plagiarizing.

@inthehands I recall the Bing one faceplanted on not recognizing that a movie's release date was in the past, even though it could tell you today's correct date if you asked. And then it doubled down with seemingly angry language when challenged on the point.

@mattmcirvin @inthehands My favourite part is that, once it started acting arrogant and obtuse, it decided that "the date is actually 2022" was the most likely next token. The mistake it made about the movie's release date damaged its ability to output the correct date.

@wizzwizz4 @inthehands It's an interesting problem for this kind of approach--you have certain categories of text in the corpus describing specific events in time, that were appropriate things to say at the time that text was written, but it's now possible to deduce that the text is obsolete. But only if you have an understanding of how time works.