Totally agree with @PavelASamsonov. UX design research isn't about producing/writing a "persona" document as output. It's about designing, setting up & running an experiment to prove/disprove a hypothesis about human behavior.
This would be like a chemist not bothering with the laboratory experiment, or Pharma not bothering with clinical trials & letting an LLM come up with the words instead, because it's cheaper. Unethical as hell!
> "No, AI user research is not “better than nothing” — it’s much worse"
- https://uxdesign.cc/no-ai-user-research-is-not-better-than-nothing-its-much-worse-5add678ab9e7
How could you ... discover anything new? Learn anything?
The whole point of research is when it surprises you: when the user keeps doing something you didn't expect and you don't know why.
How could AI ever, ever, ever produce this rarest but most valuable kind of data?
All it can do is make results that ... look like other results, that say what you expect them to say. How do people keep missing the point of what LLMs can and CAN NOT do?
@futurebird @dahukanna @PavelASamsonov why? because that’s not exactly what they’re doing. as you scale up model size, new capabilities emerge, things they weren’t trained to do. “emergent behavior” isn’t a theory, it’s an observation. the open question is, what other sorts of capabilities will emerge when we scale further up. will they acquire an element of surprise? idk, i’d say no but i’ve also been wrong so far about what their limits should be
@kellogh @dahukanna @PavelASamsonov
Even if you did get something that most people would agree had to be called "new" (very subjective), it's not going to tell you anything about how people use software, because the data didn't come from people.
@futurebird @dahukanna @PavelASamsonov hmmm…
@kellogh @futurebird @dahukanna @PavelASamsonov it's true that LLMs can generate novelty in a recombinatory or juxtapositional sense; after all, that's precisely what the "hallucinations" aka bullshit results are. They're novel constructions; they just don't relate to reality, and they are not true. There are many possible statements about any given real-world situation, but far fewer true ones, and the LLM has no ability to distinguish truth. We see this in the chemical modeling...
@kellogh @futurebird @dahukanna @PavelASamsonov where the model generates very large numbers of new candidate chemicals, drugs or proteins or whatnot. But then experts review the results and say most of them are implausible or useless.
You can generate novelty through randomness. Novelty itself isn't valuable: most possible statements about the world have never been uttered precisely because they're false. The problem here is that the bullshit sounds "truthy", as Colbert coined it.
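(A minimal, purely illustrative sketch of the point above, not from the thread: recombining tokens at random produces statements that are "novel" in the sense that they may never have been written before, yet almost none of them are true. The word lists here are invented for illustration.)

```python
import random

# Illustrative only: random recombination yields "novel" sentences,
# but novelty alone carries no truth value.
subjects = ["aspirin", "caffeine", "vitamin C", "melatonin"]
verbs = ["cures", "causes", "prevents", "worsens"]
objects = ["insomnia", "scurvy", "hair loss", "migraines"]

for _ in range(5):
    claim = f"{random.choice(subjects)} {random.choice(verbs)} {random.choice(objects)}"
    print(claim)  # most of these claims will be false, however novel
```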
@mrcompletely @futurebird @dahukanna @PavelASamsonov yep, agreed. what LLMs do today is just "system 1" with a little faked "system 2", if that makes sense. but it's hard to say that those other aspects won't spontaneously emerge with scale. then again, are there easier ways to develop those systems? like, maybe symbolic reasoning will emerge, but why not just wire in our existing systems that already do it?
@kellogh @futurebird @dahukanna @PavelASamsonov fundamentally the issue to me is that these are not cognitive systems but they are being treated as if they are. They're linguistic pattern matching systems. That's not what minds are. The methods an LLM uses to arrive at output have no parallels in modern cognitive science. So why would thought-like states emerge? It's like throwing soup ingredients in a blender and expecting a working car to pop out if you just keep adding carrots.
@kellogh @mrcompletely @futurebird @PavelASamsonov
I would argue Kahneman's "System 1" is grounded in empirical observation & analysis of embodied, generational psychosomatic feedback. Any individual whose "system 1" didn't stand up to the "test of reality" was eventually extinguished & didn't live long enough to procreate.
LLMs have no "reality" feedback+pruning mechanism, so I would not compare them to the "system 1 & system 2" (map/model) humans use when assessing risk in decision-making.
@dahukanna @mrcompletely @futurebird @PavelASamsonov i've found it a useful way of thinking about what's going on. i don't mean to assert that it's actually how it works, more that if all a person had was a very advanced system 1, it would look a lot like an LLM. here's a paper that uses the same analogy: https://arxiv.org/abs/2212.05206
@kellogh @mrcompletely @futurebird @PavelASamsonov
Thanks for sharing and will take a look.