Ivan Reese 2025-03-24 15:33:23 I did some vibe coding with Cursor and it got stuck in a loop of writing a buggy shell script, running it, looking at the output (unchanged because bugs), going "hmm let's fix that", then writing the exact same shell script.
Ivan Reese 2025-03-24 15:37:29 It was 100x more productive at doing this than I would have been.
Konrad Hinsen 2025-03-24 16:14:07 This looks like AI is finally approaching human intelligence levels!
Duncan Cragg 2025-03-24 17:09:16 ChatGPT does this over and over. I point out a mistake, it apologises, promises to correct, then spits out the same mistake.
Andrew Beyer 2025-03-24 18:25:15 I've also seen several cases across various models where they get stuck in a tradeoff: they fix a problem by creating another one... and then, when prompted, fix that problem by putting the first one back.
Kartik Agaram 2025-03-24 18:34:34 @Andrew Beyer Yes, and I used to do this as well for a long time. If you have n bugs to avoid in a system (Christopher Alexander calls them misfits) but your design process can only handle n-2, ping-ponging can result, often over a period of months: a bug gets reported in production, gets fixed, the other bug gets created, gets reported in production...
A lot of the value of tests for me is shortcutting this sort of ping-ponging between bugs. But if you tell AI to write the tests, and give AI carte blanche to modify tests at any time... 🤷‍♂️
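A minimal sketch of what pinning both misfits looks like (a hypothetical example; the function and names are made up for illustration). With one frozen test per misfit, a "fix" for one bug cannot silently reintroduce the other:

```python
# Hypothetical example: a function with two competing requirements (misfits).
# Keeping a pinned test for EACH misfit blocks the ping-pong between them.

def format_name(first, last):
    # Misfit 1: must handle a missing last name without crashing.
    # Misfit 2: must not emit a trailing space when last is empty.
    if not last:
        return first
    return f"{first} {last}"

# One frozen test per misfit; a regression on either side fails immediately.
assert format_name("Ada", "Lovelace") == "Ada Lovelace"
assert format_name("Ada", "") == "Ada"           # misfit 1: no crash
assert not format_name("Ada", "").endswith(" ")  # misfit 2: no trailing space
```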
Andrew Beyer 2025-03-24 18:38:26 I found it an even bigger issue when it was more "soft"/nonfunctional issues that maybe couldn't be easily verified with automated testing.
Duncan Cragg 2025-03-24 20:44:05 Seems like we need two AIs, each watching the other. Maybe one can write the tests and the other the code.
Konrad Hinsen 2025-03-25 06:37:16 Indeed, and I am somewhat surprised that this isn't done yet, given how important the idea of adversarial training has been in the short history of deep learning.
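The setup Duncan describes could be sketched roughly like this (hypothetical, with trivial stand-ins for the two models): one "AI" produces a frozen test suite, the other proposes code against it, and neither may touch the other's output.

```python
# Hypothetical sketch of the two-AI idea: a tester model writes the tests,
# a coder model writes the code, and the tests are frozen between rounds.

def adversarial_loop(write_tests, write_code, rounds=5):
    tests = write_tests()               # tester AI: test suite is fixed
    for _ in range(rounds):
        candidate = write_code(tests)   # coder AI: sees the tests, cannot edit them
        if all(test(candidate) for test in tests):
            return candidate            # accepted by the independent tests
    return None                         # gave up: possibly ping-ponging

# Trivial stand-ins for the two models, just to make the sketch runnable:
def write_tests():
    return [lambda f: f(2) == 4, lambda f: f(0) == 0]

def write_code(tests):
    return lambda x: x * 2

result = adversarial_loop(write_tests, write_code)
assert result is not None and result(3) == 6
```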
Mariano Guerra 2025-03-25 10:51:41 my new system prompt: you are at a Skrillex show waiting for the drop...
Konrad Hinsen 2025-03-26 07:21:50 @Ivan Reese According to my understanding, no. They are trained on reasoning stories written by humans. That's a form of supervised learning, whereas adversarial training is unsupervised: two AI models confronting each other, or even a single model switching sides, as was done with AlphaGo.
General reasoning models need supervision because there is no obvious arbiter for deciding whether a piece of reasoning is correct. For code, "compiles, runs, passes tests" provides three consecutive automatable arbiters.
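For instance (a hypothetical sketch, using Python's built-in `compile` and `exec` as stand-ins for a real toolchain), the three arbiters can be chained so a candidate program only counts as accepted when all three pass:

```python
# Hypothetical sketch: three consecutive automatable arbiters for
# AI-generated code -- compiles, runs, passes tests.

def judge(source: str) -> str:
    # Arbiter 1: does it compile?
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return "rejected: does not compile"
    # Arbiter 2: does it run?
    namespace = {}
    try:
        exec(code, namespace)
    except Exception:
        return "rejected: crashes at runtime"
    # Arbiter 3: does it pass the tests?
    add = namespace.get("add")
    if not callable(add) or add(2, 3) != 5:
        return "rejected: fails tests"
    return "accepted"

print(judge("def add(a, b): return a * b"))  # rejected: fails tests
print(judge("def add(a, b): return a + b"))  # accepted
```

Each arbiter only needs to run when the previous one passes, which is what makes the chain cheap to automate.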
Andrew Beyer 2025-03-24 18:28:18 I feel like I saw a really good literature review "state of the world" wrt visual/graphical programming a while back (probably here, though could have been elsewhere) but apparently didn't save the link and can't find it again...
So, anyone have any favorites or good pointers for something like that?
Andrew Beyer 2025-03-24 18:35:47 ooh, yeah, that wasn't the one I was thinking of, which was more of a paper-style presentation... but that one's actually maybe even closer to what I need
Ivan Reese 2025-03-24 20:32:27 If you find the one you were thinking of, open an issue so I can add it to the codex!
Duncan Cragg 2025-03-24 20:45:55 I think I know the one you mean, maybe. It was a massively heavy page packed with screenshots? [update: ah, no, not a "paper" style at all!]
Tom Larkworthy 2025-03-28 05:13:16
Tudor Girba 2025-03-28 14:25:03 I am glad it resonates. Please feel free to use that term then 🙂
Did/does the project succeed?
Tom Larkworthy 2025-03-28 14:29:23 that was not my work.
my work is this, which does use the term moldable but is also a WIP.