Kartik Agaram 2025-03-20 03:46:02

Malleable UIs using AI, from Haijun Xia's lab at UC San Diego

Yining Cao's Jelly work referenced here doesn't seem to be on the lab's site, but this seems to be the paper.

Scott 2025-03-20 15:33:37

Thanks for sharing this! Very good even just 20 minutes in, going to follow up with more thoughts after I can finish it

Kartik Agaram 2025-03-20 15:36:55

It seems like a pretty big negative of AI that crawlers run by AI companies are not respecting robots.txt, and so are hostile to the open web.

I've been hearing about this for years, and I've never understood it. Reading robots.txt is mature technology. I'm curious if anyone here has perspective on the technical/political aspects. (Goes without saying that it's not a good look.)
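To make "mature technology" concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt contents, the `ExampleAIBot` user agent, and the example.com URLs are all made up for illustration; a real crawler would fetch the file from the site it is about to crawl.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption for illustration only).
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The hypothetical AI crawler is shut out of the whole site.
blocked = may_fetch(parser, "ExampleAIBot", "https://example.com/post")

# Any other crawler falls through to the wildcard rules.
allowed = may_fetch(parser, "OtherBot", "https://example.com/post")
private = may_fetch(parser, "OtherBot", "https://example.com/private/x")
```

Checking before each request is one function call, which is roughly why the cost of complying looks negligible next to everything else a crawler does.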

Ivan Reese 2025-03-20 16:04:14

Frankly I'm surprised any big company adheres to robots.txt. It's a social norm, and most big companies are sort of automatically parasocial / norm-ignorant by virtue of them not being human.

Kartik Agaram 2025-03-20 17:36:10

Google has historically been excellent about respecting robots.txt, and this doesn't seem to be something that has enshittified recently. I haven't heard any anecdotes about traditional search engine traffic misbehaving, for all the poor use they put the results of the crawling to.

In fact the pattern may be that search engines have stopped crawling real websites. It's only the damn rude AIs that remember where the good stuff is anymore :lolsob:

Kartik Agaram 2025-03-20 17:55:24

But yeah, it might be one of those places where people have been historically respecting norms, and all of a sudden everyone notices there are no consequences for not doing so. Otherwise known as the unraveling of society.

Kartik Agaram 2025-03-20 21:24:35

Social norms didn't ever exist in a vacuum. Big companies followed norms by weighing the cost of norms against the PR penalties of not following them. A soft enforcement mechanism is still an enforcement mechanism. Which is why this question is interesting to me. The cost of complying hasn't increased, and it's unclear why the PR penalties are perceived to be low. Every time someone hits a captcha people are going to blame OpenAI or Claude for it.

Kartik Agaram 2025-03-20 21:25:21

I'm trying to understand the unraveling of society from the light it shines on Plato's cave.

Christopher Shank 2025-03-21 01:42:50

(Little rant cause this has been on my mind recently. Not sure if it's entirely on topic :P)

It really speaks to how the open web is more of a social experiment than a technological one. Don't get me wrong, Tim Berners-Lee did a great job encoding the ethos of the open web into its core technologies. For example, the idea of permissionless linking was really counter to other hypertext systems at the time, which were closed systems with bidirectional (permissionful) links. But I'm not sure that really guarantees the web staying open and its social boundaries being respected (e.g. robots.txt). The open web lacks forms of (global?) governance that can prevent private corporations from externalizing their costs (e.g. training LLMs) onto the rest of the web. It felt like early web culture was pervasive enough to balance the tides of commercializing the internet (good book on the privatization of the internet). But it feels like we've largely lost that culture as the web grew, and we've slowly seen the commodification of core interactions on the open web. Like in this lovely talk, Anil discusses how Google fundamentally changed the meaning of links from an act of expression to something of economic value, and the same thing happened with likes/reposts on social media. The advent of LLMs has commodified a new thing: the content scattered through the web. Feels like the open web is being threatened from above with LLM-generated content (the dark forest) and from below with LLM scraping (copyright infringement). Not really sure what to do about that... 😢

Konrad Hinsen 2025-03-21 07:11:59

I see this as part of a more general shift in business, from "rules as formalized consensual moral code that we need to respect in order to be respected" to "rules defining a game that really strong players cheat their way around". It's almost normal nowadays for big corporations to consider fines as no more than a cost that needs to be justified by increased revenue, which it often is. The cost of not respecting robots.txt is close to zero.

Ivan Reese 2025-03-21 21:28:01

Relevant section of a tangential but interesting article.

📝 Rewarding ideas

Do we need new incentives for creating information?
