You are viewing archived messages.
Go here to search the history.

Nilesh Trivedi 2025-06-16 04:54:41

AI is a new kind of computer.

  • A traditional computer processes structured data with deterministic instructions.
  • AI processes unstructured data with natural-language nondeterministic instructions.

I like the simplicity of this framing.

But personally, I am more interested in unifying both these kind of computational work: Mathematical (precise & deterministic data structures and instructions) and human-media centric (language, image/audio/video etc) which approximate/ambiguous.

jeffhuber.substack.com/p/ai-is-a-new-computer

Arvind Thyagarajan 2025-06-16 15:41:56

"universal unstructured information processor that can simulate any intuitive procedure (reasoning) given sufficient resources and the proper context"

the "reasoning" part is called into question somewhat here: ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf

"despite sophisticated self-reflection mechanisms, these models fail to develop generalizable reasoning capabilities beyond certain complexity thresholds. We identified three distinct reasoning regimes: standard LLMs outperform LRMs at low complexity, LRMs excel at moderate complexity, and both collapse at high complexity."

at one's most cynical one might be forgiven for walking away from this short paper with the conclusion that the current state of large models is a super-expensive super-sophisticated autocomplete? too cynical??

Nilesh Trivedi 2025-06-16 15:50:33

I do believe the Apple paper is way too cynical.

Look up the benchmark score on GAIA leaderboard, then take a look at the sample tasks in the dataset. I bet you will find it ridiculous to call it a "sophisticated autocomplete" then.

Arvind Thyagarajan 2025-06-16 16:00:12

oh no, they've co-opted gaia!! (en.wikipedia.org/wiki/Gaia)

Nilesh Trivedi 2025-06-18 10:41:28

Gemini Flash Lite generating UI on-the-fly:

x.com/OriolVinyalsML/status/1935005985070084197

🐦 Oriol Vinyals (@OriolVinyalsML) on X: Hello Gemini 2.5 Flash-Lite! So fast, it codes each screen on the fly (Neural OS concept 👇).

The frontier isn't always about large models and beating benchmarks. In this case, a super fast & good model can unlock drastic use cases.

Read more: https://t.co/kbkC8CtVYb

Tweet Thumbnail

Jari 2025-06-18 16:01:53

This was cool, thx for sharing

Scott 2025-06-18 16:35:02

Have any of you here spent any time with MCPs at all?

I've just started building an MCP client into this app i'm working on, and it hit me that this could be what enables a lot more of end user modification of programs and a version of Malleable software...though not completely malleable. You don't have to interact with them through a conversational or agentic interface, you can just treat them like RPCs, and if you set up standardized integration points into your application, users can build all types of customizations for at the very least the objects or metaphors within your system

Mariano Guerra 2025-06-18 16:52:39

I do something like that in gloodata.com

not strictly MCPs (since they didn't existed when I started) but the idea is the same

📝 Gloodata

AI-Native Data Apps without the Hassle

Scott 2025-06-18 17:06:33

Oh very cool! Yeah this looks a lot like what I have in mind...are you defining the available visualizations inside Gloodata or do the extensions provide that too?

Tom Larkworthy 2025-06-18 18:22:46

I am massively into agents calling tools in a loop now. I replicated that blog and they are dead on about how crazy it simplifies AI integration. Its the tools that are important, and the agentic loop, and how that isolates the vague AI bit from the engineered determinist tool design part. I did a talk yesterday on it (slides)

MCP is a tool discovery service I don't personally see it as that important unless you need your product to somehow integrate with random data sources (might be true for tool-for-thought but in general its not the main character for me).

📝 The Unreasonable Effectiveness of an LLM Agent Loop with Tool Use

How a simple loop enables powerful AI assistants

Mariano Guerra 2025-06-18 19:19:25

@Scott the visualizations are available in gloodata, the extensions returns data describing what they want to show

Scott 2025-06-18 20:14:55

Tom Larkworthy yeah I'm looking at MCP a little differently and not from an agent/ai calling perspective, but more that its this pattern that enables you to extend your application in other ways that don't rely on LLMs in the loop (though they can be). Similar to what Mariano shared.

I'm working on this Product/Task management app, and just realized we can use the MCP pattern to allow people to add new functionality without much code on our part (add the Github or Asana MCP server and integrate with their services) or allow users to create new functionality in a plugin-like way using the same protocol...

Scott 2025-06-18 20:17:12

Mariano Guerra super interesting! I'm curious if you've seen people build things in that way before? It seems like this was all basically possible pre-LLM but now I'm wondering why it wasn't commonplace...I guess it does touch a little on what Geoffrey and Ink and Switch were talking about in inkandswitch.com/essay/malleable-software...

📝 Malleable software: Restoring user agency in a world of locked-down apps

The original promise of personal computing was a new kind of clay. Instead, we got appliances: built far away, sealed, unchangeable. In this essay, we envision malleable software: tools that users can reshape with minimal friction to suit their unique needs.

Tom Larkworthy 2025-06-18 21:18:21

but the response of a tool is a single unstructured string. I guess thats useful for some human in the loop things or text based workflows but it doesn't seem super general.

Scott 2025-06-18 21:18:53

thats the thing, it doesn't have to be unstructured

Tom Larkworthy 2025-06-18 21:35:14

ok string + mimeType. I think you can embed your own mini DSL in your responses but nothing else can leverage it. There is no response API schema description that I know of, is there? The tool arguments have a OpenAPI schema definition but I don't think there is a corresponding one to the tool or resource output (maybe there is ?)

Scott 2025-06-18 21:41:07

Yeah, there isn't at the moment and I have to imagine it will be coming soon.

But there are a few other options beyond your own mini dsl (which would still count as context for any service that is actually using it, you'd just assign some additional meaning to it and could turn it into a tool call internally).

You could make your MCP servers both server and client to your main app (which itself is also an MCP server and client) making tool calls in both directions

Or if you want to bring another llm into the loop, filter the MCP server's response through a tool calling llm to convert it into the DSL/tool call format for your main app 😉

Scott 2025-06-18 21:42:40

On top of that other 3rd party servers you may not even care about the tool output, it might just give a really nice implementation to that external service - I don't need to write a connection to github and implement all the different types of functionality for modifying git repos if my users can install and configure the github MCP server. Then its just making tool calls for making PRs or commenting on issues, etc and throwing the responses away

Tom Larkworthy 2025-06-18 21:50:05

yeah I guess tools are a bit like CLI commands, you have certain arguments to invoke them and then they do some side effects and then vomit out some stdout which is human readable and thats quite useful, even if the response is just "ok"

Scott 2025-06-18 23:34:01

Wow, looks like structured tool outputs were released as we were talking about them: x.com/chu_onthis/status/1935433647206830428?s=46&t=sAqv4O8SV8AEgeXTXGaORg

🐦 Theodora Chu (@chu_onthis) on X: new mcp spec just dropped: 1. auth is fixed! at last! 2. elicitation now makes it possible for a server to ask an end-user for more info, enabling more agentic behaviors 3. structured tool outputs makes it easier to reason about tool responses 4. more security documentation

Tom Larkworthy 2025-06-19 04:39:41

I just back to post the same thing! Lol. Yeah pretty cool