❇ Thoughts and observations from today's Google I/O.
- Everything they introduced was about AI. When they hit the 2-hour mark, I was ready for them to start talking about, say, new features of Android this year… and then the show wrapped up!
- Corollary of the above: it's really cool to see what it looks like to spread AI across an entire established product ecosystem. OpenAI don't have much in the way of consumer product — they have a thin wrapper around AI as a fundamental technology and that's about it. Google is all product. So beyond just chat UIs and generative prompting, you now (or presumably will soon) get stuff like "How has my daughter improved at swimming over the years?" in Google Photos, and "Where am I?" in the Android camera, "circle to search" on any image across Android or in Chrome, "Role-play as my professor" in docs, etc etc etc. This feels like the first time we're seeing what it'd be like to have AI in absolutely every context, across every UI, aware of all your data.
- Building on the above: it was intriguing, and often a bit baffling, to see where and how they're integrating AI in terms of GUI design. Sometimes it's active in a text field, sometimes you need to tap a button, sometimes it's a popup, sometimes it's a sidebar. Broadly: sometimes it's ambient, and sometimes it's explicit; sometimes it slots into existing UIs, sometimes it's a new dedicated place. Sort of like the sparkle emoji being adopted as the de facto "AI is here" icon, I'm very interested in seeing what patterns emerge around the appropriate placement of AI across all GUIs, since it seems like that's bound to happen. Does it end up like spellcheck, where there's a single universal visual signifier and a clear course of action?
- For their AI APIs for devs, they talked about pricing per 1M tokens. They also talked a lot about their 1M token context window (and at least 3 or 4 times, mentioned it'll be 2M before long). For some kinds of query, the pricing was $7 per 1M tokens — in other words, AIUI, $7 per full context window. It really makes me wonder what consumer uses would justify $7 per inference. I bet there's some really interesting possibility there. Like, what would an AI need to do for me to pay $7 per use? Generate an entire video game based on my Steam library? Generate an album based on my entire iTunes / Apple Music library and listening history? If the quality were good, that feels like fair value.
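As a back-of-envelope sketch of that arithmetic (assuming a flat $7 per 1M tokens, the figure quoted above; real pricing typically differs for input vs. output tokens and varies by model):

```python
# Rough cost of a single inference at a flat per-token rate.
# The $7/1M figure is the one quoted above; this is an illustration,
# not actual Gemini API pricing.
PRICE_PER_MILLION_TOKENS = 7.00

def inference_cost(tokens: int) -> float:
    """Dollar cost of an inference consuming `tokens` tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# Filling the full 1M-token context window costs the full rate:
print(inference_cost(1_000_000))  # → 7.0
# A typical short chat prompt is orders of magnitude cheaper:
print(inference_cost(2_000))      # → 0.014
```

Which is the point: the $7 figure only bites when an application routinely fills the whole window, e.g. dumping an entire library or document corpus into a single query.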
- I love Gemini as a name for their AI initiatives (the symbolism, the 'G', the mouthfeel of the word — great name, topping "Copilot", which is also a great name). "Gem" as their version of "bot" is also excellent. But "Gemma" is stretching this too far, and it becomes confusing which stuff is "Gemini" and which stuff is "Gemma".
- Speaking of confusing, the show lacked a clear "table of contents" / structure. For WWDC, there's tentpole sections for i[Pad]OS, Mac, Watch, and then a few grab bags (HomePod, TV), and they clearly signpost the sections with a "tell 'em what you're gonna tell 'em, tell 'em, tell 'em what ya told 'em" pattern. For want of this structure, it felt like folks from various branches of Google repeated each other's announcements (eg: context window going from 1M to 2M, Gems for learning). Conway's law, perhaps?
- Love that it was a live presentation with some live demos. Nice set design, too.
- Finally, that feature where Android warns you in the middle of the call that it's likely a scam? $5 says it never ships. I think it's this year's "Google Assistant on Android will phone a restaurant and make a reservation for you".
Coming from the legal context, I've got a list as long as your arm of stuff you could do with 1M tokens that would be worth $7. Ask any junior associate what part of their job they hate, and they will describe something that fits in the pattern "reading enormous quantities of X looking for Y."
Is Google I/O supposed to be for users or for developers? As a developer, I typically expect platform updates: Android, Chrome, Google Cloud, that kind of stuff. AI agents in Gmail or Google Workspace don't excite me as much.
I'm just an armchair observer, but my sense is that the keynote at Microsoft Build is the most dev-focused, WWDC is basically a 2-hour TV commercial of consumer features, and I/O is in between. But all three of these events draw a lot of attention from the tech enthusiast press, so I think they increasingly use the venue to announce the consumer-facing software feature roadmap for the coming year.
(Apple does their dev-focused announcements in a "Platforms State of the Union" presentation that follows the main WWDC keynote).
Do you know if Microsoft does a similar consumer/developer split with Build?
Unfortunately, no. I've never deployed anything to the Azure cloud or built for Microsoft platforms (eg: .NET).
Like Jason Morris, I can immediately think of professional contexts in which I'd be happy to pay $7 per task (assuming, however, that the output is reliable rather than hallucinated). In my field (scientific research), literature reviews and "smart" database queries are good examples.
For personal use, it's not only the price tag that is an obstacle, but also privacy issues, which I suspect Google did not talk about much.
Google’s broken link to the web
Still, as the first day of I/O wound down, it was hard to escape the feeling that the web as we know it is entering a kind of managed decline. Over the past two and a half decades, Google extended itself into so many different parts of the web that it became synonymous with it. And now that LLMs promise to let users understand all that the web contains in real time, Google at last has what it needs to finish the job: replacing the web, in so many of the ways that matter, with itself.
The web is for bots now. LLMs can do the googling for you.
Are we sure our understanding of progress is directionally correct?