Ivan Reese 2024-03-04 03:11:45

Future of Coding 70 • Beyond Efficiency by @Dave Ackley

Dave Ackley’s paper Beyond Efficiency is three pages long. With just these three pages, he mounts a compelling argument against the conventional way we engineer software. Instead of inflexibly insisting upon correctness, maybe allow a lil slop? Instead of chasing peak performance with cache and clever tricks, maybe measure many times before you cut. So in this episode, we’re putting every CEO in the guillotine… (oh, that stands for “correctness and efficiency only”, don’t put us on a list)… and considering when, where, and how to do the robust thing.

Personal Dynamic Media 2024-03-04 19:41:03

I was unaware of the Knuth response to Naur. Thanks for mentioning it! I found a copy at

tug.org/TUGboat/tb10-4/tb26complete.pdf

Knuth invented a new kind of documentation, one that hardly anyone uses, but that is specifically designed for communicating how a program works to other human beings.

Knuth has also expended great effort in the study of other people's code and programs, including code written in long dead programming languages.

If there is anyone in the world capable of transcending the limits described by Peter Naur, both by transmitting the theory of a program and by recreating it, it would be Donald Knuth. I see no reason to doubt the truth of Knuth's claims, but I also don't see them as contradicting Naur.

Naur does not claim it is impossible to revive a program in practical terms, only that it is difficult, frustrating, and time-consuming, and "may lead to a revived theory that differs from the one originally had by the program authors and so may contain discrepancies with the program text." I believe his point is that you cannot be certain the revived theory is the same as the original theory, however I do not have enough experience with literate programming to judge Knuth's claim that a well-written literate program might have a good chance of being accurately revived.

Calling the stored program computer a "von Neumann model" does a tremendous disservice to J. Presper Eckert who invented and wrote up the idea around 6 months before von Neumann joined the ENIAC project. See the book A History of Computing in the Twentieth Century for a copy of the original memo.

von Neumann wrote a draft report that was widely shared informally (en.m.wikipedia.org/wiki/First_Draft_of_a_Report_on_the_EDVAC), but to the best of my knowledge he never claimed the ideas were his. He was writing up the ENIAC team's plans for the EDVAC.

Y'all may also enjoy von Neumann's paper "PROBABILISTIC LOGICS AND THE SYNTHESIS OF RELIABLE ORGANISMS FROM UNRELIABLE COMPONENTS." static.ias.edu/pitp/archive/2012files/Probabilistic_Logics.pdf

Daniel Buckmaster 2024-03-04 21:10:00

I was glad to hear the new discussion generated by Programming as Theory Building - both the episode and the paper. It is my favourite episode and was very influential on me!

Tom Lieber 2024-03-05 02:38:37

I appreciated the examples of non-distributed systems that benefit from robustness, specifically from being robust to programmer error. That type of error is harder to characterize than the random bit-flipping of cosmic rays because it’s so human, but it’s the type of error I most often have in mind when I think about robustness.

I didn’t have as good a word for it before. “Defensive programming” doesn’t really capture it.

Tom Lieber 2024-03-05 02:48:36

Implementing invariants directly, like Jimmy mentioned. Sort the thing every time if it’s supposed to be sorted, rather than trying to maintain that property indirectly. It’s not just about doing the easiest thing first, or avoiding premature optimization. It’s like: when I mess up code elsewhere, how do I make sure that this part won’t make it worse?
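
A minimal sketch of that idea, assuming a toy list that is supposed to stay sorted. The class names and the simulated mistake are hypothetical, made up only for illustration. One version maintains the property indirectly and trusts every writer; the other re-sorts on every read.

```python
import bisect


class FragileSortedList:
    """Maintains sortedness indirectly: correct only if every writer uses add()."""

    def __init__(self):
        self.items = []

    def add(self, value):
        bisect.insort(self.items, value)  # keeps the list sorted on insert

    def smallest(self):
        return self.items[0]  # trusts that the list is still sorted


class RobustSortedList:
    """Implements the invariant directly: sorts every time before reading."""

    def __init__(self):
        self.items = []

    def add(self, value):
        self.items.append(value)

    def smallest(self):
        self.items.sort()  # re-establish the invariant, whatever happened elsewhere
        return self.items[0]


fragile = FragileSortedList()
robust = RobustSortedList()
for v in (5, 1, 3):
    fragile.add(v)
    robust.add(v)

# Simulated programmer error elsewhere: code that bypasses add().
fragile.items.append(0)
robust.items.append(0)

fragile_answer = fragile.smallest()  # 1, which is wrong: the mistake leaked through
robust_answer = robust.smallest()    # 0, which is right: the read repaired the damage
```

The robust version does more work per read, which is the efficiency being traded away.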

Tom Lieber 2024-03-05 02:48:54

I dunno, good episode.

Alexander Bandukwala 2024-03-06 05:13:12

I still haven’t read the paper, but one aspect of the episode I found interesting was the idea that having simpler software avoids bugs. It seems like this is being conflated with the idea of sacrificing efficiency for robustness, when sometimes the simpler code/algorithm is in fact less robust and the more robust implementation requires more code (and potentially more bugs).

Alexander Bandukwala 2024-03-06 05:18:54

I’d be interested in trying to disentangle the robustness from the simplicity dimensions when making tradeoffs. So finding new ways to structure software to be inherently more robust to bugs seems compelling yet difficult.

Overall the contention between correctness, efficiency, and robustness seems to arise from the viewpoint that correctness is a binary proposition rather than a probabilistic measurement of the values we want our software to achieve. If we have a myopic view of correctness we’re leaving all the tradeoffs off the table.

Ivan Reese 2024-03-06 05:20:51

Something we ought to consider: was Stuxnet robust-first?

Ivan Reese 2024-03-06 05:22:07

And yeah — I'm no friend to binary views of correctness! Glad to be reminded of that.

William Taysom 2024-03-06 08:18:07

Ivan Reese, loved the musical interlude, and the mix on the quotation effect seemed perfectly dialed in.

Scott Antipa 2024-03-08 02:45:40

Reminds me of how analog computers can be more robust because they aren't susceptible to things like accidental, cosmic-ray-style bit flips causing a major change in the value of the computation.

Ivan Reese 2024-03-08 03:07:58

Right. Though they then need to be robust against, say, results being influenced by ambient temperature :)

Daniel Buckmaster 2024-03-08 03:15:04

Deutsch discusses digital versus analogue at length in The Beginning of Infinity; here's a bit from that chapter:

... during lengthy computations, the accumulation of errors due to things like imperfectly constructed components, thermal fluctuations, and random outside influences makes analogue computers wander off the intended computational path. This may sound like a minor or parochial consideration. But it is quite the opposite. Without error-correction all information processing, and hence all knowledge-creation, is necessarily bounded. ... So all universal computers are digital; and all use error-correction with the same basic logic that I have just described, though with many different implementations. Thus Babbage’s computers assigned only ten different meanings to the whole continuum of angles at which a cogwheel might be oriented. Making the representation digital in that way allowed the cogs to carry out error-correction automatically: after each step, any slight drift in the orientation of the wheel away from its ten ideal positions would immediately be corrected back to the nearest one as it clicked into place. Assigning meanings to the whole continuum of angles would nominally have allowed each wheel to carry (infinitely) more information; but, in reality, information that cannot be reliably retrieved is not really being stored.
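
The error-correction step in that passage can be sketched in a few lines. This is only an illustration: the ten positions come from the quote, while the drift of 2 degrees per step, the step count, and the function names are made-up assumptions. The drift is deterministic rather than random so the run is repeatable.

```python
POSITIONS = 10            # ten meanings per cogwheel, as in the quote
STEP = 360.0 / POSITIONS  # 36 degrees between ideal positions


def snap(angle):
    """Error-correction: move the wheel to the nearest ideal position (degrees)."""
    return (round(angle / STEP) % POSITIONS) * STEP


def digit(angle):
    """Read the wheel as one of the ten digits."""
    return round(angle / STEP) % POSITIONS


def run(steps, drift, correct):
    """Advance the wheel one position per step, with a small drift added each time."""
    angle = 0.0
    for _ in range(steps):
        angle = (angle + STEP + drift) % 360.0
        if correct:
            angle = snap(angle)  # the wheel "clicks into place" after each step
    return angle


# 25 steps should leave the wheel on digit 5 (25 mod 10).
analogue = run(steps=25, drift=2.0, correct=False)
digital = run(steps=25, drift=2.0, correct=True)

analogue_digit = digit(analogue)  # 6: the accumulated drift pushed it off course
digital_digit = digit(digital)    # 5: each small error was removed before it could add up
```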

William Taysom 2024-03-08 09:09:15

An analog virtue / limitation is that you cannot have a huge tower of abstraction because noise accumulates: indirection has a direct cost!

Ivan Reese 2024-03-08 14:20:17

Relevant reading folks might enjoy: The dry history of liquid computers

Tony Fader 2024-03-11 01:31:25

Thanks for the great episode. "The Fiverr Vaccine" was super funny. And I loved reading the paper.

I started out robustbrained. I was ready to salute the robustness flag. I started memorizing the robustness national anthem (which is twice as long as it needs to be).

But now it feels like that's missing the point...

I should be saluting the local first flag! I should sing the permacomputing national anthem and get my hair done at the convivial computing salon! These are actual value systems that imagine a different world and say "this would be better". Robustness is a means to an end, just like efficiency.

🕰️ 2024-01-06 19:02:44

...

Stefan Lesser 2024-03-04 15:14:25

As I keep writing my article series On Simplicity… I’d like to further improve it with feedback and have now set up a first online discussion for it.

On Thursday, March 14th we’ll start with discussing the first post in the series. You don’t need to be familiar with the whole series; just reading the first post is recommended but not required. Have a look at the Luma invite for the exact time in your time zone and to sign up (it’s a free event via Zoom).

Would be great to have some of you there!

Luifer De Pombo 2024-03-05 18:31:27

Sharing some recent thoughts I have had about verifying LLM-generated bugfixes automatically with cloud infrastructure: lfdepombo.com/cloudbugfix. Today we validate LLM-generated code by looking at it or manually running it within our codebase. However, if the expected behavior of the code is verifiable, there is a less painful workflow where the mistakes made by the LLM are not visible to us.
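
A rough sketch of that workflow, not taken from the linked post. It assumes the expected behavior can be written down as input/output cases. The three candidate functions are hypothetical stand-ins for LLM-generated fixes, and everything runs locally rather than on cloud infrastructure.

```python
def verify(candidate, cases):
    """Return True only if the candidate gives the expected result for every case."""
    for args, expected in cases:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            return False  # a crash counts as a failed verification
    return True


def first_verified(candidates, cases):
    """Return the first candidate that passes, or None if none do."""
    for candidate in candidates:
        if verify(candidate, cases):
            return candidate
    return None


# Expected behavior of the fix: the mean of a list, with 0.0 for an empty list.
cases = [(([],), 0.0), (([1, 2],), 1.5), (([4],), 4.0)]


def candidate_a(xs):  # still crashes on the empty list
    return sum(xs) / len(xs)


def candidate_b(xs):  # handles the empty list but truncates the result
    return 0 if not xs else sum(xs) // len(xs)


def candidate_c(xs):  # satisfies every case
    return 0.0 if not xs else sum(xs) / len(xs)


chosen = first_verified([candidate_a, candidate_b, candidate_c], cases)
```

Only `chosen` would be shown to a person; the two rejected attempts never surface.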

Doug Thompson 2024-03-06 13:13:49

Alrighty, here's the post I mentioned I'd make in #introduce-yourself:

I want to create a computing ecosystem that solves most of the problems in what I call 'unregulated I/O'.

It is quite possibly mad. Or it might work, and I will be surprised.

It takes design cues from Oberon, FlatBuffers, IPFS, Git, Rust.

It also sounds dangerously close to the kind of "great idea" a compsci undergrad would come out with. Yet, I am running out of reasons why this isn't possible (at the very least). This is why I want your opinions 😅

That's all I'll say here - rest is in the 🧵

Doug Thompson 2024-03-06 13:14:33

So, I've got a problem with this thing I called 'unregulated I/O'. Here's what I mean by this:

  • Unix set the standard of modelling files as byte arrays in the 70s.
  • Likewise, storage I/O, IPC and RPC are mostly done via byte arrays. There are some exceptions - for example:
    • The OS normally abstracts away most packet handling up to the transport layer.
    • Windows has dabbled with wacky ports (COM1 et al.).
  • This means that application programs have the responsibility of validating the binary data loaded via I/O.
  • Improper validation accounts for the vast majority of attack vectors (especially if we include memory management bugs).
  • Most modern applications employ widely-used libraries to minimise the amount of custom validation they have to perform, which is fair, because more eyes are on the libraries.

Nevertheless, SQL injection and buffer overflows still happen in 2024. Exploit accessibility seems (to me) likely to increase with the employment of LLMs.

Doug Thompson 2024-03-06 13:14:48

I'm proposing that the Unix model should be replaced with something more secure by design:

  • An abstract data model should be established for I/O:
  • The OS should abstract away a reasonable amount of validation.
  • Syscalls in applications would be typed. For Rustaceans, I/O calls would yield something like a std::io::Result<T>.
  • The available types should include those that application programmers want to get I/O'd data into ASAP: scalars, arrays, maps, tuples/structs/enums (the latter of which should be Rusty).
  • We would certainly require encapsulation, and (possibly) higher-kinded facilities like mixins.
  • All data in this system should be represented this way, including programs themselves.
  • This means that program source code is "already an AST".
  • Plain text would not be used for source code.
  • UI development for these 'structured languages' must be improved. (maybe I should've said Scratch was an influence? 😏)
  • These ASTs should be transformed by the OS into machine code (which can also be represented with this model: a .s file becomes an array of instruction enums).
  • Eventually, the OS running this should be able to self-host in this way.
  • Applications should barely ever concern themselves with any kind of binary data, though this is of course impossible to prevent in a Turing-complete environment.
  • Data, as stored, should be content-addressable:
  • Joe Armstrong has plenty of reasons why this is a good idea (especially for greenfield).
  • The equivalent of a 'filesystem' for this ecosystem would instead be what is effectively a hashmap with wear leveling.
  • Or, Optane could be revived (some hope). Would be nice to design around this possibility.
  • 'Files', now more accurately 'objects', are stored by a hash of their contents.
  • Important note: this is not object-oriented computing. We don't want to be piping vtables.
  • The need for encapsulation means that our 'filesystem' effectively becomes a Merkle tree.
  • In order to prevent massive hash cascades when writing to storage, we would need to employ mutable references (in a similar manner to symlinks).
  • Fast random-access updates to very large objects could be achieved with a hasher suited to incremental verification, such as BLAKE3.
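
To make the storage part of this concrete, here is a toy sketch of a content-addressed store with Merkle-style nesting and symlink-like mutable references. It is an illustration under loud assumptions: `ObjectStore`, the `ref:` prefix, and the JSON encoding are all invented for the example, and SHA-256 from the standard library stands in for BLAKE3.

```python
import hashlib
import json


class ObjectStore:
    """Toy content-addressed store: objects are keyed by a hash of their contents."""

    def __init__(self):
        self.objects = {}  # hex digest -> serialized bytes
        self.refs = {}     # mutable name -> hex digest (symlink-like)

    def put(self, value, children=()):
        """Store a value with its children and return the digest.

        Each child is either a digest or a "ref:<name>" mutable reference.
        """
        blob = json.dumps(
            {"value": value, "children": list(children)}, sort_keys=True
        ).encode("utf-8")
        digest = hashlib.sha256(blob).hexdigest()
        self.objects[digest] = blob  # identical content lands on the same key
        return digest

    def get(self, digest):
        """Load an object, checking that it still matches its address."""
        blob = self.objects[digest]
        if hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError("stored object does not match its hash")
        return json.loads(blob)

    def resolve(self, child):
        """Turn a child entry into a digest, following a mutable reference if needed."""
        if child.startswith("ref:"):
            return self.refs[child[len("ref:"):]]
        return child


store = ObjectStore()

# Deduplication: the same content is stored once.
first = store.put("hello")
second = store.put("hello")

# Merkle behaviour: a parent embeds its child's hash, so a new child means a new parent.
leaf_v1 = store.put("draft 1")
leaf_v2 = store.put("draft 2")
pinned_v1 = store.put("folder", children=[leaf_v1])
pinned_v2 = store.put("folder", children=[leaf_v2])

# Mutable reference: the parent names a ref, so updating the ref leaves the parent's hash alone.
store.refs["notes"] = leaf_v1
linked_before = store.put("folder", children=["ref:notes"])
store.refs["notes"] = leaf_v2
linked_after = store.put("folder", children=["ref:notes"])
```

In this sketch `pinned_v1` and `pinned_v2` differ (the hash cascade), while `linked_before` and `linked_after` are the same object, at the cost that the parent's hash no longer pins the exact child contents.
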

Doug Thompson 2024-03-06 13:15:14

Here are some fun implications of such a design:

  • New programming languages would be required.
  • Deduplication of data becomes trivial.
  • On the subject, we'd need to be mindful of how granular we are with the storage of heavily-encapsulated objects.
  • Denormalisation should probably happen when, e.g., an object's raw data is smaller than a hash digest (at the very least).
  • Transfer of large objects over a network can be heavily optimised. Downloads effectively become a git pull.
  • Core web technologies such as HTTP, HTML, CSS & JavaScript are no longer kosher, because they are based on plaintext.
  • "This obsoletes the web" is a silly thing to say, but could be fun in the pitch.
  • All of these formats could be transformed into the strongly-typed model presented above, though.
  • Tabs vs spaces is no longer a concern, because formatting no longer lives in the source text. That's now the UI's responsibility.
  • Entire classes of attack should be all but eliminated (eg. injection).
  • The types used in the data model can themselves be represented in the data model, and we can relatively easily implement internationalisation for code:
  • Here's a horrible illustration: Enum Type { i8, i16, i32, i64, u8, u16, u32, u64, Array<Type>, Map<Type, Type>, Tuple<Type, ...>, Enum<Array<Type>> }
  • These types don't have canonical names, and I don't think they should.
  • They do have hashes, though. So we can refer to types by their hash.
  • We can then map human translations for these types and their encapsulated members in any number of natural languages: Map Translations<Tuple<[Locale, Type]>, Array<String>>
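
A small sketch of the last few points, assuming types are written as nested lists and hashed over a canonical JSON encoding. The encoding, the `type_hash` and `display_name` helpers, and the example names are all hypothetical; SHA-256 is used only because it ships with Python.

```python
import hashlib
import json


def type_hash(structure):
    """Hash a nameless, structural type description."""
    canonical = json.dumps(structure, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Structural descriptions only: no names appear in the types themselves.
point = ["Tuple", ["i32", "i32"]]
same_shape = ["Tuple", ["i32", "i32"]]
byte_list = ["Array", "u8"]

# (locale, type hash) -> [type name, member names...]
translations = {
    ("en", type_hash(point)): ["Point", "x", "y"],
    ("es", type_hash(point)): ["Punto", "x", "y"],
    ("en", type_hash(byte_list)): ["Bytes"],
}


def display_name(locale, structure):
    """Look up the human-readable name of a type in the given locale."""
    return translations[(locale, type_hash(structure))][0]


english = display_name("en", point)
spanish = display_name("es", same_shape)
```

One consequence visible in the sketch: `point` and `same_shape` share a hash, so any two structurally identical types would also share their translations unless something else tells them apart.
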

Doug Thompson 2024-03-06 13:15:39

I appreciate this is a lot, so if you've taken the time to read this, thank you ❤

Please fire away with your questions and comments. I've got some visual explainers for this stuff lying around which I'll probably add too.

Doug Thompson 2024-03-06 13:19:26

...oh, and: much as I've searched, I can't find a project that's attempting to create an entire ecosystem out of these principles (even if it's just using VMs rather than an entire OS). If you know of any project doing this, please let me know, because I suspect they're probably doing a better job.

Konrad Hinsen 2024-03-06 17:40:06

I see the main problem of your project in the wish to design a complex system from scratch. Such projects have basically always failed, running out of steam before accomplishing anything useful.

One of the insights from John Gall's "Systems Bible" (highly recommended!) is (chapter 11): "A complex system that works is invariably found to have evolved from a simple system that worked" with the corollary that "A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system."

That's in fact how today's computing systems evolved over a few decades. The result is a bit of a mess, but it works. And it's so big by now that it cannot be replaced, only evolved.

Daniel Garcia 2024-03-10 02:41:58

Are you familiar with unison lang? As you mentioned, they aren’t attempting to create an entire ecosystem, but I think it has a lot of overlap with your ideas.

Konrad Hinsen 2024-03-10 07:56:15

Unison and IPFS are indeed the two main existing projects that have the most overlap. Neither tries a from-scratch approach. But unfortunately, the two don't really coexist well either, each having its own content-addressing scheme.

Another language in that space is scrapscript.

Doug Thompson 2024-03-11 09:46:00

Thanks Konrad Hinsen Daniel Garcia - those are exactly what I'm looking for 🙏

I actually don't want to have to "make something big", because yeah, I've also seen countless examples of things of this scale failing (or worse, leaving a stain on its surroundings... (cough) Windows Registry). I don't want to have to make an OS, but having the entire software ecosystem playing to the same conceptual tune is going to make things all the more sound - if that makes sense.

Making a VM of it, in the same way as Unison or Scrapscript are doing (if I'm understanding them correctly), is where I'd want to start too. So I think I'm going to reach out to the authors of both and ask what they think about scaling them up.

Michael Jelly 2024-03-06 16:50:34

If you’ve wondered:

  • why the only copilot we have is for VSCode
  • why not every app is end-user programmable

I wondered the same thing, and I’ve built omnipilot.ai, an AI copilot that works everywhere on macOS.

Specifically, it lets you invoke GPT to type into any app (particularly interesting to me: it works great in Xcode), can autocomplete text in any app, and lets you chat with GPT-4 with context from your recent apps. I’d really appreciate any feedback or first impressions!

Re end-user programmability I’m also working on making it more possible for people to make little “automations” on their computer, whether it’s adding buttons to Finder to convert files or recording little AI-enhanced macros.

Some specific questions I’d love feedback on:

  • How often do you find yourself wanting help editing code outside of a GitHub Copilot-enabled environment? What are those situations or apps?
  • What about text, do you wish you had a copilot for text too?
  • What do you think of the “works everywhere” approach vs. a dedicated app?
  • Do the AI-macros sound appealing or meh?
  • Any thoughts on the landing page copy/design?

I’m also happy to answer any other questions. Thanks in advance for sharing your thoughts, it’s super helpful in shaping the product!

Daniel Sosebee 2024-03-06 17:09:58

Does it work in the terminal? That’s a big place where I want autocomplete and don’t currently have it.

edit: I have now checked out the landing page and see that it does 🙂

Daniel Sosebee 2024-03-06 17:12:26

Super cool. How do you build context? Is it always based on the current text buffer?

Daniel Sosebee 2024-03-06 17:18:25

It would be super cool if the clipboard contents were always appended to context. That way you could google/chatgpt a question, highlight and copy the answer, and then focus in your code editor etc. and get more relevant completions. In general I’m interested in UI that allows people to manage context more explicitly

Michael Jelly 2024-03-06 17:29:47

Yes! Builds context based on what’s in the current app

Michael Jelly 2024-03-06 17:30:01

when you select text and press Option Space

Michael Jelly 2024-03-06 17:30:08

then that text goes into the context

Daniel Sosebee 2024-03-06 17:36:49

Awesome, excited to try it out!

Michael Jelly 2024-03-06 17:51:18

also! you can also @tag windows that you’ve had open recently to add them to the context in the option space chat

Michael Jelly 2024-03-06 17:51:38

and any recently selected text too

Achille Lacoin 2024-03-07 10:19:01

For the "copilot in the shell" use case, there is butterfi.sh

📝 Butterfish - A shell with AI superpowers

Add easy, context aware AI prompting with OpenAI to bash and zsh shells