Future of Coding History

Tomas Petricek 2024-10-01 23:59:57

This is a very random thought - but something that I've been occasionally wondering for some time now. If we have lambda calculus as a model of functional languages and Turing machines as a model of imperative languages, what would be a good model for programming systems that have "document" as the basic underlying structure (Subtext is an example of this) - i.e., you have some tree structure and the program evaluates by modifying this document - appending new things or rewriting evaluated bits. (Lambda calculus itself is basically a tree, but what if this also allows imperative document edits?)

Could this be something like a "Turing machine" that works on trees rather than tapes? There would be a "current location" which can move in various ways around the tree and modify it. If your document has references (perhaps you can have ../../foo to refer to foo of a parent of a parent), the machine would have to somehow walk up the tree, remembering that it wants to copy a value back to the original location, and then walk back over the tree to put the value in place of the reference.

Is this something completely silly or something that lots of people have already done but under different names?
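A minimal sketch of the "machine with a current location" idea, to make the walk-up-and-back step concrete. Everything here (`Node`, `Ref`, `resolve_reference`, the path syntax) is a hypothetical illustration, not an existing system: the head moves one tree edge at a time, records a trail so it can retrace its steps, and then overwrites the reference with the value it carried back.

```python
from dataclasses import dataclass


@dataclass
class Ref:
    """A reference stored in a node, e.g. Ref("../../foo") (assumed syntax)."""
    path: str


class Node:
    """A document node with an optional value and named children."""

    def __init__(self, value=None, **children):
        self.value = value
        self.parent = None
        self.name = None
        self.children = {}
        for name, child in children.items():
            child.parent, child.name = self, name
            self.children[name] = child


def resolve_reference(head):
    """Walk from `head` to the referenced node, then walk back and write.

    The head only ever moves one edge at a time, like a tape head.
    """
    trail = []  # how to undo each move, in order
    node = head
    for step in head.value.path.split("/"):
        if step == "..":
            trail.append(node.name)  # to undo "up", descend into this child
            node = node.parent
        else:
            trail.append("..")       # to undo "down", go up
            node = node.children[step]
    carried = node.value             # the value the machine remembers
    for step in reversed(trail):     # retrace the walk in reverse
        node = node.parent if step == ".." else node.children[step]
    node.value = carried             # replace the reference with the value
    return node


# Document: { foo: 10, a: { b: -> ../../foo } }
doc = Node(foo=Node(10), a=Node(b=Node(Ref("../../foo"))))
start = doc.children["a"].children["b"]
end = resolve_reference(start)
```

After the call, the head is back where it started and the node that held the reference holds the copied value.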

Jimmy Miller 2024-10-02 03:30:27

I might be missing some nuance here, but what you are describing sounds to me pretty similar to term or graph rewriting systems: en.m.wikipedia.org/wiki/Rewriting#Term_rewriting_systems They are a Turing-complete formalism, and there has been plenty of work using them for transforming documents, but also for general evaluation. They work exactly as you described: you have a tree structure and you modify it by rewriting things.
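For readers who have not met term rewriting, here is a small sketch in Python. The representation is an assumption made for illustration: terms are nested tuples, pattern variables are strings starting with `?`, and the two rules implement Peano addition. It is not taken from any particular rewriting system.

```python
def match(pattern, term, env):
    """Return an extended binding environment, or None if there is no match."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None


def substitute(template, env):
    """Fill the variables of a rule's right-hand side."""
    if isinstance(template, str) and template.startswith("?"):
        return env[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, env) for t in template)
    return template


def rewrite_once(term, rules):
    """Rewrite the outermost, leftmost matching subterm; None if none matches."""
    for lhs, rhs in rules:
        env = match(lhs, term, {})
        if env is not None:
            return substitute(rhs, env)
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            new = rewrite_once(sub, rules)
            if new is not None:
                return term[:i] + (new,) + term[i + 1:]
    return None


def normalize(term, rules, max_steps=1000):
    """Apply rules until no rule matches anywhere in the tree."""
    for _ in range(max_steps):
        new = rewrite_once(term, rules)
        if new is None:
            return term
        term = new
    raise RuntimeError("no normal form within max_steps")


# add(z, n) -> n ; add(s(m), n) -> s(add(m, n))
RULES = [
    (("add", "z", "?n"), "?n"),
    (("add", ("s", "?m"), "?n"), ("s", ("add", "?m", "?n"))),
]

two_plus_one = ("add", ("s", ("s", "z")), ("s", "z"))
result = normalize(two_plus_one, RULES)
```

The tree is the whole state: evaluation is nothing more than replacing subtrees until no rule applies.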

Konrad Hinsen 2024-10-02 05:32:27

Supporting Jimmy Miller's suggestion because that's exactly how my Digital Scientific Notation works. It's a term rewriting system embedded in a Wiki-like graph of cross-linked pages.

Jack Rusher 2024-10-02 06:04:05

Note also that Lisp was originally conceived of as an automated form of term rewriting, the keyword “lambda” having been borrowed based on an incomplete understanding of Church’s paper. The first Lisp that was actually based on the lambda calculus was Scheme.

Duncan Cragg 2024-10-02 08:23:56

you're getting some "already done under different names", so I'll add mine! my PL is really just a graph rewriter.

Kartik Agaram 2024-10-02 14:35:46

@Robin Allison's 💬 #linking-together@2024-09-25 seems tantalizingly related, but breaks my brain to think about..

[September 24th, 2024 11:54 PM] robinps2: Hey future of coding folks,

I want to advertise the idea of non-abelian spreadsheets. The idea has slowly drifted into the center of my thinking this last year. I'm not sure if it's a good idea or not. It kinda depends on how you build on it. So for now I just want to convey the general idea.

Picture in your mind a normal spreadsheet. In some sense it is 'abelian' (commutative) because from any cell going down and then right is the same as going right and then going down. If we make it non-abelian, so the order we go right and down matters, we get something like the picture attached below.

If you tilt your head slightly you may recognize it as the infinite binary tree. So an infinite binary tree is just the non-abelian version of the usual grid-based spreadsheet. The nodes of the tree are the cells. We can also think of finite binary trees as the analogue of tables.

A key feature of regular spreadsheets is the ability to write formulas with relative references. For instance in a regular spreadsheet you can use relative references so a formula always refers to the cell to the right of the given one, and in a tree you can write a formula that always refers to the cell you get by going down and to the right from the given cell.

Another key feature of spreadsheets is that you put stuff in cells! And we do that with trees all the time. For example if we write down the syntax tree for (a+b)*c what we are doing is putting each of the symbols into a cell of the tree.

We can push this analogy to account for all trees (in particular all syntax trees). This tree can't really be visualized because it branches infinitely at each node. It is much easier to describe algebraically. I'll use the term 'free monoid on a set X', which if you aren't in the know just means the set of strings made out of the elements of X regarded as distinct characters. The infinite binary tree, or more precisely the set of nodes of the infinite binary tree, can be described as the free monoid on a two element set {L, R}. e.g. RLL describes the node you get by going right, then left, and then left again. Now let X_n denote a set with n elements and X the disjoint union of the X_n for all n. It suffices to take the free monoid on X.
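A small sketch of the addressing scheme described above, restricted to the binary case. The function names and the `cells` dictionary are made up for illustration. Grid positions are pairs of counters, so the order of moves does not matter; tree positions are strings over {L, R}, so moving is string concatenation and order does matter.

```python
def grid_move(cell, steps):
    """Ordinary spreadsheet: 'D' moves down a row, 'R' moves right a column."""
    row, col = cell
    for s in steps:
        if s == "D":
            row += 1
        elif s == "R":
            col += 1
    return (row, col)


def tree_move(node, steps):
    """Binary tree: a node is a string over {L, R}; moving appends to it."""
    return node + steps


# Abelian: down-then-right equals right-then-down.
same_cell = grid_move((0, 0), "DR") == grid_move((0, 0), "RD")

# Non-abelian: left-then-right and right-then-left are different nodes.
different_nodes = tree_move("", "LR") != tree_move("", "RL")

# The syntax tree of (a+b)*c, with one symbol per cell; "" is the root.
cells = {"": "*", "L": "+", "R": "c", "LL": "a", "LR": "b"}


def right_child_value(node):
    """A relative reference: 'the cell down and to the right of this one'."""
    return cells[tree_move(node, "R")]
```

Here `right_child_value("")` looks up `c` and `right_child_value("L")` looks up `b`, which is the same formula evaluated at two different cells.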

A reasonable question at this point is what is the interface for an infinitely branching tree? You would think it is even worse than an infinite dimensional grid, which is the abelian version. But if we are restricting ourselves to trees coming from symbolic expressions then for the most part we already have the interface. It is just the symbolic expressions we would have written down in the first place.

I'll leave it at that.

Tomas Petricek 2024-10-02 21:59:51

Term rewriting is a nice reference I did not think of! I guess one difference between those and what I've been thinking about is that I imagined that you'd have a special "current" location in the tree (like instruction pointer...).

You can certainly do this with term rewriting systems too though, if you just have a special term like [X] that marks the term/tree node X as being the current one.
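A rough sketch of the marker idea, assuming a toy language of integers and binary `+` only. The marker is written here as a tuple tagged `"[]"`, and `step` is a hand-written stand-in for a handful of rewrite rules that either move the marker or reduce the marked term. None of the names come from an existing system.

```python
FOCUS = "[]"  # ("[]", X) marks X as the current node


def is_num(t):
    return isinstance(t, int)


def is_focus(t):
    return isinstance(t, tuple) and t[0] == FOCUS


def step(term):
    """Apply one rule; return the new term, or None when nothing applies."""
    if is_focus(term):
        inner = term[1]
        if is_num(inner):
            return None                       # a value: only a parent can act
        op, a, b = inner
        if not is_num(a):
            return (op, (FOCUS, a), b)        # move the marker down-left
        if not is_num(b):
            return (op, a, (FOCUS, b))        # move the marker down-right
        return (FOCUS, a + b)                 # reduce at the marker
    if is_num(term):
        return None
    op, a, b = term
    if is_focus(a) and is_num(a[1]):
        return (FOCUS, (op, a[1], b))         # move the marker back up
    if is_focus(b) and is_num(b[1]):
        return (FOCUS, (op, a, b[1]))         # move the marker back up
    for i, sub in ((1, a), (2, b)):           # otherwise look deeper
        new = step(sub)
        if new is not None:
            return term[:i] + (new,) + term[i + 1:]
    return None


def run(term):
    """Collect every intermediate document, starting with the input."""
    states = [term]
    while (new := step(states[-1])) is not None:
        states.append(new)
    return states


trace = run((FOCUS, ("+", ("+", 1, 2), ("+", 3, 4))))
```

Each element of `trace` is a complete tree with exactly one marked node, so the marker behaves like an instruction pointer that is itself part of the document.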

I guess term rewriting systems are basically how people define operational semantics of programming languages. It's strange people do not talk more about the connection between the two!

Tomas Petricek 2024-10-02 22:04:17

@Robin Allison's idea reminded me of something I wrote about in a post inspired by spaces in cities (see tomasp.net/blog/2023/vague-spaces) There are some thoughts about how programs live in a different kind of space than cities (which have fixed space they have to fit into, whereas program spaces can expand - but spreadsheet space expands in only limited ways - you cannot create arbitrary amount of space in any particular location - which I guess this idea was getting at?).

Tomas Petricek 2024-10-02 22:11:13

... but using some kind of term rewriting system as the basis for document-like programming systems seems like a nice way of doing things - and it looks like there's lots of (some, at least) people here thinking in this direction!

Konrad Hinsen 2024-10-03 06:14:47

Maybe term rewriting systems should have something like a "current node". Rule application order is something usually swept under the rug. It's there, but everybody hopes it doesn't matter, and it's usually implicit (part of the rewriting engine) rather than explicit (part of the rule set).
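A small illustration of why application order can matter, using two made-up rules: `first(x, y) -> x` and `loop -> loop`. The rule set is identical in both runs; only the strategy (which lives in the engine, not in the rules) differs.

```python
def apply_rule(term):
    """Return the rewritten term if a rule applies at the root, else None."""
    if term == "loop":
        return "loop"                      # loop -> loop
    if isinstance(term, tuple) and term[0] == "first":
        return term[1]                     # first(x, y) -> x
    return None


def step_outermost(term):
    """Try the root first, then the arguments from left to right."""
    new = apply_rule(term)
    if new is not None:
        return new
    if isinstance(term, tuple):
        for i in range(1, len(term)):
            new = step_outermost(term[i])
            if new is not None:
                return term[:i] + (new,) + term[i + 1:]
    return None


def step_innermost(term):
    """Try the arguments first, then the root."""
    if isinstance(term, tuple):
        for i in range(1, len(term)):
            new = step_innermost(term[i])
            if new is not None:
                return term[:i] + (new,) + term[i + 1:]
    return apply_rule(term)


def normalize(term, step, max_steps=50):
    """Return (normal_form, steps_taken), or (None, max_steps) on giving up."""
    for n in range(max_steps):
        new = step(term)
        if new is None:
            return term, n
        term = new
    return None, max_steps


term = ("first", "one", "loop")
outer = normalize(term, step_outermost)
inner = normalize(term, step_innermost)
```

With these rules the outermost strategy finishes after one step, while the innermost strategy keeps rewriting `loop` until the step budget runs out.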

Robin Allison 2024-10-03 06:22:00

Kartik Agaram I think there might be a connection here too. I don’t know if I can speak to document-based languages in general, but at least for subtext, there is the loose connection in that both are based on spreadsheets. Beyond that non-abelian spreadsheets serve as mathematical models, although they aren’t models of computation specifically. They actually take computation mostly for granted, although I think that can be an interesting perspective too. When I was first reading Tomas’ question it occurred to me that non-abelian spreadsheets could be thought of as a model of the ‘document’ part of ‘document-based programs’.

@Tomas Petricek Part of the point is absolutely that the space these things are in is fixed, has particular characteristics, and is not created arbitrarily. I’m not sure if this makes them less like regular programs though. Generally the space of a non-abelian spreadsheet is far more expansive than the two dimensional space of a spreadsheet or paper or city. For one, the “two dimensional” non-abelian spreadsheet has uncountably many cells, whereas a normal spreadsheet has countably many cells. And this only gets worse in the countably infinite dimensional case you need to account for syntax trees.

Paul Tarvydas 2024-10-03 11:21:27

@Tomas Petricek

1) WRT "current node", we already know how to do this using textual code and have many programming languages for this purpose. So, divide-and-conquer says that all you need to do is to map "trees" onto textual code, then you're done.

2) Here is my (probably over-simplified) understanding of term rewriting augmented by the concept of "current node":

Programmer writes an AST (the "inhaler AST")
Programmer writes a 1:1 corresponding rewrite AST that dove-tails with the inhaler AST. (Or, programmer annotates the above inhaler AST (this, though, violates the principles of KISS and human-readability))
Term rewriter app inhales linear text and makes a CST (concrete parse tree) by culling the AST driven by the inhaled text (commonly known as "parsing")
Term rewriter app walks the rewrite AST using the newly-created CST.
Term rewriter app unparses the rewritten walked-tree into (new) linear text ("code").
Run the generated code.

If that's even close to what you're imagining, then I contend that this is "easily possible" to do using modern technology whilst loosening the thumbscrews, i.e. using OhmJS plus a nano-DSL written in OhmJS. I call it "t2t" (text-to-text transpilation) and am actively experimenting with it and its implications (meta-syntaxes, metaprogramming, etc.). More info/blather/discussion if interested.
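To make the inhale / rewrite / unparse / run steps listed above concrete, here is a toy stand-in written in plain Python rather than OhmJS. It is not t2t itself: the source language (prefix arithmetic with `+` and `*`), the target (a Python expression) and all function names are assumptions chosen only to show the shape of the pipeline.

```python
def parse(tokens):
    """Inhale: consume tokens of a prefix expression and build a tree."""
    tok = tokens.pop(0)
    if tok in ("+", "*"):
        left = parse(tokens)
        right = parse(tokens)
        return (tok, left, right)
    return int(tok)


def unparse(tree):
    """Exhale: walk the tree and emit fully parenthesised infix text."""
    if isinstance(tree, int):
        return str(tree)
    op, left, right = tree
    return f"({unparse(left)} {op} {unparse(right)})"


def t2t(source):
    """Text in, text out: prefix notation to infix notation."""
    return unparse(parse(source.split()))


generated = t2t("+ 1 * 2 3")

# Run the generated code. eval is acceptable here only because the text
# was produced by unparse above and contains nothing but digits, + and *.
value = eval(generated)
```

In this toy the rewrite is folded into `unparse`; a real pipeline would keep the grammar and the rewrite specification as separate artifacts, as the steps describe.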

Konrad Hinsen 2024-10-03 13:14:03

Paul Tarvydas The problem with your proposal is that it doesn't fit the way rewriting-based documents are used in practice. They are more like spreadsheets. The fundamental action of the computer is not "run a program" but "update everything after the user has made a change to the document". The point of a "current node", as I see it, would be to make it clearer to the user what exactly happens during such an update. It's more of a user interface than a programming issue.

Paul Tarvydas 2024-10-04 09:45:42

Konrad Hinsen Continuing to ponder, some half-baked thoughts:

1) "run a program" and "update everything after the user has made a change to the document" are the same thing, except maybe differing in speed ; if your machine is fast enough, there is no point in trying to optimize the update by keeping update pointers, just re-do the whole thing in one fell swoop so fast that a human user perceives them to both be the same ; you want the action(s) to "feel" like a REPL

2) possible relationship: "... current node ... a user interface than a programming issue ..." ; Lisp lists are "trees". I use a Lisp debugger (Lispworks) that lets me single step through Lisp code ("trees") in the same way that an assembler debugger lets me step through lines of code ; does this sound like "current node"-iness?

Konrad Hinsen 2024-10-05 12:45:31

Paul Tarvydas I guess we first have to agree on what we mean by "program", in particular if and how it is distinct from "input data". At the bit level, there is no such distinction. Both program and input data are bit patterns in memory that the computer acts on. But in what I see as the most common usage of the term, a "program" is something long-lived that reads "input data" for one of its many execution phases. With that distinction in place, it's the rewriting engine that is the program, and all the rules and terms/graphs to be rewritten are input data. Just like Excel is the program and all of the spreadsheet, including the formulas, is input data. But if you look at it from the point of view of semantics, it's the rewrite rules that are the program. Which are changed all the time in a computational document. Describing the interaction between a human and a computer via a rewriting-based document as running generated code is not wrong, but it's not helpful either unless you are thinking about writing a rewrite rule compiler.

As for 2), that's a very nice example. But I haven't seen anything similar for rewriting. I tried building my own, but gave up because it turned out not to be that useful. I came up with different debugging tools, none of which traces the work done by the computer.

Paul Tarvydas 2024-10-06 01:09:06

Konrad Hinsen, You touch upon a good point. My feeling about "what is a program" is somehow different and I'm struggling to put it into words.

Observation / pondering: control flow is not data. Smalltalk's encapsulation of data is not sufficient to isolate control flow, like how Unix pipelines isolate control flow.

Konrad Hinsen 2024-10-06 08:38:38

Paul Tarvydas Smalltalk is an interesting case. Control flow in Smalltalk is entirely implemented in terms of the method dispatch algorithm. That's similar to rewrite engines in that it's hidden from view. All you see in the code is messages sent around.

That's somewhat unrelated to your observation that data encapsulation doesn't imply control flow encapsulation. Or maybe it is related. I am undecided for now.

Paul Tarvydas 2024-10-06 10:29:39

Konrad Hinsen Aye, and there's the rub. In my books, Smalltalk does not do message-passing (!). Smalltalk does method-calling. Method-calling involves blocking. Blocking is usually implemented as a state machine that performs low-level synchronization. In my mind, message-passing is "fire and forget".

Blocking mixed with FTL (faster-than-light) rewriting ("referential transparency") works when your medium is paper, but, is overly-restrictive when your medium is reprogrammable electronic machines (aka "computers").

Continued... programmingsimplicity.substack.com/p/control-flow-is-not-data?r=1egdky

📝 Control Flow is Not Data

2024-10-06

Paul Tarvydas 2024-10-06 14:06:13

FWIW, some brainstorming, trying to get back to the original question programmingsimplicity.substack.com/p/tree-current-node-brainstorming?r=1egdky

📝 Tree Current Node Brainstorming

2024-10-06

You are viewing archived messages. Go here to search the history.