Got the visual design of the wiki mostly fleshed out. Fluid layout, one breakpoint that moves elements around for smaller screens, light and dark modes, nice keyboard nav, etc etc. Thoroughly documented too, since I figure some folks might want to peek at the CSS but aren't up on all the new weird stuff you can do.
Also, I'll be eating my hat now: everyone who scolded me for attempting to parse a markdown-like syntax with regexes, well, yeah, it kinda stinks. Not sure what to do about this, given the values I'd like to impart in this project. Since it's not a blocker on writing pages, I think I'll ship it half-finished and we can talk about it. This feels like a great area to draw on the wisdom of the crowd.
Markdown or not, we'll still need a build script, I reckon, unless we want to *require* that anyone adding a new page also add their page to any relevant indexes — since we'll need some indexes just to get through the early phase where the wiki will be sparsely interconnected. To start, I'm thinking one index of broad categories (or maybe tags, dunno), and then another that's just a list of all pages. But if anyone has strong feelings about the right way to do this, I'm at least 75% ears.
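To make the build-script point concrete, the "list of all pages" index could start out about this small (a rough sketch in Node; the entries/ folder and one-html-file-per-page layout are assumptions, not decisions):
// rough sketch (Node): regenerate the "all pages" index from a folder of entries
// the 'entries' directory name and .html-per-page layout are assumptions
import { readdir, writeFile } from 'node:fs/promises';

const files = await readdir('entries');
const items = files
  .filter(name => name.endsWith('.html'))
  .sort()
  .map(name => `<li><a href="entries/${name}">${name.replace(/\.html$/, '')}</a></li>`);
await writeFile('all-pages.html', `<ul>\n${items.join('\n')}\n</ul>\n`);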
Throwing in a couple of rough ideas:
- If the spirit is simplicity and minimalism, maybe gemtext would be a better option than Markdown? I haven't looked at gemtext closely enough to say if it's that much easier to parse, but maybe someone else here has.
- If the spirit is simplicity and minimalism, then "all the new weird stuff you can do" with CSS rings an alarm bell. Does this CSS wizardry require the latest and greatest browser from Google or Mozilla, or will the site be usable with older and indie browsers?
- Indexes: the only way I see to remove the need for them is a good search engine. But maybe we want a search engine anyway, and then we could perhaps start at that end and see if we still want indexes.
- The last two points also raise the question of the preferred mode of interaction with the conversations. Is it a Web interface? Or is the Web interface merely one tool out of many? A search engine could also be a locally-run tool, or a tool running as a separate Web service.
FWIW I needed a super simple CMS and I discovered you can reference a publicly accessible Google Sheet (with CORS enabled) from the front end:
// assumes d3 (specifically d3-fetch's csv helper) is already loaded,
// and that this runs inside your data-loading function
const sheetId = '1Z7Dja43FepxVOJc5_pMdP0etERM6h0BPAWT74zjdbno';
const sheetName = 'Sheet1';
const url = `https://docs.google.com/spreadsheets/d/${sheetId}/gviz/tq?tqx=out:csv&sheet=${sheetName}`;
return d3.csv(url); // resolves to an array of row objects, one per spreadsheet row
Could you push the build step into a Google Sheet?
e.g. the author writes a blog post and uploads it.
The author views the Google Sheet and adds tags => various downstream sheets update.
A website viewer goes to the page ?tag=foo,
and the page fetches the sheet for that tag, getting back a list of articles as JSON.
You will still need JS, though, to unroll that JSON into renderable content, which may be against a design goal (?). Where is the feature matrix? I've been keen to understand this project more, but I haven't seen it on the Slack yet.
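For what that last step might look like, a sketch (the one-sheet-per-tag scheme and the title/url column names are my assumptions, reusing the snippet above):
// hypothetical ?tag=foo page: fetch the per-tag sheet (one sheet per tag assumed)
// and render the article list; assumes d3 is loaded, as in the snippet above
const sheetId = '1Z7Dja43FepxVOJc5_pMdP0etERM6h0BPAWT74zjdbno';
const tag = new URLSearchParams(location.search).get('tag');
const url = `https://docs.google.com/spreadsheets/d/${sheetId}/gviz/tq?tqx=out:csv&sheet=${encodeURIComponent(tag)}`;
const rows = await d3.csv(url); // [{ title, url }, ...], column names are made up
const list = document.querySelector('#articles');
for (const row of rows) {
  const a = Object.assign(document.createElement('a'), { href: row.url, textContent: row.title });
  const li = document.createElement('li');
  li.append(a);
  list.append(li);
}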
- If the spirit is simplicity and minimalism, then “all the new weird stuff you can do” with CSS rings an alarm bell. Does this CSS wizardry require the latest and greatest browser from Google or Mozilla, or will the site be usable with older and indie browsers?
I’d encourage the group not to worry about this too much, at least not until we have evidence that something is broken. CSS, similar to HTML, tends to degrade relatively gracefully, and it is easy to write fallback styles for. If the latest and greatest doesn’t work, something older can step in to clean it up.
- Gemtext: doesn't ring a bell, so I'll definitely take a look at it. Thank you for the rec.
- CSS: I'm using features like rule nesting and custom properties, which make it faster to iterate on the design. Once the dust has settled, I can spend a few cycles making sure everything degrades gracefully — this adds bulk and cyclomatic complexity, so it's actually against the spirit of simplicity and minimalism, but it's worth doing to give as many folks as possible a nice experience.
- I'm going to say that a search engine that's somehow specific to this wiki is out of scope for now. We could add, say, a search box that just drops the user on DuckDuckGo with a site: query, maybe. But especially when the quantity of text on the site is relatively small, search is going to be a bit shot-in-the-dark. Good to think about for down the road, sure.
- Yeah! The thing I'm setting up is (A) a git repo with all the wiki entries as either html files or some sort of md-like plain text files, (B) a build script that generates nice static html pages based on the wiki entries, and (C) the means of serving those html pages with nice styling at some official URL. Anyone can clone the repo and do whatever else they want with it. I'd love to see people build alternative tooling around this corpus of wiki entries.
- I don't really want to couple anything to Google (or Amazon, etc). Everything that I've set up so far just assumes much more bog-standard / portable tech, like "a programming language with regexes" or "a web server that can serve html files". On the initial call where we hashed out this idea, one compelling thought was that by hosting the wiki git repo on GitHub, folks could use their web-based editor to write/edit wiki entries. So we are doing that, but that's just additive, not essential.
On "parsing markdown with regex", I'm sure most will know this famous StackOverflow reply:
Gemtext is definitely easier to parse. That was the whole design goal. Evidence: count the number of Gemini clients out there written in less than 200 lines of code.
The drawback of Gemtext: no inline formatting, no inline links. You can only format at line granularity.
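For flavor, a whole gemtext-to-HTML pass is roughly one decision per line. A sketch (HTML escaping omitted for brevity):
function gemtextToHtml(src) {
  let pre = false; // ``` toggles preformatted mode, again at line granularity
  return src.split('\n').map(line => {
    if (line.startsWith('```')) { pre = !pre; return pre ? '<pre>' : '</pre>'; }
    if (pre) return line;
    if (line.startsWith('###')) return `<h3>${line.slice(3).trim()}</h3>`;
    if (line.startsWith('##'))  return `<h2>${line.slice(2).trim()}</h2>`;
    if (line.startsWith('#'))   return `<h1>${line.slice(1).trim()}</h1>`;
    if (line.startsWith('=>')) { // link line: => url, then an optional label
      const [url, ...label] = line.slice(2).trim().split(/\s+/);
      return `<p><a href="${url}">${label.join(' ') || url}</a></p>`;
    }
    if (line.startsWith('* ')) return `<li>${line.slice(2)}</li>`;
    if (line.startsWith('>'))  return `<blockquote>${line.slice(1).trim()}</blockquote>`;
    return `<p>${line}</p>`;
  }).join('\n');
}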
Duncan Cragg — Yes. A classic.
Kartik Agaram (cc Konrad Hinsen) — Had a look at Gemtext. I really like it! I think it would be trivial (and sensible) to extend this syntax to support inline links and formatting (including code, using a modal processor just like you do for lines). I'm going to give it a shot and see how it goes.
Astonished?
It's simple, you just implement an FSM, and split the incoming character stream on any syntactically-relevant character, then backtrack if you hit a node in the FSM with no outbound edges.
I suspect markdown would also be really easy to parse if you refuse to implement all the stupid edge cases, e.g. if someone writes: am _I bold* or_ italic*
just replace it with an error message.
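A toy sketch of that "strict or bust" stance, illustrative only: one pass over the characters, and any mismatched or unclosed delimiter throws:
function parseInline(text) {
  const out = [];
  const stack = []; // currently-open delimiters: '*' or '_'
  for (const ch of text) {
    if (ch === '*' || ch === '_') {
      if (stack[stack.length - 1] === ch) { // closes the innermost open delimiter
        stack.pop();
        out.push(ch === '*' ? '</strong>' : '</em>');
      } else { // opens a new one
        stack.push(ch);
        out.push(ch === '*' ? '<strong>' : '<em>');
      }
    } else {
      out.push(ch);
    }
  }
  if (stack.length) throw new Error(`unclosed ${stack.join(', ')}`);
  return out.join('');
}
// parseInline('am _I bold* or_ italic*') throws: exactly the case we refuse to guess about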
Shhhh you're spoiling my "actually I will implement a superset of gemtext" gambit to ship decent md ;)
Astonished?
Parsing is one of those funny things where the existence of lots of interesting theory and papers creates the impression that you need complicated tools, whereas in practice almost every industrial parser I've ever seen is just hand-written recursive descent because that is by far the easiest option.
Yeah. Alex Warth (known for OhmJS among other things) has been teaching a little "prototyping programming languages" course internally at I&S. One of the major themes is just that — unless you're actually building an industrial-scale compiler, most of the standard advice given about writing compilers is bunk. Just do something quick and dirty and direct and manual. It's fine. It's plenty fast, and robust.
So it's funny to hear that even for the industrial stuff, most of the sophisticated theory doesn't apply.
I guess the one place that stuff is actually useful is when writing… textbooks about compilers.
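For anyone who hasn't had the pleasure, "quick and dirty and direct and manual" really is small. A toy recursive descent parser for digits, +, * and parens (nothing to do with the wiki syntax):
function parseExpr(s) {
  let pos = 0;
  const expect = ch => { if (s[pos++] !== ch) throw new Error(`expected ${ch}`); };
  function atom() { // atom := digit | '(' expr ')'
    if (s[pos] === '(') { expect('('); const v = expr(); expect(')'); return v; }
    const d = s[pos++];
    if (!/^[0-9]$/.test(d)) throw new Error('expected digit');
    return +d;
  }
  function term() { let v = atom(); while (s[pos] === '*') { pos++; v *= atom(); } return v; }
  function expr() { let v = term(); while (s[pos] === '+') { pos++; v += term(); } return v; }
  const v = expr();
  if (pos !== s.length) throw new Error('trailing input');
  return v;
}
// parseExpr('2+3*(4+1)') === 17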
My astonishment is about getting into notation design, not parsing. What could possibly go wrong 😄
I will be counting the days until people start complaining about the notation or asking for features.
Exactly. And that was just about how to implement a notation.
It's fine. If I'm worried about bikeshedding I shouldn't add to it 😄 :homer-backing-away:
Y'all just got me to de-emoji myself by clubbing me with the unwashed masses who haven't written a recursive descent parser or three 😅
(Parsing has its depths. It's hard to make sure all possible illegal statements behave well and raise errors. Favorite target of fuzzers! Probably less important in a safe language, but be prepared for the odd DoS vulnerability caused by an infinite loop in a year or 3.)
Going back, this is a good suggestion:
I suspect markdown would also be really easy to parse if you refuse to implement all the stupid edge cases, e.g. if someone writes:
am _I bold* or_ italic*
just replace it with an error message.
The key is doing it without eating people's comments. May be safer to just leave the whole message unformatted.
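In code, that failure mode is just a try/catch around a strict parser like the one sketched earlier (escapeHtml here is a stand-in, not project code):
const escapeHtml = s =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

function renderMessage(text) {
  try {
    return parseInline(text); // the strict parser from the sketch above
  } catch {
    return escapeHtml(text);  // never eat the comment: show it verbatim, unformatted
  }
}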
Drive-by comments and observations...
- A major cause of difficulty in parsing is: ASCII. ASCII provides so few characters that we are inclined to overload their meaning. For example, “begin string” and “end string” are both represented by the same quote character in traditional programming languages. Parentheses can bracket expressions, invoke functions, or define arg lists in function definitions. Unicode, though, has tons of characters, hence I, for one, have no compunction about choosing a single, non-overloaded meaning for specific characters, nor about choosing distinct left and right bracketing characters for each different kind of bracketing that I feel is needed. The concept of nested comments seemed revolutionary in the past, but would be natural using non-overloaded characters (see the first sketch after this list). Likewise, nested strings become easy to imagine. (Note that the Unix program “M4” does define different begin and end quote characters for some of its strings.)
- Recursive descent + backtracking = PEG. Recursive descent does better than CFG and regex approaches for practical parsing, with the exception that you have to manually pre-refactor the grammar to fold together all common prefixes (the left-hand sides of phrases). PEG adds the nuance that it can try to parse a phrase and, if the attempt fails, back up and retry some other parse branch (see the second sketch below). With backtracking, manual common-prefix refactoring is no longer necessary (the machine does it for you). Backtracking was well known early on, but frowned upon by those with 1950s biases. Earley's parser and Prolog were side-stepped in favor of “more efficient”, gotcha-full approaches. Backtracking ain't all that hard if you ignore your inner 1950s biases. [I even have a JS backtracker (a Prolog) lying around, only lightly tested, on my repo somewhere, generated by my first use of OhmJS. I transpiled Nils Holm's Scheme program to JS. If I can do it, anyone can.]
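First, a sketch of the bracketing point: with dedicated, non-overloaded open/close characters, nested comments reduce to a depth counter (⟦ and ⟧ are arbitrary picks):
function stripComments(src, open = '⟦', close = '⟧') {
  let out = '', depth = 0;
  for (const ch of src) {
    if (ch === open) depth++;            // distinct open char: no ambiguity
    else if (ch === close) depth = Math.max(0, depth - 1);
    else if (depth === 0) out += ch;     // keep only text outside all comments
  }
  return out;
}
// stripComments('a ⟦outer ⟦inner⟧ outer again⟧ b') === 'a  b'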
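Second, the PEG point in miniature: ordered choice means every alternative retries from the same position, so common prefixes need no manual refactoring. A toy combinator, nothing like full OhmJS:
// a parser is (input, pos) => ({ value, pos }) on success, or null on failure
const lit = ch => (s, pos) => (s[pos] === ch ? { value: ch, pos: pos + 1 } : null);

const seq = (...ps) => (s, pos) => { // parse phrases in order, threading pos forward
  const values = [];
  for (const p of ps) {
    const r = p(s, pos);
    if (!r) return null; // the whole sequence fails; an enclosing alt may retry
    values.push(r.value);
    pos = r.pos;
  }
  return { value: values, pos };
};

const alt = (...ps) => (s, pos) => { // ordered choice with implicit backtracking
  for (const p of ps) {
    const r = p(s, pos); // every alternative starts from the same pos
    if (r) return r;     // first success wins
  }
  return null;
};

// alt(seq(lit('a'), lit('b')), seq(lit('a'), lit('c')))('ac', 0) succeeds:
// the first branch consumes 'a', fails on 'b', and the second retries from position 0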
I like the idea of building on gemtext. I see a simple specification as a necessary foundation for simple implementations. So: make it as simple as you can. If gemtext goes too far, then add what's missing to gemtext.
There is, of course, the risk of "oh, it's our own notation, so we can do all these wonderful things with it". Followed by extensive bikeshedding sessions. Here's an idea for dealing with it: (1) Whoever proposes a modification to the notation has to supply working code that implements the change. We won't even discuss unimplemented ideas. (2) We'll set a limit to allowed code size. Something like "no more than 1.2 times the size of Ivan's first working version, for at least one year".