David Alan Hjelle 2023-10-10 15:34:22 I was curious this morning: my naïve view of compiler history is that they used to be very small (due to performance constraints) and have gotten very complicated over the years in order to support multiple platforms and in order to employ more and more optimizations. Is that generally a fair take? What are the big changes to compiler architectures from the early days? Does something like LLVM produce enough better code to justify its complexity? Are there any blog post- or paper-length history of compilers articles out there? (I'm not quite so curious as to be ready to read a whole book, but if you've got a good recommendation…)
Mark Dewing 2023-10-10 21:05:00 One change is that companies don't need to build their own complete compiler anymore. Previously, each company might buy a front end (e.g. EDG) for parsing, but they had to build the rest of the compiler in-house (or go fully open and extend gcc). In addition to hardware performance constraints, the compiler size (and complexity) was limited to what size compiler team a company was willing to fund. With LLVM, companies can focus on the pieces specific to their needs.
Justin Blank 2023-10-11 09:12:39 The question “is it worth it” is pretty hard to answer. There can be different assessments of the cost of LLVM being such a big project, and there are debates about how much the optimizations in LLVM matter, compared to a smaller compiler (see the discussion around Daniel Bernstein’s “the death of optimizing compilers”).
Throat clearing done, Bernstein is wrong, and the answer is “yes, it’s worth it.”
Konrad Hinsen 2023-10-11 14:49:16 A question that came up in a discussion this morning:
Suppose you want to publish a command-line utility program, meant to be easy to use. Doing Web retrieval and some post-processing. Around 500 lines in a typical scripting language, but with dependencies (in that language plus C libraries).
It looks like packaging such a tool for all popular platforms (i.e. package managers) will be a lot more work than actually writing the code.
True? Any way to avoid this?
Ideas so far: don't use a scripting language, but something with a compiler that can produce portable executables for every major platform. Recommendations in that category so far: Go, Rust, Racket, Common Lisp. I have doubts that all of these can handle "plus C libraries", but it's a start.
Does anyone here have actual experience with this kind of project?
Tom Lieber 2023-10-11 15:02:23 Do you just want a package, or do you want it listed in all the registries for installation by name?
Chris Knott 2023-10-11 16:58:36 I think a runnable fat .jar is probably the best bet, like the Apache Tika standalone app. It has about a million dependencies under the hood, but the user experience is just to download a single executable that runs on all platforms (that have Java).
Konrad Hinsen 2023-10-12 06:44:40 Tudor Girba Briefly. I have never seen a command-line tool written in Pharo in the wild, so I can only speculate how that would be distributed. For plain Pharo code, I imagine a tarball containing the VM and the image, plus a script that takes care of running the VM with the right parameters to find the image. The obstacle I see is the C dependencies. For macOS and Windows, I could add the binaries to the tarball. For Linux, there is too much heterogeneity to distribute binaries, so I'd have to find a way to make the Pharo code find the libraries wherever the system package manager has put them. And that's something that only the package manager knows for sure.
Scientific computing environments are particularly challenging because "Linux" really means "anything with a Linux kernel". There's high-performance computing centres running prehistoric CentOS versions ("we prefer known bugs to unknown bugs"), nerd laptops with the latest exotic distribution, and in between more standard installations such as Debian or Ubuntu, in any versions from ten years ago to bleeding edge.
Jack Rusher 2023-10-12 12:06:10 There are a thousand things to optimize over in this question, but this is one way.
Konrad Hinsen 2023-10-12 13:32:48 That looks nice, thanks!
It doesn't mention C dependencies though...
Jack Rusher 2023-10-12 14:26:51 There’s FFI, but you still have the problem of whether the libs you want to use are installed. The maximally safe case is to statically link everything into a platform-specific binary using, e.g., Go.
Konrad Hinsen 2023-10-12 16:47:53 Jack Rusher It's either that, or integration of my code into whatever build system the target platform uses. Both options require a lot of overhead effort.
@Arcade Wise Found it (justine.lol/cosmopolitan) - that's amazing! Not sure I'd be willing to write my code in C in order to use it, but it's pretty cool :-)
Arcade Wise 2023-10-12 16:48:35 Yeah! It’s wild. I wonder how hard it would be to make a language that compiles to and has bindings for cosmo C
Konrad Hinsen 2023-10-12 16:51:37 Not so much a language but a toolchain, right? Any language that can be compiled to C should be adaptable rather easily. That includes C++, Fortran, Scheme, Common Lisp, and probably many others.
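For reference, the Cosmopolitan workflow is roughly this (a sketch assuming its `cosmocc` compiler wrapper is installed; `hello.c` is a hypothetical source file):

```shell
# cosmocc is Cosmopolitan's drop-in C compiler wrapper; the output is an
# "Actually Portable Executable" -- one binary that runs unmodified on
# Linux, macOS, Windows, and the BSDs.
cosmocc -O2 -o hello.com hello.c
./hello.com
```

So a compile-to-C language would mainly need its build pipeline pointed at `cosmocc` instead of the platform compiler, plus care around any platform-specific runtime assumptions.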
Konrad Hinsen 2023-10-14 12:16:44 But is it possible to create stand-alone binaries? If users have to install node, installation is too complicated.