For some time I've thought that type systems are essentially ad-hoc logic languages. So: why not use a full-on logic language (Prolog or something) to statically analyze your codebase, instead of endlessly patching onto a type system? Logic languages seem simpler and more expressive than most type systems. I'm starting to work on this for my language, and creating a logic language partly as an effort to understand logic programming better.
Another advantage I can think of: if the logic language is expressive enough to type the most advanced languages as well as the weakest, it could serve as a shared interface between languages to understand at least parts of each other's type systems, and as a tool for users to add extra static analysis specific to their project.
I'm basically thinking something like this.
# here's a sample program in my (object oriented) language
fn add(a, b) = a + b
add(1, 2)
generated logic program in pseudo syntax (#A is a pattern checking that a value is of type A, ?a is like Prolog's variable A, {...} is a map literal):
-- preloaded facts
lookup(#Number, :+, {args: [#Number, #Number], ret: #Number}).
-- program
add(?a, ?b, ?ret) :- lookup(?a, :+, {args: [?a, ?b], ret: ?ret}).
?- add(1, 2, ?_).
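For reference, here is roughly the same thing in plain executable Prolog (every name here, like sig/4 and type_of/2, is my own invention, not the real encoding):

% preloaded fact: number + number -> number
sig(number, '+', [number, number], number).

% typing rules for the object program
type_of(lit(X), number) :- number(X).
type_of(add(A, B), Ret) :-
    type_of(A, TA),
    type_of(B, TB),
    sig(TA, '+', [TA, TB], Ret).

% ?- type_of(add(lit(1), lit(2)), T).
% T = number.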
The specifics of my logic language are likely flawed, but I'm curious about the general idea of generating a program in a fully-fledged logic language and running it, instead of generating a typed AST and running an ad-hoc type system over it.
My question is, what are people's thoughts on this? Any previous work in this area? Any obvious flaws?
Massive agreement. Wish: I wish that type-checking was built BESIDE languages instead of INSIDE-OF languages. Observation: while sitting through a reading of PLFA (Agda) [with CompSci Cabal], I developed the distinct feeling that I was looking at two languages: one for programming a type system, and one for programming an implementation (an after-thought in Agda, IMO). To me, “Dependent Types” is just more programming-language constructs for a type language (DT adds parameters). Agda, et al., seem to be inching towards the re-discovery that type systems can be / should be programmed. The benefit of relational languages (PROLOG, miniKanren, etc.) is that they specify “what is wanted” (relations) instead of “how to create steps for achieving what is wanted” (3GLs like Python, Rust, Haskell, etc.)
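A tiny illustration of that relational flavor, using the classic append/3: one definition answers questions in several directions, with no separate forward and backward code.

append([], Ys, Ys).
append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

% ?- append([1], [2], Zs).     % Zs = [1, 2]          (run "forwards")
% ?- append(Xs, Ys, [1, 2]).   % enumerates every split (run "backwards")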
To me, type systems are in the awkward position of trying to squeeze into an existing language (get out of the way), but also be powerful enough to express complex things like DSLs.
Type systems have to do all sorts of ad-hoc optimizations to make unification and codebase-wide queries manageable.
Yet these seem to me like solved problems - in databases.
I imagine a logic system that learns from the many decades of high-performance database work could be very fast (think indexes, query caching, materialized views, etc.).
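For what it's worth, tabling in SWI-Prolog and XSB already gives something close to query caching and materialized views; a small sketch (subtype/2 and extends/2 are my own toy names):

:- table subtype/2.

subtype(T, T).
subtype(A, C) :- extends(A, B), subtype(B, C).

extends(int, num).
extends(num, any).

% ?- subtype(int, any).   % true; answers are memoized like a materialized
%                         % view, and even cyclic extends/2 graphs terminate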
Rust embeds a datalog system in its type checker for borrow checking, if I’m not mistaken. github.com/rust-lang/datafrog
The only example of a language with a "sidecar" type checker is Shen (shenlanguage.org). It's a Lisp that includes a logic engine (roughly Prolog with lispy syntax), which is used to implement the type checker that can be turned on and off as preferred. And the type checker is just another Shen program, so nothing (except the effort) prevents you from substituting your own.
I'd also love to see static checkers outside of languages, allowing multiple checkers and in particular domain-specific ones. Ideally, I'd build my own checker as my understanding of my problem domain evolves.
I somewhat suspect that the traditional tight integration of type systems and languages comes from the benefit that type checking brings to optimization.
Yeah, adding project-specific rules was the original inspiration that got me thinking beyond conventional type systems.
Linters can work, but they duplicate all the work of the original type system, usually don't even do proper type checking... and they usually have their own build step, etc.
What if my additional rules could be understood by a common LSP-like system?
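For example, assuming the compiler dumps hypothetical calls/2, handles_request/1, and raw_sql/1 facts into a shared database, a project rule like "request handlers must never reach raw SQL" is just another predicate:

% hypothetical facts emitted by the compiler:
%   calls(Caller, Callee).   handles_request(Fn).   raw_sql(Fn).
reaches(F, G) :- calls(F, G).
reaches(F, G) :- calls(F, H), reaches(H, G).

violation(Fn, handler_reaches_raw_sql) :-
    handles_request(Fn),
    reaches(Fn, G),
    raw_sql(G).

% ?- violation(Fn, Why).   % an LSP-like server surfaces these as diagnostics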
> I somewhat suspect that the traditional tight integration of type systems and languages comes from the benefit that type checking brings to optimization.
There's no reason why the database generated from the logic language can't stick around to be queried for optimizations, right?
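For instance (call_sig/2 and monomorphic/1 are names I'm inventing here), an optimizer could just ask the same database a different question:

% hypothetical call_sig(Fn, Sig) facts recorded while type checking
monomorphic(F) :-
    call_sig(F, Sig),
    \+ (call_sig(F, Other), Other \= Sig).

% ?- monomorphic(F).   % functions seen with one signature: inlining candidates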
One thing that I imagine will require some work is interpreting errors (no answers found) from running these programs. I imagine that when the logic program runs, it'll have to keep track of which queries/predicates succeeded and which didn't, and build something like a stack trace, maybe even record that in a logic DB that can itself be queried to emit domain-specific (nice) error messages.
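One sketch of that: reify check results as terms instead of letting queries silently fail, so the failure itself becomes data (type_of/2 here is a one-line stand-in for the real rules):

type_of(lit(X), number) :- number(X).   % stand-in for the real rules

checked(Expr, ok(T)) :- type_of(Expr, T), !.
checked(Expr, err(untypable(Expr))).

% ?- checked(lit(hello), R).
% R = err(untypable(lit(hello))).   % render a friendly message from this term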
I think the big idea is having a database of really detailed statically-analyzed facts, which can be used to check correctness, do optimizations, or even power automated refactoring tools. So it might not even need to be one logic language, but a shared database system that all the logic languages can speak, storing facts in it and querying from it.
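Concretely, the shared fact base might hold entries like these (schema entirely made up for illustration), and a refactoring becomes a query:

% facts any front end could emit:
defines(add, 'src/math.mylang', line(1)).
arity(add, 2).
calls(main, add).

% refactoring question: what breaks if add/2 gains a parameter?
% ?- calls(Caller, add).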
> There's no reason why the database generated from the logic language can't stick around to be queried for optimizations, right?
Technically, no. Socially, it means that the generation of important information is no longer under control of the compiler writer, and even language designer. They would have to cooperate with the users of their language and tool, and negotiate protocols for information exchange. It's more convenient to be a dictator.
It's curious that value types are so favored among all the useful summary/analysis operations we can do on programming languages. I mean similar techniques can track references, side-effects, entity life-cycles, resource usage, even evaluation of the language itself.
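For example, a side-effect analysis has the same shape as the typing rules earlier in the thread (performs/2 and calls/2 are invented names):

performs(write_log, io).                        % leaf fact: write_log does IO
performs(F, E) :- calls(F, G), performs(G, E).  % effects propagate up the call graph

calls(main, write_log).

% ?- performs(main, io).   % true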
One reason types are built inside languages is undecidability. We tend to want the type checker to always terminate, and to err on the side of soundness rather than completeness. This is why the type language is weaker, whereas the "main" language can be Turing-complete.
I'd reverse the causality of that argument. It's because the type checker is part of the compiler that type checking has to be decidable and prioritize soundness over completeness (and lots of other nice properties we don't get, such as fitness for the application domain).
I don't know if I understand all the arguments against Turing-complete type systems. Could someone spell it out for me?
Is it that it'd be hard to debug infinite recursion in type systems? Do you worry people will misuse it to write fancy compile-time programs that slow builds down?
My second point seems to be possible in Turing-incomplete languages too.
Something I think would be unique to bringing in something like Prolog is that you could use its time-travelling debugger to debug especially tricky types. I'd argue against tricky types/constraints in general, but when you're stuck with a library that goes wild with types, you could at least dig in instead of just getting a dump of useless text.
@Marcelle Rusu (they/them) The main issue with a Turing-complete type system is that the compiler might get stuck in an endless loop.
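In the logic-language setting, that failure mode is easy to write down; this is plain Prolog, nothing invented:

% a "type rule" the checker can never finish evaluating:
subtype(A, B) :- subtype(A, B).

% ?- subtype(int, any).   % plain Prolog loops here forever
%                         % (tabling, as sketched above, would terminate)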
Turing completeness is indeed always a mixed blessing. But the problems matter more than the benefits at "meta" levels of scaffolding around a program. If you could get into an infinite loop before you even start running your program, the UX is much worse.
Hmm, going back to the original question: it might work with some additional tooling. We typically have just a run command, but if the compiler supported running just the parser, or just up to the type checker, maybe the UX becomes a non-issue. Yes, the programmer now needs to become aware of the compiler's internals to some extent. But it might be a good trade-off.