Funclerative Programming
Hello World! I have been absent for a very long time because I was busy with another writing project, which should come to a close towards the end of this year. After that I will come back to writing more here.
Today I want to discuss the approach I’ve used to build the last couple of DSLs with my current customers. It relies on the idea of “funclerative” programming, a mix of functional and declarative, and it builds on KernelF, a functional base language we’ve developed for MPS. KernelF is described in this rather long paper.
First-class structures or expressions?
One of the main tradeoffs when developing DSLs is deciding how many first-class domain abstractions to use. Representing lots of aspects first class is useful for several reasons. First, the semantics of the program is easy to analyze because the structures and relationships of language constructs are directly encoded in the domain-specific AST of the language. Correspondingly, code completion is very precise and error messages can be phrased in a way that is very close to the domain abstractions. However, the approach also has an important disadvantage: because the structure is so specific to the domain, every change to that structure requires the migration of existing models. While MPS has migration facilities, this nonetheless creates a lot of friction, especially in the early stages of language development where the understanding of the domain evolves minute by minute and the language changes accordingly.
Functional programming is the opposite. Except for a few top-level declarations (structs, functions, constants, enums) everything is an expression. This highly orthogonal structure means that you can more or less nest everything under everything. The type checker might complain, but from a structural perspective, “everything” is an expression. So if you extend or change the language, it is very likely that no structural changes, and hence no migrations, are required. Very nice, very flexible! There is a drawback, of course: the user experience, at least for domain experts who are not professional programmers, becomes worse. Code completion always shows a lot of stuff (because everything is structurally an expression) and, because domain semantics is harder to analyze from a structurally more flexible program, error messages tend to become less meaningful.
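To make the “everything is an expression” point concrete, here is a minimal sketch in Python. It is not KernelF or MPS code, and all class names are made up for illustration. It shows that adding a new kind of expression later leaves the existing node structure untouched, so trees built before the extension remain valid as they are.

```python
from dataclasses import dataclass


class Expr:
    """Base class: every node below is structurally 'just an expression'."""

    def eval(self):
        raise NotImplementedError


@dataclass
class Num(Expr):
    value: int

    def eval(self):
        return self.value


@dataclass
class Add(Expr):
    left: Expr
    right: Expr

    def eval(self):
        return self.left.eval() + self.right.eval()


@dataclass
class Greater(Expr):
    left: Expr
    right: Expr

    def eval(self):
        return self.left.eval() > self.right.eval()


@dataclass
class If(Expr):
    cond: Expr
    then: Expr
    otherwise: Expr

    def eval(self):
        return self.then.eval() if self.cond.eval() else self.otherwise.eval()


# A tree built before the language was extended.
old_program = Add(Num(1), Num(2))


# Later extension: a new kind of expression. None of the classes above change,
# so old_program needs no migration and can be nested inside new code.
@dataclass
class Max(Expr):
    left: Expr
    right: Expr

    def eval(self):
        return max(self.left.eval(), self.right.eval())


new_program = If(Greater(Max(Num(3), old_program), Num(2)), Num(10), Num(0))

print(old_program.eval())  # 3
print(new_program.eval())  # 10
```

The flip side is visible here too: wherever an Expr is expected, structurally any of these classes fits, which is exactly why an unfiltered code completion menu gets crowded.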
So which approach do you use? The customer is king. Or, phrased more seriously: a language structure that is convenient for the language engineer is of little use if the UX sucks for the domain user. So going fully functional is not a realistic option. Does that mean you’re forced to go with option one, with all its drawbacks? Well, not really.
Adding domain-specific structures
I usually start with KernelF. It is a full functional programming language. I briefly demonstrate this to the developers and domain experts I collaborate with at my customer. I also show them how we can run KernelF programs directly through interpreted test cases and so get immediate feedback on the correctness of our code.
By the time we start prototyping, we have usually already done a little bit of domain analysis, so we agree on some of the main abstractions of the domain. These are often structural: contracts, calculation trees, particular data structures or types. So we add those as structural/declarative abstractions, often as KernelF top-level elements (so you can write them right next to functions, structs or enums). Inside those new structures we of course use the functional/expression parts of KernelF: the built-in types and all the arithmetic and comparison expressions. I usually also immediately extend the testing framework, for example with an expression that instantiates those domain-specific data structures or “invokes” a calculation tree. After a couple of minutes we can execute programs that involve the new domain-specific structures.
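The shape of this step can be sketched roughly as follows, again in Python rather than in MPS. A plain Python function stands in for a KernelF expression, and Calculation, CalculationTest and the premium example are hypothetical names that do not come from any real DSL. The point is only the layering: a declarative shell on the outside, an ordinary expression on the inside, and a test construct that “invokes” the structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Stand-in for a KernelF expression: a function from named inputs to a result.
Expression = Callable[[Dict[str, float]], float]


@dataclass
class Calculation:
    """Hypothetical domain-specific top-level declaration."""
    name: str
    inputs: Tuple[str, ...]
    result: Expression  # the functional part lives inside the declarative shell

    def invoke(self, **values: float) -> float:
        missing = set(self.inputs) - set(values)
        if missing:
            raise ValueError(f"{self.name}: missing inputs {sorted(missing)}")
        return self.result(values)


@dataclass
class CalculationTest:
    """Hypothetical test construct that invokes a calculation with given inputs."""
    calculation: Calculation
    given: Dict[str, float]
    expected: float

    def passes(self) -> bool:
        return self.calculation.invoke(**self.given) == self.expected


# The declaration is domain-specific; its body is an ordinary expression.
premium = Calculation(
    name="premium",
    inputs=("base", "risk"),
    result=lambda v: v["base"] * (1 + v["risk"]),
)

premium_test = CalculationTest(premium, given={"base": 100.0, "risk": 0.5}, expected=150.0)
print(premium_test.passes())  # True
```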
Restricting KernelF
However, because we use the rather generic Type and Expression concepts from KernelF in many places, code completion is full of “weird” stuff in the opinion of the domain user. Some of these things are genuinely unnecessary for the domain. For example, several DSLs built on top of KernelF do not use option types or the built-in error types. We then use MPS’ can be ancestor constraints to prevent those from being visible to the users in code completion. This is really important: because this constraint prevents users from entering these concepts, by hiding them from the code completion menu, they are effectively removed from the (user’s perception of the) language.
Some of the “weird” stuff is necessary for the domain, but the syntax is problematic. For example, higher-order functions with their embedded lambda expressions are usually not acceptable, even though the functionality to filter, transform and group collections is required in many domains. What I do in this situation is to constrain out the default higher-order stuff and then replace it with a more friendly (and domain-adapted) version.
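The effect of such constraints on the completion menu can be modeled in a few lines of Python. This is only a toy model of the idea, not the MPS constraints API, and the concept names (including FriendlyFilter as the domain-adapted replacement for lambdas) are invented for illustration. Each container decides which concepts it accepts as descendants, and the menu is whatever survives all containers on the path to the cursor.

```python
from typing import List, Optional, Set

# Concepts a user could pick wherever a Type or Expression is expected.
# The names are illustrative only, not actual KernelF concept names.
SUBSTITUTABLE: List[str] = [
    "NumberLiteral",
    "PlusExpression",
    "LambdaExpression",
    "OptionType",
    "ErrorType",
    "FriendlyFilter",
]


class Container:
    """A node that may restrict which concepts appear anywhere below it."""

    def __init__(self, name: str, hidden: Optional[Set[str]] = None) -> None:
        self.name = name
        self.hidden = set(hidden or ())

    def can_be_ancestor(self, concept: str) -> bool:
        # Toy version of a 'can be ancestor' check: reject hidden concepts.
        return concept not in self.hidden


def completion_menu(ancestors: List[Container]) -> List[str]:
    """Offer only the concepts that every ancestor on the path accepts."""
    return [c for c in SUBSTITUTABLE if all(a.can_be_ancestor(c) for a in ancestors)]


# A plain function keeps everything; the domain structure hides the weird stuff.
helper = Container("Function")
contract = Container("Contract", hidden={"LambdaExpression", "OptionType", "ErrorType"})

print(completion_menu([helper]))    # all six concepts
print(completion_menu([contract]))  # ['NumberLiteral', 'PlusExpression', 'FriendlyFilter']

# Making the constraint more permissive later is a one-line change;
# the node structure itself is untouched.
contract.hidden.discard("OptionType")
print(completion_menu([contract]))  # now also contains 'OptionType'
```

Nothing in SUBSTITUTABLE changes when a container becomes stricter or more permissive. Only the filter does, which is why no structural change is involved.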
Importantly, I make all of these changes to the (user’s perception of the) language without changing the overall structure! Everything is still made from Type and Expression. This is the crucial point, for several reasons, which we discuss next.
Extension and Composition
First, I still have the full power of KernelF available outside the domain-specific structures that use the can be ancestor constraints to restrict the language. For example, I can write helper functions that use all of KernelF. At least in the short term this is often useful to get something to run, even though the syntax and the notion of functions might not survive to the final version of the DSL.
Second, as users become more familiar with the notion of a DSL and see the actual need for more expressiveness, their objections to some of the “weird” stuff go away. I can then simply make the constraints more permissive to “reintroduce” concepts from KernelF that I had previously constrained out. No structural change though, no migration necessary. Makes things very easy.
There’s a third reason why this is cool: many of the existing KernelF extensions (for temporal types, currencies, date and time or rational numbers) are basically just new kinds of Types and Expressions, plus the occasional declaration. In many cases these can then be used “as is” in the customer’s DSL. There’s nothing like replying “well, let’s see, we already have something …” when the customer asks for some of these extensions.
Of course, sometimes a particular DSL will need domain-specific abstractions on top of those extensions. For example, at one current customer we have the notion of a {monthly} function. It implicitly executes its body 12 times, once for each month of a year. If you access a temporal value from within such a function, the temporal value must be automatically reduced to a single value using a reduction strategy defined in the data type. Sounds like gobbledygook without more context, I know, but the point is that such special treatment can still be built “around” the reused existing extensions.
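To give the {monthly} idea a bit more shape, here is a rough Python sketch. It is not the customer’s DSL or the KernelF temporal extension. TemporalType, TemporalValue, the monthly decorator and the “last value wins” strategy are all assumptions made up for this example. The body runs once per month, and every temporal argument is reduced to a single value for that month by the strategy attached to its type.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A reduction strategy collapses all values seen in one month into one value.
Reduction = Callable[[List[int]], int]


@dataclass
class TemporalType:
    name: str
    reduce: Reduction  # the strategy is defined in the data type


@dataclass
class TemporalValue:
    dtype: TemporalType
    # (month, day, value valid from that date on), sorted by date.
    # Assumption: the first entry starts on January 1st.
    changes: List[Tuple[int, int, int]]

    def in_month(self, month: int) -> int:
        # Value in effect on the first day of the month ...
        at_start = [v for (m, d, v) in self.changes if (m, d) <= (month, 1)]
        # ... plus every change that happens later in that month.
        during = [v for (m, d, v) in self.changes if m == month and d > 1]
        return self.dtype.reduce(at_start[-1:] + during)


def monthly(body):
    """Run body once per month (1..12), with temporal arguments reduced."""
    def run(*temporals: TemporalValue) -> List[int]:
        return [
            body(month, *(t.in_month(month) for t in temporals))
            for month in range(1, 13)
        ]
    return run


salary_type = TemporalType("Salary", reduce=lambda values: values[-1])  # last value wins
salary = TemporalValue(salary_type, [(1, 1, 3000), (7, 15, 3300)])


@monthly
def contribution(month, salary_in_month):
    # Inside the body the temporal value is already a plain number.
    return salary_in_month // 10


print(contribution(salary))
# [300, 300, 300, 300, 300, 300, 330, 330, 330, 330, 330, 330]
```

With a different strategy on the type, for example min instead of “last value wins”, the raise on July 15th would only show up from August on, without touching the body of the function.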
Wrap Up
So, to summarize: use a functional language as much as possible. Use constraints to “simplify” it for your users instead of building separate structures, because the constraining can be undone as users become more proficient and because it allows the modular composition of existing extensions.