The Evolution of Decision Tables
A little study in language design
Many domain experts like tables as a way to represent decisions and calculations. They perceive them to be more readable than “code”. The popularity of Excel is certainly at least partly due to this effect. The OMG’s DMN also uses tables extensively for representing decisions (so-called decision tables). There are many forms of decision tables, and we have been using them in our DSLs for a long time.
Their expressiveness has evolved quite a bit, though. In this article I recount the evolution of decision tables based on input from our users. It is a nice example of language design, and a good illustration of why it is useful to be able to evolve the language when you are dealing with business people.
Basic Two-Dimensional Decision Tables
Years ago, we implemented the basic two-dimensional decision table in mbeddr, our extensible version of C optimized for embedded programming:
This table is an expression, so you can use it wherever C expects an expression. It has a set of Boolean conditions as row and column headers. The conditions themselves are independent of each other, but usually each dimension addresses one variable (spd vs. alt). So essentially, this table represents a set of nested if statements over two criteria.
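To make the nested-if reading concrete, here is a minimal Python sketch of such a table. It is not how mbeddr implements it: the helper decide_2d, the function throttle, the thresholds, the cell values and the explicit default are all invented for illustration. Only the variable names spd and alt are taken from the example above.

```python
def decide_2d(row_conds, col_conds, cells, default):
    """Return the cell of the first row/column pair whose conditions both hold."""
    for r, row_ok in enumerate(row_conds):
        for c, col_ok in enumerate(col_conds):
            if row_ok and col_ok:
                return cells[r][c]
    return default  # assumption: the table has an explicit fallback value


def throttle(spd: int, alt: int) -> int:
    # Hypothetical thresholds and results; rows test spd, columns test alt.
    return decide_2d(
        row_conds=[spd < 100, spd >= 100],
        col_conds=[alt < 1000, alt >= 1000],
        cells=[[10, 20],
               [30, 40]],
        default=0,
    )
```

Because the table is just a function call that yields a value, it can be used wherever an expression is allowed, which mirrors the point made above.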
We have since implemented the exact same structure for KernelF, the functional base language we use as the basis of business DSLs. Also, in a customer project in the healthcare domain, we collaboratively built a version of this table that is explicitly defined over two numeric variables, which makes the syntax much less cluttered:
Generalized N-Criteria Table
This form of decision table gives up on arranging the two criteria along the two table dimensions and instead lists them as separate columns. Here is a simple example that calculates some kind of base fare based on the state (an enumeration type) the customer lives in and whether she is a member of some kind of club (a Boolean):
The tables are evaluated top-down (so more specific criteria have to be mentioned further up), and an empty field means “don’t care”. So the table calculates a fare of 1.00 if you live in BW, the fare in BY depends on whether you are in the club, and everybody else pays 1.20.
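The evaluation rule (first matching row wins, empty cells match anything) can be sketched in Python as follows. This is only an approximation of the semantics described above, not KernelF code. The two BY fares are made-up values, since the text only says that they depend on club membership, and the names State, ANY and base_fare are illustrative.

```python
from decimal import Decimal
from enum import Enum


class State(Enum):
    BW = "BW"
    BY = "BY"
    HE = "HE"


ANY = None  # stands for an empty cell, i.e. "don't care"

# Rows are (state, club, fare), listed from most to least specific.
BASE_FARE_ROWS = [
    (State.BW, ANY,   Decimal("1.00")),
    (State.BY, True,  Decimal("1.10")),  # hypothetical value
    (State.BY, False, Decimal("1.15")),  # hypothetical value
    (ANY,      ANY,   Decimal("1.20")),
]


def base_fare(state: State, club: bool) -> Decimal:
    """Evaluate the table top-down and return the first matching row's fare."""
    for row_state, row_club, fare in BASE_FARE_ROWS:
        if row_state in (ANY, state) and row_club in (ANY, club):
            return fare
    raise ValueError("no row matches")
```

Moving the catch-all row to the top would make every lookup return 1.20, which is why the more specific rows have to come first.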
Multiple Return Values
It turns out that often more than one return value depends on the criteria. So we allow multiple result columns, which are represented as a tuple value in terms of KernelF’s type system, as the explicit return type of the function below illustrates:
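In Python terms, the idea corresponds roughly to a function with an explicit tuple return type. The sketch below assumes two result columns, a base fare and a discount. The function name fare and all figures are illustrative and not taken from the original table.

```python
from decimal import Decimal
from typing import Tuple

# Each row: (state, club, base fare, discount); None means "don't care".
# All figures are illustrative assumptions.
ROWS = [
    ("BW", None,  Decimal("1.00"), Decimal("0.00")),
    ("BY", True,  Decimal("1.10"), Decimal("0.15")),
    ("BY", False, Decimal("1.10"), Decimal("0.00")),
    (None, None,  Decimal("1.20"), Decimal("0.00")),
]


def fare(state: str, club: bool) -> Tuple[Decimal, Decimal]:
    """Return (base, discount) from the first matching row."""
    for row_state, row_club, base, discount in ROWS:
        if row_state in (None, state) and row_club in (None, club):
            return (base, discount)
    raise ValueError("no row matches")
```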
Inline Alternatives
The columns in a decision table are joined by a logical “and”. However, sometimes you also want to express an “or”. For example, the 15% discount might apply to both Bavaria (BY) and Hesse (HE). We’ve implemented the comma operator to express this (note that you cannot directly use the existing || operator because the two enum literals are not Booleans; we could have overloaded the typing rules, though):
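One way to picture the comma operator is as a cell that holds several alternatives, any of which may match. The Python sketch below models such a cell as a tuple. The helper cell_matches, the assumption that the discount also requires club membership, and the fallback row are illustrative and not part of the original table.

```python
from decimal import Decimal


def cell_matches(cell, value) -> bool:
    """A cell is None ("don't care"), a single value, or a tuple of alternatives."""
    if cell is None:
        return True
    if isinstance(cell, tuple):
        return value in cell  # "or" across the alternatives within one cell
    return cell == value


# Rows: (state cell, club cell, discount). Figures are illustrative.
DISCOUNT_ROWS = [
    (("BY", "HE"), True, Decimal("0.15")),
    (None,         None, Decimal("0.00")),
]


def discount(state: str, club: bool) -> Decimal:
    for state_cell, club_cell, result in DISCOUNT_ROWS:
        # "and" across the columns of one row
        if cell_matches(state_cell, state) and cell_matches(club_cell, club):
            return result
    raise ValueError("no row matches")
```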
Ranges
In all examples so far, you don’t specify a comparison operator; the table implicitly compares with “equals”. However, as we have seen in the medical example above, we often want to compare against ranges of numerical values. For example, the base fare could change for children in Baden-Württemberg (BW):
Note that the entries in the age column are not really complete expressions; they take the column value as an implicit argument. However, it is important to be able to write them this way and not mention the value (age) every time, because that would increase verbosity and error-proneness.
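The implicit-argument idea resembles partially applied comparisons: the cell supplies the operator and the bound, and the table supplies the column value. Here is a hedged Python sketch. The helper lt, the age limit of 12 and the child fare of 0.50 are assumptions made for illustration.

```python
from decimal import Decimal
from typing import Callable


def lt(bound: int) -> Callable[[int], bool]:
    """Build a cell that reads "< bound"; the column value is filled in later."""
    return lambda value: value < bound


def cell_matches(cell, value) -> bool:
    if cell is None:        # empty cell: "don't care"
        return True
    if callable(cell):      # range cell: apply it to the column value
        return cell(value)
    return cell == value    # plain cell: implicit "equals"


# Rows: (state, age, fare). The child row is a hypothetical example.
ROWS = [
    ("BW", lt(12), Decimal("0.50")),
    ("BW", None,   Decimal("1.00")),
    (None, None,   Decimal("1.20")),
]


def base_fare(state: str, age: int) -> Decimal:
    for state_cell, age_cell, fare in ROWS:
        if cell_matches(state_cell, state) and cell_matches(age_cell, age):
            return fare
    raise ValueError("no row matches")
```

Note that the row never mentions age itself; it only states the comparison, which is the property the paragraph above argues for.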
Embedding the Tables into Context
In the examples we have seen so far, the tables are used as expressions. Since KernelF is a functional language, expressions are everywhere, and making tables expressions allows their use almost everywhere.
Top Level Tables
Very often, though, the tables are the only expression in a function. This works, of course, but if you look closely, there’s a bit of duplication: the column headers refer back to the argument. This is why we have provided a top-level version of decision tables where this duplication is avoided because the query columns act as parameter declarations (so they have to specify a type now):
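As a loose Python analogy, a top-level table can be modelled as a callable object whose query columns carry both a name and a type, so no separate function signature is needed. The class TopLevelTable and its constructor arguments are hypothetical, and the BY fare is again a made-up value.

```python
from decimal import Decimal


class TopLevelTable:
    """A table whose query columns double as typed parameter declarations."""

    def __init__(self, params, rows):
        self.params = params  # column name -> type, in column order
        self.rows = rows      # each row: (tuple of cells, result)

    def __call__(self, **args):
        if set(args) != set(self.params):
            raise TypeError(f"expected arguments {list(self.params)}")
        for name, expected in self.params.items():
            if not isinstance(args[name], expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        values = [args[name] for name in self.params]
        for cells, result in self.rows:
            # None is an empty cell and matches any value
            if all(c is None or c == v for c, v in zip(cells, values)):
                return result
        raise ValueError("no row matches")


base_fare = TopLevelTable(
    params={"state": str, "club": bool},
    rows=[
        (("BW", None), Decimal("1.00")),
        (("BY", True), Decimal("1.10")),  # hypothetical value
        ((None, None), Decimal("1.20")),
    ],
)
```

A call such as base_fare(state="BY", club=True) then behaves like a function call, although the author of the table never wrote a function.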
It is surprising how much of a difference this makes to many domain experts: they don’t have to explicitly understand the concept of a function (even though, semantically, the table of course is one).
Tuple Assignment
We use the decision tables in a DSL where we have function-like calculations, but the result data structure is populated by assigning to its members. Here is how the decision tables can be used in this context:
We first assign the (tuple) value returned by the decision table to a local value f (which is inferred to be of type [Currency, Percentage], a tuple type), and we then use the tuple’s native position-based indexing to assign to the result fields (base and discount are members of the Fare record). Again: this works, but it is verbose because of the intermediate value f and the subsequent position-based assignment. To solve this issue, we support assignment to tuple values if all of the elements of the tuple can be assigned to (i.e., are lvalues):
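Python happens to support the same pattern natively, so the difference between the two styles can be shown directly. In this sketch, fare_table is a stand-in for a decision table with two result columns and returns made-up values. The Fare record mirrors the one described above, with Decimal standing in for both Currency and Percentage.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass
class Fare:
    base: Decimal = Decimal("0")
    discount: Decimal = Decimal("0")


def fare_table(state: str) -> Tuple[Decimal, Decimal]:
    """Stand-in for a decision table with two result columns (made-up values)."""
    if state == "BW":
        return (Decimal("1.00"), Decimal("0.00"))
    return (Decimal("1.20"), Decimal("0.15"))


# Verbose form: intermediate value plus position-based indexing.
verbose = Fare()
f = fare_table("BW")
verbose.base = f[0]
verbose.discount = f[1]

# Compact form: assign directly to a tuple of assignable targets (lvalues).
compact = Fare()
compact.base, compact.discount = fare_table("BW")
```

Both forms leave the record in the same state; the compact one simply removes the intermediate name and the index arithmetic.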
One problem remains, though: the names and types of the result columns are somewhat redundant, because both can be automatically derived from the values we assign to. So why don’t we just put the assignment targets into the result columns?
Now, this is a nice compact notation. Our customers really liked that one. And justifiably so!
It’s Still All Expressions!
Despite the specific notation, we are still in the context of a “full” functional language. For example, you can have more complex expressions in the conditions …
… and you can use local values to factor out more complex calculations (there are even refactoring operations to extract local values):
We feel that this really combines the best of both worlds: expressive programming and “declarative”, table-based decision making.
Collaboration with our Customers
So why and how did this evolution happen? Basically, it is because of our customers. They get the point about language engineering in the sense that they drive us to create less verbose and more end-user friendly notations. It is our job then, as language engineers, to find solutions that satisfy the users, but also retain the integrity of the language (both KernelF and the DSL we are developing for them) in terms of orthogonality, composability and modular implementation. For example, it is perfectly ok to build special support into decision tables so they can “assign to” lvalues, but we don’t want the lvalues to know something about the decision tables. Sometimes this means that we cannot (don’t want to) implement our customer’s wishes to 100%.
It is a really good collaboration if we are willing to take the customers’ needs and wishes seriously, but they also understand our concerns about the language design and implementation (and the resulting slightly lower than 100% wish-fulfilment rate :-)).