Catala-specific questions

When to choose condition, rule vs booleans?

You may have noticed the keywords condition and rule in Catala scopes, for instance:

```catala
declaration scope Foo:
  input i content integer
  output x condition

scope Foo:
  rule x under condition i = 42 consequence fulfilled
```

The above is strictly equivalent to the following program:

```catala
declaration scope Foo:
  input i content integer
  output x content boolean

scope Foo:
  definition x equals false

  exception definition x under condition i = 42 consequence equals true
```

As the example shows, condition is syntactic sugar for declaring a boolean scope variable whose default value is false. In the body of the scope, you must use rule instead of definition to state when the condition is fulfilled (true) or not fulfilled (false). You can use exception and label on rules just as on definitions, but all rules are implicitly exceptions to a base case where the condition is false.
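
As an illustration of labels and exceptions on rules, here is a minimal sketch. The scope Eligibility, its variables and the label adult_rule are hypothetical names chosen for this example only.

```catala
declaration scope Eligibility:
  input age content integer
  input has_waiver content boolean
  output eligible condition

scope Eligibility:
  # Base rule: adults are eligible.
  label adult_rule
  rule eligible under condition age >= 18 consequence fulfilled

  # Exception to the base rule: a waiver removes eligibility.
  exception adult_rule
  rule eligible under condition has_waiver consequence not fulfilled
```

When neither rule applies (a minor without a waiver), eligible falls back to the implicit base case and is false.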

This behavior for condition and rule matches the legal intuition, making this syntax easier to read for pieces of programs with complex code to set a boolean variable.

Why do I have to cast values?

Some programming languages, like JavaScript, do not make any distinction between decimals and integers (there is a unique Number type). In others, like Python, the distinction is hidden because the compiler or interpreter inserts implicit casts whenever you use an integer where a decimal is needed. This approach eases programming as you do not need to worry about how the number is represented in memory; things just work.

However, this approach has a downside, precisely because the language decides for you how the number is represented in memory: you are not in control of how precise the computations are, nor of how values are cast from one representation to another. For instance, when casting a decimal to an integer, you will lose precision because of rounding or truncating; likewise, there are multiple ways to convert a number of months into a number of days depending on what you are computing.

The philosophy of Catala is to give you full control over those choices, at the expense of requiring explicit casting. Hence, Catala's base types (boolean, integer, decimal, money, date, duration) are strictly distinct and require explicit casts between them. Using a decimal where an integer is needed will yield a type error like the following:

```text
┌─[ERROR]─
│
│  I don't know how to apply operator + on types integer and decimal
│
├─➤ example.catala_en:
│    │
│ 13 │   1 + 2.0
│    │   ‾‾‾‾‾‾‾
│
│ Type integer coming from expression:
├─➤ example.catala_en:
│    │
│ 13 │   1 + 2.0
│    │   ‾
│
│ Type decimal coming from expression:
├─➤ example.catala_en:
│    │
│ 13 │   1 + 2.0
│    │       ‾‾‾
└─
```

This error can be fixed by tweaking 2.0 to integer of 2.0. See the "Literals" section of the language reference for more details about how to create literals with the correct type.
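
The sketch below shows the two directions in which the mismatch can be resolved, depending on which result type you want. The scope CastExample and its variable names are hypothetical.

```catala
declaration scope CastExample:
  output as_integer content integer
  output as_decimal content decimal

scope CastExample:
  # Convert the decimal operand so that both sides are integers.
  definition as_integer equals 1 + (integer of 2.0)
  # Or convert the integer operand so that both sides are decimals.
  definition as_decimal equals (decimal of 1) + 2.0
```

Which cast to write is exactly the choice Catala leaves to you: the first keeps the computation in integers, the second in decimals.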

Why is there a distinct money type?

Correctly performing financial computations is hard. The precision and rounding rules required may vary from application to application, and should be balanced with performance requirements.

This is why Catala strictly separates the money type from integers and decimals. Using money values in Catala, along with explicit casting (see above), lets the compiler warn you when you're mixing money and non-money numbers in your computation. Money, with its currency unit, behaves like a dimensional unit in a physical formula: the units have to check out coherently.

Once we have these separate money values, we have to give them a behavior that accommodates most uses and respects the philosophy of the language. Hence, money values in Catala are an integer number of cents. Multiplying a money value by a decimal can yield a value that is not an exact number of cents; in that case Catala rounds the result to the nearest cent.

If you want more precision for values representing money amounts, you should represent them as decimals and cast them into (with the occasional rounding) and out of money when you need to.
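
To make the difference concrete, the following sketch computes the same quantity twice: once entirely with money values, and once with a decimal intermediate. The scope YearlyFee and its variable names are hypothetical; the amounts are taken from the rounding example further down this page.

```catala
declaration scope YearlyFee:
  output rounded_each_step content money
  output rounded_once content money

scope YearlyFee:
  # Money arithmetic: the product $149.26 * 0.5% is rounded to the nearest
  # cent before being multiplied by 12.
  definition rounded_each_step equals ($149.26 * 0.5%) * 12.0
  # Decimal arithmetic: nothing is rounded until the final cast to money.
  definition rounded_once equals
    money of ((decimal of $149.26) * 0.5% * 12.0)
```

Under the rounding rules described above, the first definition should give $9.00 (that is, $0.75 times 12) and the second $8.96 (8.9556 rounded once), so the choice of representation is visible in the result.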

How to round money up or down to a specific precision?

In Catala, monetary values are represented as an integer number of cents (see above). A calculation with the Catala money type always results in an amount rounded to the nearest cent. This means that, when performing intermediate computations on money, the programmer must consider rounding at each step. This aims at making review by domain experts easier, since for each intermediate value, they can follow along and perform example computations with a simple desk calculator.

To round to the nearest monetary unit, use round of. To round an amount to an arbitrary multiple of a cent, multiply the amount by a scaling factor, round the result, and divide it by the same factor. For instance, (round of $4.13 * 10.) / 10. = $4.10. To round up or down, add or subtract half a unit before rounding. For instance, to round down to a cent the result of a multiplication between a money value and a decimal, you have to cast to decimal, subtract half a cent, and round back to the nearest cent by casting again to money: `money of ((decimal of $149.26) * 0.5% - 0.005) = $0.74` and not `$0.75`. This is a bit of a mouthful, but can be adapted to any desired rounding rule. Encapsulate these computation tidbits inside a global function to reuse them across your codebase. See the language reference for more details.
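
As one possible sketch of such a reusable helper, the global function below wraps the round-down expression from the example above. The name multiply_rounding_down and its parameter names are hypothetical.

```catala
declaration multiply_rounding_down content money
  depends on
    amount content money,
    rate content decimal
  equals
    # Compute in decimals, subtract half a cent, then let the cast to money
    # round to the nearest cent.
    money of ((decimal of amount) * rate - 0.005)
```

With this definition, multiply_rounding_down of $149.26, 0.5% should give $0.74, as in the example above. When the exact product is already a whole number of cents, the subtraction lands exactly halfway between two cents, so check how ties are rounded before relying on this helper for such inputs.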

This technique can also be reused for decimal values that require rounding up or down to a specific precision.

Why mathematical integers and decimals instead of machine integers and floats?

Precision! Machine integers have a maximum and minimum value and wrap on overflow or underflow. Floating-point values cannot represent arbitrarily small intervals between numbers and lose precision by accumulating errors, computation after computation. These weaknesses are usually ignored by computer scientists, as machine integers and floating-point numbers are precise enough for most applications, but financial computations for automatic administrative decision-making should not fail, even rarely, due to these low-level problems.

Hence, Catala uses the GMP library to feature true mathematically sound integers and decimal values whose representation in memory grows as more and more precision is needed from them. This choice adds some performance overhead but GMP includes state-of-the-art optimizations tailored for every architecture using assembly tricks to lower or even cancel this overhead for computations that don't really require the extra precision.

For instance, under the hood, Catala's decimals are actually GMP rationals: irreducible fractions made of two GMP infinite-precision integers.
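
A small sketch of what this representation buys you: with rationals, dividing by three and multiplying back by three is exact, whereas floating-point arithmetic gives no such guarantee in general. The scope ExactArithmetic and its variable are hypothetical names.

```catala
declaration scope ExactArithmetic:
  output third content decimal

scope ExactArithmetic:
  definition third equals 1.0 / 3.0
  # The fraction 1/3 is stored exactly, so multiplying back gives exactly 1.
  assertion (third * 3.0) = 1.0
```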

How to create dates and durations from integers?

To get a duration, simply multiply the desired duration unit by the integer or decimal:

```catala
1 month * 24 = 24 month
```

```catala
declaration duration_of_days content duration
  depends on number_of_days content integer
  equals
    1 day * number_of_days
```

However, you cannot build a date by directly concatenating the integer values of its year, month and day. Instead, convert the integer values to durations, and add the durations to a starting date:

```catala
declaration date_of_YMD content date depends on
  year_number content integer,
  month_number content integer,
  day_number content integer
  equals
    |0000-01-01| +
      1 year * year_number +
      1 month * (month_number - 1) +
      1 day * (day_number - 1)
```

Using this helper function helps you avoid building invalid dates; for instance date_of_YMD of 2025,4,31 = |2025-05-01| because there are only 30 days in April, so the 31st day counted from April 1st falls on May 1st.

Why are there no strings?

The absence of strings in Catala is a feature, not a bug. Catala is meant to be a domain-specific programming language for computations described in legal texts, one that lawyers understand. If you find a legal text that requires actual string manipulation operations to be automated, please tell the Catala team! In the absence of such a legal text, the decision was made not to include strings, for several reasons.

First, the common operations in legal texts that could be done with strings can be done better with other Catala features. For instance, it is better to represent tags and codes with enumerations, which can carry a payload and have a built-in exhaustiveness check in pattern matching. We thus advise you to think your problem through and check whether it really requires strings as a first-class value type in Catala.
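
As a sketch of this advice, the example below replaces what might otherwise be a string code for a document type with an enumeration carrying a payload. The enumeration DocumentKind, its cases and the function is_valid_on are hypothetical names for illustration.

```catala
declaration enumeration DocumentKind:
  -- Passport
  -- ResidencePermit content date

declaration is_valid_on content boolean
  depends on
    document content DocumentKind,
    today content date
  equals
    match document with pattern
    # Hypothetical rule: passports are always accepted.
    -- Passport: true
    # The payload holds the expiry date of the permit.
    -- ResidencePermit content expiry: today <= expiry
```

If a new case is later added to DocumentKind, the exhaustiveness check flags every match that forgets to handle it, which a string comparison cannot do.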

Second, the preferred way of performing low-level, computation-intensive operations that are not described by legal text but are used in a Catala program is simply to do them outside of Catala and provide their output as inputs of a Catala scope, or to define an external module. See the language reference for more details.

Third, including string manipulations in the Catala runtime would heavily increase the size and complexity of the runtime, as it would probably require a fully-fledged regexp library as a dependency. Moreover, this regexp library would have to be available in every backend programming language that Catala supports, to ensure that the semantics of string operations are exactly the same whatever the backend. This is a lot of work, and later maintenance, for the Catala team.

How do I add an exception from outside a scope?

Sometimes, the law is quite convoluted. For instance, article 1731 bis of the French tax code describes how to compute fines for tax fraud or late income declaration. This article specifies that, when computing the fine amount, the main tax computation should be tweaked to neutralize certain deductions. If you were to implement this in Catala, you would have two scopes FinesComputation and IncomeTaxComputation; article 1731 bis requires you to call IncomeTaxComputation from FinesComputation while tweaking certain computation rules inside IncomeTaxComputation.

This pattern amounts to declaring an exception to a variable of IncomeTaxComputation from outside IncomeTaxComputation. It turns out there is a specific Catala feature to handle this case, extending exceptions in a principled way across scopes: context variables. See the language reference for more details.

Do I have to repeat every field in a struct when I want to only change one of them?

No! See "Updating structs" in the language reference for more details.

How are dates and durations handled?

What is the result of Jan 31st + 1 month? Is it Feb 28th, Feb 29th or March 1st? This question reveals the subtle tricks behind date computations in the Gregorian calendar. The variable number of days in a month and leap years cause ambiguities in many date computations specified in the law. The way these ambiguities are resolved influences the outcome of automated administrative decisions, which is why we have been very cautious in Catala about this topic. Our motivations and design choices are outlined in a scientific article; in summary, we had to implement a custom date computation library that lets the user choose how to round ambiguous date computations.

Otherwise, dates in Catala are standard dates in the Gregorian calendar, precise to the day (and not more). Durations are a combination of a number of days, months and/or years. See the language reference for more details.

Which programming languages can Catala target?

The Catala compiler natively targets:

  • C (standard C89);
  • Python;
  • Java;
  • OCaml.

From the OCaml backend, JavaScript can be targeted through js_of_ocaml.

The runtimes for these backends have two dependencies outside the standard library of the target programming languages: GMP for multi-precision arithmetic and our custom date computation library.