Build configurations

Abstract

The popular practice of having only two different kinds of builds (Debug and Release) is shown to be inadequate. Three to four different kinds of builds are proposed instead, allowing more thorough error checking during development, better performance of the final system on production, and potentially better performance when running tests on a build server.

(Useful pre-reading: About these papers)

The Issue

In software development we often want our creations to have different characteristics under different circumstances; for example:

- Optimizations:
  - While developing we usually do not want them, because they interfere with debugging.
  - On the final shipped product we want them, because they make it run faster.
- Preconditions, assertions, and other kinds of runtime checks:
  - While developing we want them, because they help us catch bugs.
  - On the final shipped product we do not want them, because they slow it down.

The History

In C and C++, different behavior has historically been achieved by means of compiler options controlling optimization, and preprocessor macros controlling conditional compilation. The standard stipulates an NDEBUG macro which, if defined, causes assertions to compile to nothing. This means that software systems written in C and C++ generally have two builds: a Debug build, for use while debugging, and a Release build, for shipping or deploying to production.

When Java came along, it was decided that a single build should be good for everyone: conditional compilation was abolished¹, all optimization-related choices were delegated to the Just-In-Time compiler (JITter), assertions were made to always compile into the binaries, and the -enableassertions switch was added to the virtual machine for controlling at runtime, rather than at compile time, whether assertions should execute or not. This essentially gives Java developers the ability to choose between a debug run and a release run, as opposed to a debug build and a release build.

C# has brought back a compiler option for controlling optimization, and conditional compilation by means of a simplified version of the preprocessor macros (called "define constants" in C#) and the Conditional attribute. Two different kinds of builds (called Build Configurations) are predefined: Debug and Release. The build system offers great flexibility in defining additional build configurations, but C# developers rarely bother with that.

The Problem

Since developers rarely bother with defining any build configurations besides the predefined ones, the vast majority of dotnet projects use only the two predefined ones: 'Debug' and 'Release'. (Many projects actually use only 'Debug', but let us pretend we never heard of them.) Thus all different needs and usage scenarios are being shoe-horned to fit into one of those two options. For example:

- There is only one configuration that can be tested, namely the 'Debug' configuration, which means that this configuration is used not only for running tests on a developer's computer, but also for running tests on the build server.
- There is only one configuration of a library that can be published, namely the 'Release' configuration, which means that this configuration is used not only in production scenarios, but also in development scenarios, where software is being developed that is making use of a published library.

This is problematic because:

- It slows down test runs on build servers.
  - The 'Debug' configuration is unoptimized, to avoid interference with debugging; however, by common practice, the same 'Debug' configuration is used for running tests on the build server, because that is the only configuration that can be tested; thus, the world is full of build servers executing unoptimized tests, exercising unoptimized code.
  - If the tests and the code they are exercising are long-running and computationally expensive, lack of optimization will make them run even slower.
  - However, virtually nothing ever gets debugged on a build server, so there is virtually never a need to have it running unoptimized code.
- It slows down the software on production.
  - When a library is published as a package, the configuration that gets packaged is, by common practice, the 'Release' configuration. This configuration executes preconditions, since it may be referenced by a project under development; however, at some point, that project is released to production together with the library, where the library is still executing preconditions.
  - This amounts to nothing but a waste of clock cycles, because:
    - By the time the software using the library gets shipped to production, it has been tested and can be reasonably assumed to be invoking the library only in valid ways.
    - Even if the software did happen to make invalid use of the library on production, it makes very little difference whether the resulting catastrophic failure would be signaled by a precondition failure or by some index out of range exception further down.
- Many preconditions are omitted in the name of performance.
  - Library programmers often refrain from asserting certain preconditions if they suspect them to be even slightly expensive, in light of the fact that preconditions in a library will always be executing, even on production.
  - An extreme example to illustrate this scenario is the binary search function, which should, in principle, be enforcing the precondition that the array to search must be sorted. Yes, this means guarding an O(log₂(N)) operation with an O(N) operation. This is fine during development, because we test with small amounts of data anyway, but is a terrible thing to be doing on production; thus, there is virtually no library in existence with such a precondition in it, despite the fact that it is necessary.
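To make the cost in the binary search example concrete, here is a minimal sketch of a binary search guarded by a sortedness precondition. The names Searching, IsSorted, and BinarySearch are hypothetical and are not taken from any real library; the point is only that the guard visits every element while the search itself visits a handful.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example class; not part of any real library.
public static class Searching
{
    // O(N) check: returns true if the items are in ascending order.
    private static bool IsSorted<T>(IReadOnlyList<T> items)
    {
        Comparer<T> comparer = Comparer<T>.Default;
        for (int i = 1; i < items.Count; i++)
        {
            if (comparer.Compare(items[i - 1], items[i]) > 0)
                return false;
        }
        return true;
    }

    // O(log N) search guarded by the O(N) precondition above.
    // Returns the index of 'value', or -1 if it is not present.
    public static int BinarySearch<T>(IReadOnlyList<T> items, T value)
    {
        // This check always executes, in every build of the library.
        if (!IsSorted(items))
            throw new ArgumentException("The list to search must be sorted.", nameof(items));

        Comparer<T> comparer = Comparer<T>.Default;
        int low = 0;
        int high = items.Count - 1;
        while (low <= high)
        {
            int mid = low + (high - low) / 2;
            int order = comparer.Compare(items[mid], value);
            if (order == 0)
                return mid;
            if (order < 0)
                low = mid + 1;
            else
                high = mid - 1;
        }
        return -1;
    }
}
```

With only 'Debug' and 'Release' available, a library author has to choose between shipping this check to production or leaving it out entirely.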

The Solution

From the description of the problem it becomes evident that preconditions must be controlled separately from assertions, and both of those must be controlled separately from optimizations. Therefore, four different build configurations can be thought of:

- A 'Debug' configuration
  - Everyone is more or less already familiar with this. It is meant for use by a developer when testing and debugging software on their local computer. Assertions are enabled, preconditions are enabled, and optimizations are disabled, because they interfere with debugging.
- An 'Optimized' configuration
  - This is the same as Debug except that optimizations are enabled. It is meant to run on the build server, where we do not usually debug, so there is no reason to be running unoptimized software. Note that this configuration is only useful for projects that suffer from long-running, computationally expensive tests; projects that do testing right, with very short and lightweight tests, are likely to see a performance degradation from this configuration, due to the additional JITting overhead².
- A 'Develop' configuration
  - This configuration is only applicable to libraries, not to applications. It is identical to what was previously understood as the Release configuration, where optimizations are enabled, assertions are disabled, and preconditions are enabled; however, it is only meant to be used when developing software that makes use of the library, not for shipping to production, because we do not want to be executing preconditions on production.
- A 'Release' configuration
  - This is similar to the Develop configuration, except that preconditions are also disabled. It is the configuration which is meant for shipping to production. Note that the benefit of using this configuration is not just maximum performance on production; it is also the freedom to add as many preconditions as necessary to the library, knowing that they cost nothing on production.

Here is the feature matrix:

Feature	Debug	Optimized	Develop	Release
Optimizations disabled	✅	⬜	⬜	⬜
Assertions enabled	✅	✅	⬜	⬜
Overflow checking	✅	✅	⬜	⬜
Preconditions enabled	✅	✅	✅	⬜
Code analysis	✅	✅	⬜	⬜

Here is an excerpt from a .csproj file implementing the above matrix, assuming that we have defined our own set of assertion functions, dependent upon an ASSERTIONS define-constant, and our own set of precondition functions, dependent upon a PRECONDITIONS define-constant.

<Choose>
  <When Condition="'$(Configuration)'=='Debug'">
    <PropertyGroup>
      <Optimize>False</Optimize>
      <DefineConstants>$(DefineConstants);PRECONDITIONS;ASSERTIONS</DefineConstants>
      <CheckForOverflowUnderflow>True</CheckForOverflowUnderflow>
      <EnableNETAnalyzers>True</EnableNETAnalyzers>
      <DebugType>Full</DebugType>
    </PropertyGroup>
  </When>
  <When Condition="'$(Configuration)'=='Optimized'">
    <PropertyGroup>
      <Optimize>True</Optimize>
      <DefineConstants>$(DefineConstants);PRECONDITIONS;ASSERTIONS</DefineConstants>
      <CheckForOverflowUnderflow>True</CheckForOverflowUnderflow>
      <EnableNETAnalyzers>True</EnableNETAnalyzers>
      <DebugType>Full</DebugType>
      <OutputPath>bin\$(Configuration)\</OutputPath>
    </PropertyGroup>
  </When>
  <When Condition="'$(Configuration)'=='Develop'">
    <PropertyGroup>
      <Optimize>True</Optimize>
      <DefineConstants>$(DefineConstants);PRECONDITIONS</DefineConstants>
      <CheckForOverflowUnderflow>False</CheckForOverflowUnderflow>
      <EnableNETAnalyzers>False</EnableNETAnalyzers>
      <DebugType>Portable</DebugType>
      <OutputPath>bin\$(Configuration)\</OutputPath>
    </PropertyGroup>
  </When>
  <When Condition="'$(Configuration)'=='Release'">
    <PropertyGroup>
      <Optimize>True</Optimize>
      <DefineConstants>$(DefineConstants)</DefineConstants>
      <CheckForOverflowUnderflow>False</CheckForOverflowUnderflow>
      <EnableNETAnalyzers>False</EnableNETAnalyzers>
      <DebugType>Portable</DebugType>
      <Deterministic>True</Deterministic>
      <DeterministicSourcePaths>True</DeterministicSourcePaths>
    </PropertyGroup>
  </When>
  <Otherwise>
  ...
  </Otherwise>
</Choose>
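
The excerpt above presupposes assertion and precondition functions that depend on the ASSERTIONS and PRECONDITIONS define-constants. A minimal sketch of what such functions might look like, using the Conditional attribute mentioned earlier, follows. The names Checks, Assert, Require, Example, and Average are hypothetical, as is the choice of exception types.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper class; the names are illustrative only.
public static class Checks
{
    // Calls to this method are omitted by the compiler unless
    // ASSERTIONS is defined where the calling code is compiled.
    [Conditional("ASSERTIONS")]
    public static void Assert(bool condition, string message = "Assertion failed")
    {
        if (!condition)
            throw new InvalidOperationException(message);
    }

    // Calls to this method are omitted by the compiler unless
    // PRECONDITIONS is defined where the calling code is compiled.
    [Conditional("PRECONDITIONS")]
    public static void Require(bool condition, string message = "Precondition failed")
    {
        if (!condition)
            throw new ArgumentException(message);
    }
}

// Hypothetical library code using the helpers.
public static class Example
{
    public static double Average(int[] values)
    {
        // Executes in 'Debug', 'Optimized', and 'Develop'; vanishes in 'Release'.
        Checks.Require(values.Length > 0, "values must not be empty");

        long sum = 0;
        foreach (int value in values)
            sum += value;
        double result = (double)sum / values.Length;

        // Executes in 'Debug' and 'Optimized' only.
        Checks.Assert(!double.IsNaN(result), "result must be a number");
        return result;
    }
}
```

Because the Conditional attribute acts on call sites, a call that gets omitted costs nothing at all: not even its argument expressions are evaluated. This is what allows an expensive precondition to be written without affecting the 'Release' build.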

If we follow this build configuration scheme, then each time we publish a library we must generate two packages: the 'Develop' package, and the 'Release' package.

- The 'Develop' package is to be referenced by software under development.
- The 'Release' package is to be referenced by software that is being shipped to production.

The generation of two different packages for a single library can be accomplished by building twice, once for each configuration, and constructing the assembly name as follows:

  <AssemblyName>$(MSBuildProjectName)-$(Configuration)</AssemblyName>

This way, instead of a single package called MyPackage we create two packages: MyPackage-Develop and MyPackage-Release.

There may be a better way to build a library, so that only one package gets generated, containing both the develop and release builds, and the right binaries somehow end up in the right output directory; however, I have not been able to figure that out yet. If you know how to do it, please let me know.

For any build configuration of a certain module (either an application or a library), the build configuration of the libraries it uses can be determined using the following table:

Build configuration of module using library	Build configuration of library
'Debug'	'Develop'
'Optimized'	'Develop'
'Develop'	'Develop'
'Release'	'Release'

Note that the 'Develop' configuration of a module could, in theory, make use of the better-performing 'Release' configuration of a library, instead of the 'Develop' configuration; however, that can only work if the module does not expose the library, or if there is no other module in the solution that uses the 'Develop' configuration of the library. Otherwise, there are going to be build errors saying that a certain type exists in both the develop and release configuration of a certain library.

Here is an excerpt of a .csproj file implementing the above table:

<PropertyGroup>
  <PackagesConfiguration Condition="'$(Configuration)'=='Debug'"    >Develop</PackagesConfiguration>
  <PackagesConfiguration Condition="'$(Configuration)'=='Optimized'">Develop</PackagesConfiguration>
  <PackagesConfiguration Condition="'$(Configuration)'=='Develop'"  >Develop</PackagesConfiguration>
  <PackagesConfiguration Condition="'$(Configuration)'=='Release'"  >Release</PackagesConfiguration>
</PropertyGroup>

Then, packages can be referenced as follows:

  <PackageReference Include="MyPackage-$(PackagesConfiguration)" Version="..." />

Conclusions

- An 'Optimized' build configuration has been proposed, which in my experiments roughly halves the time it takes to run slow, computationally expensive tests on build servers. (Not needed by projects with small, fast tests.)
- A 'Develop' build configuration for libraries has been proposed, intended for use during development of software using the libraries, but not for shipping to production. It has preconditions enabled, in order to catch bugs in the software using the libraries.
- A 'Release' build configuration for libraries has been proposed, intended for shipping to production. It improves performance by not executing preconditions.
- Under the proposed scheme, preconditions in libraries no longer incur a performance penalty on production, so programmers can apply them more liberally, leading to more robust software.
- Under the proposed scheme, when a library is published, two packages should be generated: the 'Develop' package, for developing software that uses the library, and the 'Release' package, for shipping to production.

Cover image generated by ChatGPT, and then retouched by michael.gr. The prompt used was: "Please give me an image conveying the concept of highly complex and highly technical software development. Make it in landscape format, of photographic quality, with warm colors" and then "Please make the programmer look more senior".

¹ The creators of Java made it so that the generation of code within an if() statement controlled by a compile-time constant is suppressed if that constant evaluates to false, but they intentionally deprived developers of the ability to specify the value of a compile-time constant via external means, such as the command line of the compiler. They defended this choice by saying that there is inherent merit in being able to guarantee that in Java every compilation unit has one and only one set of semantics. The usefulness of this merit is debatable. It can be argued that this is simply Java treating developers the same way that Apple has been treating users: as idiots.
² In C# most optimizations are performed by the Just-In-Time compiler (JITter), and people say that the optimizations performed by the language compiler do not make much of a difference. However, my experiments have shown otherwise: computation-intensive code tends to run twice as fast when optimizations are enabled as when they are not, and this difference can be observed on a build server, so it is unaffected by any optimization choices that the JITter might make due to a debugger being attached or not. I suspect that this is happening because the language compiler saves the "optimize" flag in the binary, and the JITter subsequently observes this flag.
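
One way to probe the suspicion in the second footnote is to inspect the System.Diagnostics.DebuggableAttribute on a compiled assembly, whose IsJITOptimizerDisabled property reports whether JIT optimizations are requested to be disabled for that assembly. The sketch below is only a probe: that this attribute is the flag responsible for the observed difference is an assumption, and the names OptimizationProbe and IsJitOptimizerDisabled are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper; the names are illustrative only.
public static class OptimizationProbe
{
    // Returns true if the assembly carries a DebuggableAttribute
    // that asks the JIT compiler not to optimize its code.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        var attribute = assembly.GetCustomAttribute<DebuggableAttribute>();
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }
}
```

Calling OptimizationProbe.IsJitOptimizerDisabled(typeof(OptimizationProbe).Assembly) from builds made with each configuration, and comparing the results, would show whether the flag differs between them.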