⚙️ From Source Code to Applications

Context and Objectives

🎯 What: advice and best practices for building your software within the framework of reproducible research.

Warning

By "building your software", we mean the entire pipeline or process from a set of source files to a single executable or a complete application. We consider all types of programming languages, from compiled languages like C++ and Fortran to interpreted ones like R and Python. However, most of these practices are primarily relevant to compiled languages.

Why: software must be built consistently, in transparent and repeatable ways, to ensure reliable and reproducible results.

  • To ensure that your software can be built consistently from source code to executable or application (this may concern compiled or interpreted languages)
  • To provide a reliable toolchain to (try to) guarantee that others can reproduce this build process and, consequently, reproduce the scientific results generated by the software.

👥 Audience: any collaborator who contributes to the development of software used in scientific workflows.

🏁 Prerequisites

💡 Advice and Best Practices

Automate as much as possible of the build chain (the classical 'configure-build-install' process, followed by testing and packaging).

Try to provide a "one-command" build

Ensure the process is not dependent on manual steps.

Avoid relying on local configurations; aim to be able to reproduce the build in different contexts.

Script. All steps must be scripted (saved in files) and versioned.
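As an illustration of a scripted, "one-command" build, here is a minimal sketch of a single entry point that runs every step in order and stops at the first failure. The file name `build.py`, the `--run` flag, the helper names and the CMake-based commands are assumptions made for this example (they require CMake 3.20 or later); adapt them to the tools your project actually uses.

```python
"""build.py: a hypothetical single entry point for the whole build chain."""
import subprocess
import sys

# Assumed CMake-based steps (illustrative only); replace with your own commands.
STEPS = [
    ("configure", ["cmake", "-S", ".", "-B", "build"]),
    ("build", ["cmake", "--build", "build"]),
    ("install", ["cmake", "--install", "build", "--prefix", "install"]),
    ("test", ["ctest", "--test-dir", "build", "--output-on-failure"]),
]


def run_steps(steps, runner=subprocess.run):
    """Run each step in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    done = []
    for name, command in steps:
        print(f"[{name}] {' '.join(command)}")
        result = runner(command)
        if result.returncode != 0:
            print(f"[{name}] failed with exit code {result.returncode}")
            break
        done.append(name)
    return done


def dry_run(command):
    """Stand-in runner: executes nothing and reports success."""
    return subprocess.CompletedProcess(command, returncode=0)


if __name__ == "__main__":
    # Dry run by default so the sketch is safe to try; pass --run to execute.
    runner = subprocess.run if "--run" in sys.argv[1:] else dry_run
    completed = run_steps(STEPS, runner)
    if len(completed) != len(STEPS):
        sys.exit(1)
```

With such a script kept under version control, `python build.py --run` becomes the only command a collaborator needs, and the non-zero exit code on failure lets a continuous integration job reuse it unchanged.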

Version everything. Put not only the source code under version control, but also the documentation, the scripts used to build and test, and so on.

Document clearly.

Provide clear instructions (in README files, documentation site, ...) describing

  • Prerequisites, dependencies and environment setup
  • How to build, test, and run the software.

Test/Validate/Verify

Validation is critical to ensure that your software behaves as expected. While setting up tests can be tedious, it is indispensable and will save time in the long run by catching bugs early and ensuring the stability of your software.

  • test often and early (do not wait to implement tests, it will be much more painful!)
  • add automated tests
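To make "add automated tests" concrete, here is a minimal sketch in the style expected by pytest (functions whose names start with `test_` and that use plain `assert`). The `trapezoid` function is a hypothetical stand-in for a piece of your own scientific code; the point is only to show a check against a known analytical value with an explicit tolerance.

```python
import math


def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def test_linear_function_is_integrated_exactly():
    # The trapezoid rule is exact for linear functions, up to rounding.
    result = trapezoid(lambda x: 2 * x + 1, 0.0, 1.0)
    assert math.isclose(result, 2.0, rel_tol=1e-9)


def test_sine_matches_analytical_value_within_tolerance():
    # The integral of sin over [0, pi] is exactly 2; the rule approximates it.
    result = trapezoid(math.sin, 0.0, math.pi)
    assert math.isclose(result, 2.0, rel_tol=1e-5)
```

Saved in a file such as `test_integration.py` (an assumed name), these checks can be collected and run by `pytest` on every change, which is what makes early and frequent testing cheap.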

Manage Dependencies and capture environments

Pin the exact versions of everything the build chain needs, and provide recipes to reproduce your build environment (Docker, Guix, requirements.txt, ...).
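A simple way to enforce pinning is to check it automatically. The sketch below scans the text of a pip-style requirements file and reports every entry that does not use an exact `==` pin. The function name `find_unpinned` and the example entries are assumptions for illustration, and the parsing is deliberately simplified (it ignores URLs, environment markers and similar cases).

```python
def find_unpinned(requirements_text):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Illustrative content only; not taken from a real project.
EXAMPLE = """\
numpy==1.26.4
scipy>=1.10   # a range, not an exact pin
matplotlib
"""

if __name__ == "__main__":
    for entry in find_unpinned(EXAMPLE):
        print(f"not pinned: {entry}")
```

Run as part of the automated checks, such a script flags a loosely specified dependency before it silently changes the build environment.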

Use dedicated tools. There are specialized, efficient tools for building your software.

👉 Use these tools to avoid reinventing the wheel. Each language has its own configure/build tools. Find them, learn them and use them!

Containerization forces you to script and automate every step of the build and deployment process. It makes your software more portable, easier to share, and consistent across different environments, which makes it a good practice for reproducibility.

Consider packaging your software, even if it seems like extra work at first. Packaging forces an organized approach and leads to better quality, portability, and reproducibility in the long run. It is not wasted time; it is an investment in making your software more maintainable and reusable.

Here is a non-exhaustive set of tools to help you build your software.

| Purpose | Tools / Standards | Links |
|---|---|---|
| Build systems | CMake, Meson, Autotools, Ninja | CMake · Meson · GNU Autotools · Ninja |
| Python packaging | pyproject.toml, Setuptools, Poetry | PyPA pyproject · Setuptools · Poetry |
| Build environments | Docker, Singularity, Guix, Mamba | Docker · Apptainer · Guix · Micromamba |
| Continuous Integration | GitLab CI/CD | GitLab CI/CD |
| Test | CTest, CDash, pytest | CTest · CDash · pytest |

Going Further