This guide describes how to set up a development environment for building and running the Nextclade CLI and Nextalign CLI executables, how to contribute to the Nextclade C++ code, and how to maintain and release the CLI tools. If you are interested in the Nextclade Web Application, see: "Developer's guide: Nextclade Web".
Nextclade CLI and Nextalign CLI are executables written in C++. The build system is based on CMake. Most of the algorithm code is kept in separate static library CMake modules, and the executable CMake modules link against these libraries. The default build scripts use the Conan package manager to manage the dependencies. However, this is not mandatory: you can obtain the dependencies any way you like, as long as they are discoverable by CMake, as in any CMake project.
There is a convenience Makefile in the root of the project that launches the build scripts. These scripts are used by project maintainers for the routine development and maintenance as well as by the continuous integration system.
The easiest way to start development is to use the included docker container option, described in the next section. The same environment can, of course, be set up on a local machine, but that requires some manual steps, which are also described further below.
- Get docker
- Run:
git clone --recursive https://github.com/nextstrain/nextclade
cd nextclade
make docker-dev
💡 The instructions below and the provided dev scripts are for convenience only and by no means are mandatory. The project is based on CMake, so if you are familiar with CMake, you don't need further instructions and can build the project as usual - just run CMake CLI or CMake GUI and point them to the root of the project.
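As a rough sketch of the "build as usual" route, a standard out-of-source CMake invocation might look like the following. It assumes that you run it from the project root and that all dependencies are already discoverable by CMake. BUILD_DIR and DRY_RUN are hypothetical names used only in this example, and the dev scripts may pass additional options that are not shown here.

```shell
# Sketch: plain CMake build without the dev scripts.
# Assumptions: run from the project root; dependencies are discoverable by CMake.
# BUILD_DIR and DRY_RUN are hypothetical names used only in this example.
BUILD_TYPE="${CMAKE_BUILD_TYPE:-Debug}"
BUILD_DIR="${BUILD_DIR:-.build/${BUILD_TYPE}}"
DRY_RUN="${DRY_RUN:-1}"  # set DRY_RUN=0 to actually run the commands

# Print a command and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

cmake_build() {
  # Configure step, then build step.
  run cmake -S . -B "$BUILD_DIR" -DCMAKE_BUILD_TYPE="$BUILD_TYPE" &&
  run cmake --build "$BUILD_DIR"
}

cmake_build
```

By default the commands are only printed, so you can review them before running with DRY_RUN=0.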
- Install and configure the build tools
💡 Quick install for Ubuntu
You can install the required dependencies with:
sudo apt-get install bash \
  ccache \
  cmake \
  coreutils \
  file \
  gdb \
  g++ \
  gcc \
  make \
  python3 \
  python3-pip \
  python3-setuptools \
  python3-wheel
pip3 install --user --upgrade conan cppcheck
💡 Quick install for macOS
You need to install the Xcode command line tools. After that, you can install the remaining required dependencies using Homebrew and pip:
xcode-select --install
brew install ccache cmake coreutils python
pip3 install --user --upgrade conan cppcheck
- Required:
- Recommended:
  - ccache for faster rebuilds
  - gdb to automatically run the executables under a debugger and show stack traces and other useful information in case of crashes
  - nodemon for the watch & rebuild feature, for better developer experience and productivity
    ⚠️ nodemon requires Node.js and npm
    💡 If you don't want to install Node.js and nodemon, or don't want the automatic watch & rebuild feature, you can use make dev-nowatch instead of make dev during development (see below).
  - clang-tidy for static analysis. It is recommended to use a text editor or an IDE with clang-tidy support
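Before the first build, it can help to check which of the tools mentioned in this section are already on your PATH. The following is a minimal sketch. The list of tool names passed at the end is taken from the install commands and the recommended list above, and the check_tools helper is a hypothetical name used only here.

```shell
# Sketch: report which of the given tools are available on PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  echo "$missing tool(s) missing"
}

# Tools named in this section (build tools first, then the recommended ones).
check_tools cmake make conan cppcheck ccache gdb nodemon clang-tidy
```

A missing recommended tool does not block the build. For example, without nodemon you can still use make dev-nowatch.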
- Clone, run and develop
git clone https://github.com/nextstrain/nextclade
cd nextclade
make dev
This will:
- configure conan profile
- install or update conan packages
- run cmake and generate makefiles
- build the project and tests
- run static analysis on source files
- run tests
- run CLI with parameters defined in the DEV_NEXTALIGN_CLI_OPTIONS and DEV_NEXTCLADE_CLI_OPTIONS environment variables (see the .env.example file for defaults)
- watch source files and rebuild on changes
💡 If you don't want to install Node.js and nodemon, or don't want the automatic watch & rebuild feature, you can use make dev-nowatch instead of make dev during development. In this case you will need to rerun the script on code changes (as opposed to it rerunning automatically).
🎉 You are ready! Start coding! In particular, take a look at these files and directories:
packages/nextalign_cli/src/
packages/nextalign_cli/src/cli.cpp    # Entry point of the Nextalign CLI executable
packages/nextalign/src/
packages/nextalign/src/nextalign.cpp  # Entry point of the library is the `nextalign()` function in this file
packages/nextclade_cli/src/
packages/nextclade_cli/src/cli.cpp    # Entry point of the Nextclade CLI executable
packages/nextclade/src/
packages/nextclade/src/nextclade.cpp  # Entry point of the library is the `Nexatlign` class
The CLI binaries are produced in
.build/Debug/packages/nextalign_cli/nextalign_cli
.build/Debug/packages/nextclade_cli/nextclade_cli
The test binaries are in
.build/Debug/packages/nextalign/tests/nextalign_tests
.build/Debug/packages/nextclade/tests/nextclade_tests
They are run automatically upon rebuild, but you can also run them directly if you'd like.
You can change the default arguments of the CLI invocation made by the make dev target by creating a .env file:
cp .env.example .env
and modifying the DEV_NEXTALIGN_CLI_OPTIONS and DEV_NEXTCLADE_CLI_OPTIONS variables, or by setting these environment variables in the shell.
💡 The default input files are located in data/example
💡 By default, the output files are produced in the tmp/ directory in the root of the project.
⚠️ Do not measure performance of executables produced with make dev and do not use them for real workloads. Development builds, with disabled optimizations and with enabled debugging tools and instrumentation, are meant for developer productivity, not runtime performance, and can be orders of magnitude slower than the optimized build. Instead, for any performance assessment, use benchmarks, profiling or the production build. In real workloads, always use the production build.
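The two ways of overriding the CLI options for make dev can be sketched as follows. The --help value is only a placeholder, and the scratch directory stands in for the project root so that the example does not touch a real .env file. Whether the build scripts load .env exactly this way is an assumption. The set -a approach shown is just a common shell idiom.

```shell
# Sketch: two ways to set the CLI options used by `make dev`.
# The option value (--help) is a placeholder, not a recommended setting.
workdir="$(mktemp -d)"

# 1. Via a .env file (in the project: cp .env.example .env, then edit it).
cat > "$workdir/.env" <<'EOF'
DEV_NEXTALIGN_CLI_OPTIONS="--help"
DEV_NEXTCLADE_CLI_OPTIONS="--help"
EOF

# A common way for a shell script to load such a file:
# export every variable the file defines.
set -a
. "$workdir/.env"
set +a

echo "nextalign options: $DEV_NEXTALIGN_CLI_OPTIONS"
echo "nextclade options: $DEV_NEXTCLADE_CLI_OPTIONS"

# 2. Via the shell, for a single invocation (printed here, not executed):
echo "DEV_NEXTCLADE_CLI_OPTIONS='--help' make dev"

rm -rf "$workdir"
```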
This section describes how to build the "production" or "release" versions of Nextclade CLI and Nextalign CLI. These are the builds that are shipped to end users. Production builds have performance optimizations enabled and are much faster, but they are harder to debug.
To build inside a docker container, run
make docker-prod
or, for a local build, install the requirements from the "Develop locally" section and run:
make prod
This will produce the optimized executables in
.build/Release/packages/nextalign_cli/nextalign_cli
.build/Release/packages/nextclade_cli/nextclade_cli
as well as the final, stripped executables in
.out/bin/nextalign-Linux-x86_64
.out/bin/nextclade-Linux-x86_64
(replace Linux and x86_64 with your OS and hardware platform)
⚠️ Production builds (and all builds with CMAKE_BUILD_TYPE=Release) enforce the standalone static executable configuration.
Tests are run as a part of the main development script (make dev). The test executables are built to:
.build/Debug/packages/nextalign/tests/nextalign_tests
.build/Debug/packages/nextclade/tests/nextclade_tests
and can be invoked directly as needed.
We are using Google Test. See Google Test documentation and Google Mock documentation for more details.
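When invoking a test executable directly, the standard Google Test flag --gtest_filter can narrow the run to a subset of tests. The sketch below wraps this in a small helper. The run_tests name and the 'Align*' pattern are hypothetical. Use --gtest_list_tests on the binary to see the real test names.

```shell
# Sketch: run a test binary directly, optionally with a Google Test filter.
# The helper name and the example filter pattern are hypothetical.
run_tests() {
  bin="$1"
  filter="${2:-*}"   # default: run all tests
  if [ ! -x "$bin" ]; then
    echo "test binary not found: $bin (build it first, e.g. with make dev)"
    return 1
  fi
  "$bin" --gtest_filter="$filter"
}

run_tests .build/Debug/packages/nextalign/tests/nextalign_tests 'Align*' \
  || echo "tests did not pass or were not run"
```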
The default dev scripts run the Nextalign CLI and Nextclade CLI under GDB (if installed), which serves as a smoke test.
TODO: set up proper e2e tests. Compare results to known-good previous results and assert on differences.
We use the following static analysis tools.
clang-tidy, a part of LLVM project, is a static analysis (linter) tool. During development, it is recommended to use a text editor or an IDE which has clang-tidy integration. Check .clang-tidy file in the root of the project for current configuration.
Clang Static Analyzer (clang-analyzer), a part of LLVM project, is a source code analysis tool. Type
make dev-clang-analyzer
to build and run Nextalign CLI and Nextclade CLI with clang-analyzer and keep an eye on console warnings.
cppcheck runs as a part of the main development script (make dev). Keep an eye on console warnings. The file .cppcheck in the root of the project
contains arguments passed to cppcheck.
We use the following tools to perform runtime analysis of the builds.
Sanitizers are binary instrumentation tools that help to find various runtime issues related to memory management, threading and programming mistakes which lead to undefined behavior.
The project is set up to build with sanitizers, if one of the following CMAKE_BUILD_TYPEs is set:
| CMAKE_BUILD_TYPE | Effect |
|---|---|
| ASAN | Address + Leak sanitizers |
| MSAN | Memory sanitizer |
| TSAN | Thread sanitizer |
| UBSAN | Undefined behavior sanitizer |
💡 For example, if the program is crashing with a segfault, you could try running the address sanitizer on it:
CMAKE_BUILD_TYPE=ASAN make dev
💡 Both GCC and Clang support these sanitizers to various degrees, but there might be kinks here and there. So you might need to try with both compilers (see: Use non-default compiler).
Set environment variable USE_VALGRIND=1 in order to run the executable with valgrind memcheck:
USE_VALGRIND=1 make dev
Set environment variable USE_MASSIF=1 in order to run the executable with valgrind massif heap profiler:
USE_MASSIF=1 make prod
Note the process id in the header:
==263799== Massif, a heap profiler
It's 263799 in this example.
After valgrind is done, in order to visualize results, run ms_print, with the output filename, containing the process ID. For the example from above it will be:
ms_print massif.out.263799
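If you capture the console output, the process ID can also be extracted from the header line instead of being copied by hand. This is a minimal sketch using the header from the example above. The massif_output_file helper is a hypothetical name, and it relies on Massif's default output file naming (massif.out.<pid>).

```shell
# Sketch: extract the process ID from the Massif header line
# and derive the default output filename (massif.out.<pid>).
massif_output_file() {
  header="$1"
  pid="$(printf '%s\n' "$header" | sed -n 's/^==\([0-9][0-9]*\)== Massif.*/\1/p')"
  [ -n "$pid" ] || return 1   # not a Massif header line
  echo "massif.out.${pid}"
}

file="$(massif_output_file '==263799== Massif, a heap profiler')"
echo "ms_print ${file}"
```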
A set of benchmarks is located in packages/nextalign/benchmarks and in packages/nextclade/benchmarks.
We are using the Google Benchmark framework. Read the important Runtime and Reporting Considerations.
⚠️ For the most accurate results, you should disable CPU frequency scaling for the duration of your benchmarking session. (More info: [kernel], [arch], [debian])
💡 As a simple solution, on most modern hardware and Linux distros, before running benchmarks you could temporarily switch to the performance governor with
sudo cpupower frequency-set --governor performance
and then back to the powersave governor with
sudo cpupower frequency-set --governor powersave
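To see which governor is currently active before starting a benchmarking session, you could read it from sysfs. The sketch below assumes a Linux system that exposes the cpufreq interface under /sys. On other systems it simply prints "unknown". It only reads the setting and does not change it.

```shell
# Sketch: show the current CPU frequency governor of cpu0 (read-only check).
# Assumption: Linux with the cpufreq sysfs interface; otherwise prints "unknown".
current_governor() {
  f=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
  if [ -r "$f" ]; then
    cat "$f"
  else
    echo "unknown"
  fi
}

gov="$(current_governor)"
echo "cpu0 governor: ${gov}"
if [ "$gov" != "performance" ]; then
  echo "hint: benchmark timings may be noisy unless the governor is 'performance'"
fi
```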
Run benchmarks with
make benchmarks
This will install dependencies, build the library and benchmarks in "Release" mode and will run the benchmarks. Benchmarks will rerun on code changes.
Or run the scripts/benchmarks.sh directly (no hot reloading).
You can also run the executables directly, which are located in
.build/Benchmarks-Release/packages/nextalign/benchmarks/nextalign_benchmarks
.build/Benchmarks-Release/packages/nextclade/benchmarks/nextclade_benchmarks
💡 For better debugging experience, you can also build in "Debug" mode and run under GDB with:
CMAKE_BUILD_TYPE=Debug USE_GDB=1 make benchmarks
You can pass parameters to the benchmark executable with either of:
BENCHMARK_OPTIONS='--help' make benchmarks
scripts/benchmarks.sh --help
For example, you can filter the benchmarks by name. To run only the benchmarks containing the word "Average":
BENCHMARK_OPTIONS='--benchmark_filter=.*Average' make benchmarks
The results are also saved to the files
.reports/nextalign_benchmarks.json
.reports/nextclade_benchmarks.json
You can compare multiple results using the compare.py tool from Google Benchmark repository. For more information refer to Benchmark Tools documentation.
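One possible workflow is to keep a copy of a report as a baseline before making changes, and to compare the new report against it afterwards. The sketch below makes several assumptions: COMPARE_PY is a placeholder path to compare.py in a Google Benchmark checkout, and the ".baseline.json" naming and both helper functions are made up for this example. The compare command is printed rather than executed.

```shell
# Sketch: keep the current report as a baseline, then print the compare command.
# COMPARE_PY is a placeholder path; the ".baseline.json" naming is hypothetical.
COMPARE_PY="${COMPARE_PY:-path/to/benchmark/tools/compare.py}"

# Copy <name>.json to <name>.baseline.json and print the baseline path.
save_baseline() {
  report="$1"
  baseline="${report%.json}.baseline.json"
  if [ ! -f "$report" ]; then
    echo "no report at $report (run make benchmarks first)"
    return 1
  fi
  cp "$report" "$baseline" && echo "$baseline"
}

# Print the compare.py invocation: baseline first, new results second.
compare_command() {
  report="$1"
  echo "python3 $COMPARE_PY benchmarks ${report%.json}.baseline.json $report"
}

save_baseline .reports/nextalign_benchmarks.json || true
compare_command .reports/nextalign_benchmarks.json
```

Save the baseline before changing the code, rerun the benchmarks after the change, and then run the printed command.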
make profile
TODO: expand this section
You can tell build scripts to forcefully use Clang instead of the default compiler (e.g. GCC on Linux) by setting the
environment variable USE_CLANG=1. For example:
USE_CLANG=1 make dev
USE_CLANG=1 make prod
CMAKE_BUILD_TYPE=ASAN USE_CLANG=1 make dev
In this case, binaries will be produced in directories postfixed with -Clang, e.g. .build/Debug-Clang.
💡 On Ubuntu you can build the LLVM project (including Clang) with the script provided in scripts/deps/build_llvm.sh. It depends on binutils, which should be built with scripts/deps/build_binutils.sh prior to that. There is also a script to build GCC: scripts/deps/build_gcc.sh. Refer to the comments inside these scripts for the list of required dependencies. As a result of these scripts, the ready-to-use compilers will be in 3rdparty/gcc and 3rdparty/llvm.
💡 The project's build system is set up to automatically pick up the gcc and g++ executables from 3rdparty/gcc/bin/, and the clang and clang++ executables from 3rdparty/llvm/bin/, if any of those exist.
To simplify distribution to end users, we produce standalone, statically linked binaries, as well as a minimalistic docker image containing only a single executable.
By default, static build is enabled for all builds that have CMAKE_BUILD_TYPE=Release (that is, production build and benchmarks). It can be selectively enabled or disabled at build time, using the environment variables NEXTALIGN_STATIC_BUILD="(0|1)" and NEXTCLADE_STATIC_BUILD="(0|1)":
NEXTALIGN_STATIC_BUILD=1 make dev # produces statically-linked dev build
NEXTALIGN_STATIC_BUILD=0 make prod # produces dynamically-linked prod build
See PR #7 for caveats and other considerations.
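To confirm which kind of binary a build actually produced, you could inspect it with the file utility. This is a minimal sketch that assumes ELF executables on Linux, where file reports "statically linked" or "static-pie linked" for static binaries. The linkage helper is a hypothetical name, and the output on other platforms will differ.

```shell
# Sketch: report whether an executable is statically linked, using file(1).
# Assumption: ELF binaries on Linux; the helper name is hypothetical.
linkage() {
  bin="$1"
  if [ ! -f "$bin" ]; then
    echo "missing"
  elif ! command -v file >/dev/null 2>&1; then
    echo "unknown"   # file(1) is not installed
  elif file -b "$bin" | grep -Eq 'statically linked|static-pie linked'; then
    echo "static"
  else
    echo "dynamic"
  fi
}

for bin in .build/Release/packages/nextalign_cli/nextalign_cli \
           .build/Release/packages/nextclade_cli/nextclade_cli; do
  echo "$(linkage "$bin"): $bin"
done
```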
Runtime performance is important for this project and for production builds we use a gold-plugin-enabled linker executable.
TODO: this is currently not true. We need to set up LTO on CI.
💡 On Ubuntu you can build it along with other binutils using the provided script in scripts/deps/build_binutils.sh. The results of the build will be in 3rdparty/binutils.
💡 The project's build system is set up to automatically pick up the ld linker from 3rdparty/binutils/bin/ if it exists.
TODO: set up profile-guided optimization based on CLI executable or e2e tests
- Increment the version in the VERSION file in the root directory
- Write release notes in a new section at the beginning of CHANGELOG.md. Make them friendly and comprehensible for the users. Note that this changelog will also appear in the "What's new" popup dialog of the next released version of Nextclade Web.
- Merge changes to the release-cli branch (do not create a tag!). In most cases you want to simply release what's on the master branch. In this case, fast-forward the release-cli branch to the master branch. Push the changes to the remote.
- Upon push, CI will trigger a build and
  - run build script
  - upload binaries to GitHub Releases
  - build and push Docker image to Docker Hub
  - create and push a git tag
- After the GitHub Release is created by CI, edit it and paste the release notes for this version into the description
TODO: automate publication of release notes on GitHub Releases
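The merge-and-push step from the list above can be sketched with plain git commands. The remote name origin is an assumption, and DRY_RUN is a hypothetical switch for this example. By default the commands are only printed, so that nothing is pushed by accident. Note that no tag is created, since CI does that.

```shell
# Sketch: fast-forward release-cli to master and push it (no tag is created).
# Assumptions: the remote is named "origin"; DRY_RUN is a hypothetical switch.
DRY_RUN="${DRY_RUN:-1}"  # set DRY_RUN=0 to actually run the commands

# Print a command and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

release_cli() {
  run git checkout release-cli &&
  run git merge --ff-only master &&   # fails instead of creating a merge commit
  run git push origin release-cli
}

release_cli
```

Using --ff-only makes the merge stop with an error if release-cli has diverged from master, instead of silently creating a merge commit.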
As a workaround you may try to add the new compiler to the PATH, then delete and regenerate the conan profile:
- Remove the old conan profile by deleting the .cache/.conan directory
- Rebuild the project and watch for compiler=<COMPILER_NAME> and compiler.version=<VERSION> in the console output of the "Install dependencies" build step, and/or set the CMAKE_VERBOSE_MAKEFILE=1 variable and check the compiler path used during the "Build" step.
The error might look similar to this:
Possible CMake error
CMake Error: The current CMakeCache.txt directory .build/Debug/CMakeCache.txt is different than the directory /src/.build/Debug where CMakeCache.txt was created. This may result in binaries being created in the wrong place. If you are not sure, reedit the CMakeCache.txt
CMake Error: The source "CMakeLists.txt" does not match the source "/src/CMakeLists.txt" used to generate cache. Re-run cmake with a different source directory.
You are probably trying to run local build after running docker-based build.
Docker build uses the same host directory as local build, but the paths inside container are different. That's why CMake gets confused.
Simply delete the current build directory, e.g. .build/Debug or the entire .build, and rerun, so that CMake can regenerate its cache with the correct paths.
Try to remove the temporary directories: .build, .cache, .out, .reports, tmp and rebuild.
Note that removing conan cache in .cache/conan will require downloading and rebuilding of all of the dependencies on next build, which happens automatically, but is time-consuming.
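The cleanup described above can be sketched as a small helper. KEEP_CONAN_CACHE and clean_project are hypothetical names used only in this example. When KEEP_CONAN_CACHE=1, the whole .cache directory is left in place, so that the dependencies do not need to be downloaded and rebuilt. The demonstration runs in a scratch directory rather than in the project.

```shell
# Sketch: remove the temporary directories listed above.
# KEEP_CONAN_CACHE is a hypothetical switch: when set to 1, .cache is kept.
clean_project() {
  root="${1:-.}"
  for dir in .build .cache .out .reports tmp; do
    if [ "$dir" = ".cache" ] && [ "${KEEP_CONAN_CACHE:-0}" = "1" ]; then
      echo "keeping  $root/$dir"
      continue
    fi
    if [ -d "$root/$dir" ]; then
      echo "removing $root/$dir"
      rm -rf "${root:?}/$dir"
    fi
  done
}

# Demonstration in a scratch directory that mimics the project layout.
demo="$(mktemp -d)"
mkdir -p "$demo/.build/Debug" "$demo/.cache" "$demo/tmp"
KEEP_CONAN_CACHE=1
clean_project "$demo"
rm -rf "$demo"
```

In the project itself, you would call clean_project . from the root directory.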
Feel free to create a new GitHub Issue or to join Nextstrain Discussion at discussion.nextstrain.org.