Debugging Julia with Address Sanitizer
24 Feb 2017
Address sanitizer is a useful tool for debugging various memory problems, from invalid accesses to mismanagement or leaks. It is similar to Valgrind’s memcheck, but uses compile-time instrumentation to lower the cost.
In this post I’ll explain how to use Clang’s address sanitizer (or ASAN) with Julia. This is somewhat tricky, as the Julia compiler uses LLVM for code generation. Long story short, this implies that all instances of LLVM (i.e. the one Julia is compiled with, and the one used for code generation) have to match up exactly for the instrumentation to work as expected.
We’ll start by building a toolchain to compile Julia with. As mentioned before, all LLVM instances in play have to match up exactly for instrumentation to work, so we’ll use Julia’s build infrastructure to generate us an LLVM toolchain.
Start by checking out Julia and creating an out-of-tree build directory:
    $ git clone https://github.com/JuliaLang/julia
    $ cd julia
    $ make O=sanitize_toolchain configure
This build will need to provide clang, so create a Make.user file in that directory containing BUILD_LLVM_CLANG=1. In addition, LLVM does not build its sanitizers with autotools, so add override LLVM_USE_CMAKE=1 to that file as well. And because that triggers LLVM bug #23649, also add […]. Then run make install-llvm from the deps subfolder. When it finishes, check if binaries have been written to usr/bin (due to what’s probably a bug in LLVM’s build scripts), and move them to usr/tools if they have.
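The check-and-move step can be scripted. The following is a minimal sketch, assuming the toolchain build directory contains nothing but LLVM binaries in usr/bin at this point; the helper name relocate_tools is hypothetical and not part of Julia's build system.

```python
import shutil
from pathlib import Path
from typing import List


def relocate_tools(prefix: Path) -> List[str]:
    """Move files from <prefix>/bin into <prefix>/tools.

    Returns the names of the entries that were moved; the list is empty
    when nothing was written to <prefix>/bin.
    """
    bin_dir = prefix / "bin"
    tools_dir = prefix / "tools"
    if not bin_dir.is_dir():
        return []  # nothing was misplaced
    tools_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(bin_dir.iterdir()):
        target = tools_dir / entry.name
        if target.exists():
            continue  # never overwrite a tool that is already in place
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    return moved
```

Calling relocate_tools(Path("sanitize_toolchain/usr")) from the Julia checkout would apply it to the build directory used above, assuming that is where the build tree lives on your system.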
Now that we have a working toolchain, we’ll use it to compile a sanitized version of the Julia compiler and libraries. Start by creating a new out-of-tree build directory using make O=sanitize configure. But this time, our Make.user will be significantly more involved:
    TOOLCHAIN=$(BUILDROOT)/sanitize_toolchain/usr/tools
    # use our new toolchain
    USECLANG=1
    override CC=$(TOOLCHAIN)/clang
    override CXX=$(TOOLCHAIN)/clang++
    export ASAN_SYMBOLIZER_PATH=$(TOOLCHAIN)/llvm-symbolizer
    # enable ASAN
    override SANITIZE=1
    override LLVM_SANITIZE=1
    # autotools doesn't have a self-sanitize mode
    override LLVM_USE_CMAKE=1
    # make the GC use regular malloc/frees, which are intercepted by ASAN
    override WITH_GC_DEBUG_ENV=1
    # default to a debug build for better line number reporting
    override JULIA_BUILD_MODE=debug
Now kick off the build using make from the sanitize build directory. Barring any memory issues triggered during system image generation, this should yield a sanitized binary and system image.
Running the test-suite
The test-suite is a beast, and because ASAN keeps track of a lot of information it easily takes over 128GiB of memory to run it to completion. Instead, we’ll tune ASAN to consume less memory at the expense of accuracy and report detail.
Julia, however, already configures default ASAN options, which we need to copy when specifying a different set. Do so by defining the ASAN_OPTIONS environment variable and assigning it the value of […]. This copies the aforementioned default values and caps backtrace collection.
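For illustration, here is a small sketch of how such a value is put together. ASAN_OPTIONS is a colon-separated list of key=value pairs. The two flags below (detect_leaks and malloc_context_size) are real ASAN runtime flags, but this particular set and these values are my assumption, not the value used in this post, and Julia's built-in defaults are not reproduced here.

```python
import os


def asan_options(**flags) -> str:
    """Join flags into the colon-separated key=value form ASAN_OPTIONS uses."""
    return ":".join(f"{name}={value}" for name, value in flags.items())


# Illustrative values only (assumptions, not the settings from the post):
# - detect_leaks=0 turns leak detection off
# - malloc_context_size=2 keeps two stack frames per allocation backtrace
options = asan_options(detect_leaks=0, malloc_context_size=2)

# A copy of the current environment with ASAN_OPTIONS set; processes
# started with this environment pick the options up at launch.
env = dict(os.environ, ASAN_OPTIONS=options)
print(options)
```

Passing env to subprocess.run when launching the sanitized julia binary would apply the options to that process only, which keeps them out of your shell's global environment.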
Using CUDA packages
If you thought all that was convoluted, prepare for some more. ASAN uses so-called shadow memory to store information about memory allocations. There is a correspondence between regular memory addresses and their shadow counterpart, and this mapping is fixed in order to keep the instrumentation overhead low. Sadly, the default shadow memory location overlaps with fixed memory allocated by CUDA (presumably for its unified virtual address space).
Because the shadow mapping is fixed, we need to patch both instances of LLVM (easiest is to add a patch to llvm.mk) and have them pick a different shadow offset:
    --- lib/Transforms/Instrumentation/AddressSanitizer.cpp
    +++ lib/Transforms/Instrumentation/AddressSanitizer.cpp
    @@ -359,7 +359,7 @@
           if (IsKasan)
             Mapping.Offset = kLinuxKasan_ShadowOffset64;
           else
    -        Mapping.Offset = kSmallX86_64ShadowOffset;
    +        Mapping.Offset = kDefaultShadowOffset64;
         } else if (IsMIPS64)
           Mapping.Offset = kMIPS64_ShadowOffset64;
         else if (IsAArch64)
    --- projects/compiler-rt/lib/asan/asan_mapping.h
    +++ projects/compiler-rt/lib/asan/asan_mapping.h
    @@ -146,7 +146,7 @@
     # elif SANITIZER_IOS
     # define SHADOW_OFFSET kIosShadowOffset64
     # else
    -# define SHADOW_OFFSET kDefaultShort64bitShadowOffset
    +# define SHADOW_OFFSET kDefaultShadowOffset64
     # endif
     # endif
     #endif
Note that you might need to redefine a different macro for your platform.
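To make the effect of the offset concrete, here is a rough sketch of the address-to-shadow mapping. ASAN computes a shadow address as (address >> 3) + offset. The two offset values are the ones LLVM defines for the constants named in the patch; the 47-bit user address space is my assumption for x86_64 Linux, and the sketch says nothing about where CUDA actually places its allocations.

```python
SHADOW_SCALE = 3  # one shadow byte describes 2**3 = 8 application bytes

# Values of the LLVM constants named in the patch.
SMALL_X86_64_SHADOW_OFFSET = 0x7FFF8000  # kSmallX86_64ShadowOffset
DEFAULT_SHADOW_OFFSET_64 = 1 << 44       # kDefaultShadowOffset64

# Assumption: 47-bit user address space, as on typical x86_64 Linux.
USER_SPACE_END = (1 << 47) - 1


def shadow_address(addr: int, offset: int) -> int:
    """Map an application address to the address of its shadow byte."""
    return (addr >> SHADOW_SCALE) + offset


def shadow_range(offset: int):
    """First and last shadow address covering the whole user address space."""
    return shadow_address(0, offset), shadow_address(USER_SPACE_END, offset)


for name, offset in [
    ("before the patch", SMALL_X86_64_SHADOW_OFFSET),
    ("after the patch", DEFAULT_SHADOW_OFFSET_64),
]:
    first, last = shadow_range(offset)
    print(f"{name}: shadow spans {first:#x} .. {last:#x}")
```

Under these assumptions the shadow region moves from 0x7fff8000 .. 0x10007fff7fff to 0x100000000000 .. 0x1fffffffffff, which is how the patch keeps it clear of the low addresses that the small offset occupies.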
Sanitizing older versions of Julia
If you want to sanitize older versions of Julia, from before the switch to LLVM 3.9, there are further issues: only LLVM 3.9 is compatible with recent versions of glibc, while the CMake build system of LLVM 3.7 doesn’t export all necessary public symbols. You can work around these issues by using a sufficiently old system, and overriding the LLVM version to 3.8 (by specifying override LLVM_VER=3.8.1 in the Make.user of both build directories) or preventing it from generating a shared library (by specifying USE_LLVM_SHLIB=0 in the Make.user of the final Julia build).