Summary
This talk introduces HashDist, a critical component of the scientific software development workflow. HashDist enables highly customizable, source-driven, and reproducible builds for scientific software stacks. HashDist builds can be made relocatable, allowing the easy redistribution of binaries on all three major operating systems as well as cloud and supercomputing platforms.
Description
Developing scientific software is a continuous balancing act between not reinventing the wheel and getting fragile codes to interoperate with one another. Binary software distributions such as Anaconda provide a robust starting point for many scientific software packages, but this solution alone is insufficient for many scientific software developers. HashDist provides a critical component of the development workflow, enabling highly customizable, source-driven, and reproducible builds for scientific software stacks, available from both the IPython Notebook and the command line.
To address these issues, the Coastal and Hydraulics Laboratory at the US Army Engineer Research and Development Center has funded the development of HashDist in collaboration with Simula Research Laboratory and the University of Texas at Austin. HashDist is motivated by a functional approach to package build management, and features intelligent caching of sources and builds, parametrized build specifications, and the ability to interoperate with system compilers and packages. HashDist enables the easy specification of "software stacks", which allow both the novice user to install a default environment and the advanced user to configure every aspect of their build in a modular fashion. As an advanced feature, HashDist builds can be made relocatable, allowing the easy redistribution of binaries on all three major operating systems as well as cloud and supercomputing platforms. As a final benefit, all HashDist builds are reproducible, with a build hash specifying exactly how each component of the software stack was installed.
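The idea of a build hash that doubles as a cache key can be illustrated with a short sketch. This is not HashDist's actual implementation or API: the function build_hash, the class BuildCache, the spec layout, and the package names and versions are all hypothetical, chosen only to show how hashing every build input (including the hashes of dependencies) makes a build reproducible and lets identical builds be reused.

```python
import hashlib
import json


def build_hash(name, version, parameters, dependency_hashes):
    """Return a deterministic hash of everything that influences a build.

    Hypothetical spec layout for illustration only; not HashDist's format.
    """
    spec = {
        "name": name,
        "version": version,
        "parameters": parameters,
        # Including dependency hashes means a change anywhere below this
        # package in the stack produces a different hash for this package.
        "dependencies": sorted(dependency_hashes),
    }
    # Canonical serialization: sorted keys and fixed separators, so the
    # same spec always yields the same bytes.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class BuildCache:
    """Toy in-memory cache that runs a build only for unseen hashes."""

    def __init__(self):
        self._artifacts = {}
        self.builds_run = 0

    def get_or_build(self, key, build_fn):
        if key not in self._artifacts:
            self._artifacts[key] = build_fn()
            self.builds_run += 1
        return self._artifacts[key]


if __name__ == "__main__":
    # Hypothetical packages and versions.
    zlib = build_hash("zlib", "1.2.8", {}, [])
    numpy_a = build_hash("numpy", "1.8", {"blas": "openblas"}, [zlib])
    numpy_b = build_hash("numpy", "1.8", {"blas": "mkl"}, [zlib])

    cache = BuildCache()
    cache.get_or_build(numpy_a, lambda: "numpy built against openblas")
    cache.get_or_build(numpy_a, lambda: "numpy built against openblas")
    cache.get_or_build(numpy_b, lambda: "numpy built against mkl")
    print(cache.builds_run)
```

In this sketch, requesting the same specification twice triggers a single build, while changing one parameter yields a new hash and therefore a separate build.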
This talk will feature an introduction to the problem of packaging Python-based scientific software, a discussion of the basic tools available to scientific Python developers, and a detailed discussion and demonstration of the HashDist package build manager.
The HashDist documentation is available from: http://hashdist.readthedocs.org/en/latest/
HashDist is currently hosted at: https://github.com/hashdist/hashdist