Thanks to a post on the BioCode's Notes blog, I have discovered a project called BioDocker, which aims to generate lots of Docker containers to help make bioinformatics more reproducible by standardizing how bioinformatics software is packaged. From the BioDocker website:
The main purpose of this project is to spread the use of Docker on the Bioinformatics and Computational Biology areas. By using pre-configured containers with different bioinformatic softwares some critical aspects of Bioinformatics like reproducibility are minimized. Here you will find a list of containers with different bioinformatics software and how to use it.
BioDocker was created by Felipe da Veiga Leprevost in 2014, and the associated GitHub repository currently has a dozen or so containers.
When I first read about BioDocker I was confused, because I know that there is also the Bioboxes project, which aims to er…make bioinformatics more reproducible by standardizing how bioinformatics software is packaged. From the Bioboxes manifesto:
Software has proliferated in bioinformatics and so have the problems associated with it: missing or unobtainable code, difficult to install dependencies, unreproducible workflows, all with terrible user experiences. We believe a community standard, using software containers, has the opportunity to solve these problems and increase the standard of scientific software as a whole.
I think the aims of these two projects are similar but not identical, and Bioboxes probably has a broader remit. Both projects are aware of each other, and it looks like they have had some productive exchanges.
All of this makes me feel that the bioinformatics community is slowly, but steadily, embracing Docker. Any approach to standardizing how we do bioinformatics should be welcomed, but those of us with long memories will recall that we have been in this situation before. Anyone remember the promises of how CORBA and then SOAP were going to increase interoperability in bioinformatics?