101 questions with a bioinformatician #6: Mario Caccamo

This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.

Mario Caccamo is the director of The Genome Analysis Centre, a BBSRC-sponsored research institute focused on genomics and computational biology. You may know of this institute by its shortened name 'TGAC' (of course, this is not the only place where you will see this initialism). As Director, Mario's role is to "ensure TGAC is equipped with the resources and people to deliver good science".

Mario's skills as a bioinformatician are matched only by his prowess on the volleyball court. When he used to play for the Informatics volleyball team at the Wellcome Trust Sanger Institute, he deservedly earned the nickname Super Mario. You can find out more about Mario by following him on Twitter (@mcaccamo).


001. What's something that you enjoy about current bioinformatics research?

I see bioinformatics as a branch of molecular biology. I love the elegance of molecular biology. Research in bioinformatics is about capturing the beauty of biology in abstractions that can help us to discover new knowledge. Beauty here means complexity, optimisation, economy (in terms of information content) and functionality (among other things). One of the most exciting things about molecular biology is how young it is as a science. Our colleagues working in bioinformatics are the Newtons and the Keplers of molecular biology — so much to be done and discovered. 


010. What's something that you *don't* enjoy about current bioinformatics research?

The flip side is that we still don’t understand enough about the basic building blocks in molecular biology. The language we use to describe biological systems and processes is incomplete. We struggle with issues that sometimes look simple. What did we know about epigenetic modifications 10 years ago, for instance? Little compared to what we know today. Bioinformaticians struggle with the incompleteness of the underlying basic knowledge and keep re-inventing the wheel, which leads to frustration. Perhaps these are growing pains, but pains nevertheless.


011. If you could go back in time and visit yourself as an 18 year old, what single piece of advice would you give yourself to help your future bioinformatics career?

My advice would be: “This is good enough. Let it go.” Recognising when you are in the land of diminishing returns is a skill that should be taught at school. This is particularly relevant for bioinformatics. You can always close more gaps, find the missing gene or remove another false positive; you can do this for the next 10 years. My recommendation is... don't. Another, perhaps more mundane, recommendation is to learn either gawk, Perl one-liners, or some of the basic Unix command line tools to manipulate strings and text; they will give you the data you need for your best presentation the night before your talk.


100. What's your all-time favorite piece of bioinformatics software, and why?

It has to be HMMER: a beautiful, super-efficient piece of software. I know that we shouldn’t use a hammer for all kinds of different nails, but somehow HMMER manages to prove that advice wrong. You can HMMER so many nails with this hammer.


101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality?

I think I would take W. I like W as a strange letter (no explanation for that) — but it is A or T, alpha or omega in the nucleotide alphabet.
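
(An editorial aside for readers new to the IUPAC ambiguity codes: the small Python sketch below is not part of Mario's answer. It shows how a code such as W expands to the bases it can stand for. The names `IUPAC_CODES`, `expand` and `matches` are made up for illustration, and the table lists only the DNA letters, leaving out U and the gap symbols.)

```python
# IUPAC nucleotide ambiguity codes mapped to the DNA bases they can represent.
# DNA letters only; U and gap characters are deliberately left out of this sketch.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(code):
    """Return the set of DNA bases that a single IUPAC code can represent."""
    return set(IUPAC_CODES[code.upper()])


def matches(code, base):
    """Return True if the concrete base is one of those covered by the code."""
    return base.upper() in expand(code)


print(sorted(expand("W")))  # ['A', 'T']
```

So `matches("W", "A")` and `matches("W", "T")` are both true, while `matches("W", "C")` is false.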