This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.
Laura Clarke is the Project Coordinator for Resequencing Informatics, part of the Vertebrate Genomics team led by Paul Flicek at EMBL-EBI. Before joining the EBI, she was applying her considerable bioinformatics skills at the Wellcome Trust Sanger Institute (a move ranked #1 on the annual list of Easiest-employers-to-transition-between).
Her role sees her help with the analysis and coordination of high-throughput genomics efforts such as the 1000 Genomes project, BLUEPRINT (deciphering the epigenome of blood cells), and HipSci (the Human Induced Pluripotent Stem Cells Initiative). If you're wondering what this actually entails, I'll hand you over to Laura:
"This work boils down to making sure that data gets into and out of the sequence archives; running primary analysis and QC; and then making sure the resulting analysis makes it out to the community".
001. What's something that you enjoy about current bioinformatics research?
The possibility. With modern sequencing technologies, computational techniques can draw together these new data types and massive volumes of data, allowing us to get much closer to a proper understanding of cellular biology, which of course brings us closer to understanding organismal biology.
Add to that the diverse range of species being sequenced and what that can teach us about evolution and the forces which drive evolution.
That is, of course, before you consider how it might impact medicine, food security, or any other real-world applications.
010. What's something that you *don't* enjoy about current bioinformatics research?
Extracting data from people. My life would be easier if people weren't so begrudging about sharing data, and about describing well the data they do share. I work with many people who do share data freely and easily, but there are still too many who are reluctant to make data from within a consortium publicly available.
011. If you could go back in time and visit yourself as an 18 year old, what single piece of advice would you give yourself to help your future bioinformatics career?
For data coordination purposes we produce a lot of tab-delimited text files. The Unix command cut is wonderful for making those easier to work with and manipulate. Learning about cut sooner would, I suspect, at least have made mucking about with various types of GFF files easier.
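For readers who haven't met cut yet, here is a minimal sketch of the kind of column-picking Laura describes. The two records are made up for illustration (they follow the usual nine-column, tab-delimited GFF layout), and the helper function gff_example is hypothetical, there only so the example runs without a real file.

```shell
# Hypothetical helper: prints two made-up, tab-delimited GFF-style records.
# Columns: seqid, source, type, start, end, score, strand, phase, attributes
gff_example() {
  printf 'chr1\tensembl\tgene\t11869\t14409\t.\t+\t.\tID=gene1\n'
  printf 'chr1\tensembl\texon\t11869\t12227\t.\t+\t.\tParent=gene1\n'
}

# cut treats tab as its default field delimiter, so no -d option is needed.
# -f takes a comma-separated list of field numbers (ranges such as 3-5 also work).
# Here we keep only the sequence name, start and end coordinates.
gff_example | cut -f 1,4,5
```

With a real file you would normally run something like cut -f 1,4,5 on the file itself, having first checked that it really is tab-delimited and that the columns are in the order you expect.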
100. What's your all-time favorite piece of bioinformatics software, and why?
I have to say I did enjoy pairedends.com; it's very funny.
101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality?
R: this is because Adenine and Guanine are the same molecule type (purines) as both Theobromine and Caffeine, both of which are quite important to me and at least influence my personality.