101 questions with a bioinformatician #24: Sara Gosline

This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.

Sara Gosline is a postdoc at the Fraenkel Lab, in the Department of Biological Engineering at MIT. Her current work has focused on studying the impact of microRNA changes on global mRNA expression. As her postdoc comes to an end, Sara is seeking a tenure-track faculty position to further explore the broader impacts of RNA regulation to better interpret gene expression data in a network context (contact her if interested).

I'm glad to see more and more bioinformaticians recognizing the need to make sure that code is properly documented and made easily available to others. Addressing this issue, Sara writes:

Parsing someone else’s code is often close to impossible. So, I’ve been working to try to make some of the code we use in the lab more usable by people outside the lab. It’s so easy to do some analysis in a basic script and publish the method but never share the code, so it’s fun to take the time to really make sure tools are usable.

I wish more people had this attitude towards sharing their software!

You can find out more about Sara by visiting her website, or by following her on Twitter (@sargoshoe). She also tweets using the @CancerSysBio account (along with other early-stage investigators in Cancer Systems Biology). And now, on to the 101 questions...

001. What's something that you enjoy about current bioinformatics research?

Personally, bioinformatics combines two problem-solving activities that I really enjoy: (1) trying to figure out the biological mechanisms of disease and (2) trying to figure out how to prove/show this using data, computation, and statistics, together with experimental methods. In this regard it's all about 'the chase': trying to find the right approach/statistic/method to uncover the hidden meaning of the data.

On a broader scale, bioinformatics research is also fantastic because it provides instant access to a 30,000 foot view of biology. Within a matter of hours, we can look up a specific gene measurement in a published dataset or repository without having to wait for hours/days/weeks to test behavior experimentally. This is why I think basic bioinformatics training is crucial to all biologists. It provides the ability to ask and answer high-level questions that are required (in addition to basic bench skills) to become successful scientists going forward.

010. What's something that you don't enjoy about current bioinformatics research?

Identifier mapping! I spent my PhD doing evolutionary work, and trying to map genes, orthologous interactions, etc., was so incredibly painful. I often joke that I got my PhD in identifier mapping rather than in bioinformatics. There have been huge strides along these lines, though working in model organisms and humans has helped me avoid this!

011. If you could go back in time and visit yourself as an 18-year-old, what single piece of advice would you give yourself to help your future bioinformatics career?

I wish I had the courage to dive into bioinformatics earlier on in my career. I left biology for computer science in my second year of college after taking my first programming course, and at that point (late 1990s) it seemed like a more flexible career path. It took me two years in industry and another year into my master's before I fully committed to the field. Now, biological data analysis is mainstream (with lots of jobs), and the skills that any student develops by studying bioinformatics are so useful they can be applied to any big data job at companies like Google and Facebook.

100. What's your all-time favorite piece of bioinformatics software, and why?

R and Bioconductor. I certainly use a lot of Python in my day-to-day work, but I love R for its simplicity and straightforward approach to basic analysis. It makes introducing biologists to bioinformatics much easier, enabling people with no programming background to perform complex tasks such as survival analysis or p-value correction in just a few steps. It also dovetails well with other programming languages and has a great platform for sharing code/tools among scientists.
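For readers who have not met p-value correction before, here is a rough sketch of what one such correction does. In R, the Benjamini-Hochberg adjustment is a single call, p.adjust(p, method = "BH"); the Python below spells out the same procedure by hand so the steps are visible. The function name and the example p-values are made up for illustration and are not from Sara's work.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, purely for illustration.
example_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(example_p)
print(adjusted_p)
```

The hand-written loop is about a dozen lines; the point of the answer above is that R collapses this into one step for someone with no programming background.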

101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality, and why?

N - my career is currently in flux, so my precise biological role cannot be determined until I have a job for next year!
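For anyone unfamiliar with the codes behind this question, the lookup table below is a small sketch of the IUPAC nucleotide alphabet. It assumes the count of 18 is reached by including U and the two gap symbols alongside the other bases and ambiguity codes; the names IUPAC_CODES and possible_bases are made up for this example.

```python
# Each code maps to the DNA bases it can stand for.
# Assumption: 18 = A, C, G, T, U + 11 ambiguity codes + 2 gap symbols.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "U": "U",                      # uracil, used in RNA
    "R": "AG", "Y": "CT",          # purine, pyrimidine
    "S": "CG", "W": "AT",          # strong, weak
    "K": "GT", "M": "AC",          # keto, amino
    "B": "CGT", "D": "AGT",        # not A, not C
    "H": "ACT", "V": "ACG",        # not G, not T
    "N": "ACGT",                   # any base
    ".": "", "-": "",              # gap symbols
}


def possible_bases(code):
    """Return the set of bases a single IUPAC code can represent."""
    return set(IUPAC_CODES[code.upper()])


print(len(IUPAC_CODES))
print(sorted(possible_bases("N")))
```

N is the least committed code in the table: it is compatible with every base, which is what makes it a fitting answer for a career that has not yet been resolved.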