10 years of Open Access at the Wellcome Trust in 10 numbers [Link]

A great summary of how the Wellcome Trust has helped drive big changes in open access publishing. Of the ten numbers that the post uses to summarise the last decade, this one surprised me the most:

20% – the volume of UK-funded research which is freely available at the time of publication
A recent study commissioned by Universities UK found that 20% of articles authored by UK researchers and published in the last two years were freely accessible upon publication. This figure increases to 24% within six months of publication, and 32% within 12 months.

If you had asked me to guess what this number would be, I think I would have been far too optimistic. Even the figure of 32% of articles being free within 12 months seems lower than I would have imagined. Lots of progress still to be made!

Teaser: a solution for our read mapping dilemma?

A paper recently published in Genome Biology by Smolka et al. may help with the problem of choosing which read mapping program to use when aligning a set of sequencing reads to a genome.

The paper starts by neatly summarising the problem:

Recent and ongoing advances in sequencing technologies and applications lead to a rapid growth of methods that align next generation sequencing reads to a reference genome (read mapping). By mid 2015, nearly 100 different mappers are available, although not all are equally suited for a given application or dataset.

The program Teaser attempts to automate the benchmarking of not just different mappers, but also (some of) the different parameters that are available to these programs. The latter problem should not be underestimated. The Bowtie 2 documentation describes almost 100 different command-line options, and many of these options control how Bowtie 2 runs and/or what output it generates.

Teaser uses small sets of simulated read data, leading to very quick run times (< 30 minutes for many comparisons), but you can also supply real data to it. By default, Teaser will test the performance of five read mapping programs: BWA, BWA-MEM, BWA-SW, Bowtie2, and NextGenMap.
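To make the idea concrete, here is a minimal sketch of this kind of benchmarking. It is not Teaser's actual code. It assumes that simulated reads come with known true positions, so each combination of parameter values can be scored by the fraction of reads placed correctly. Every name in it (`fraction_correct`, `parameter_grid`, `benchmark`, `toy_mapper`, `seed_length`, `max_mismatches`) is hypothetical, and the toy mapper stands in for a call to a real mapping program.

```python
from itertools import product


def fraction_correct(true_pos, reported_pos, tolerance=5):
    """Share of simulated reads reported within `tolerance` bp of their true position."""
    correct = sum(
        1
        for read_id, pos in true_pos.items()
        if read_id in reported_pos and abs(reported_pos[read_id] - pos) <= tolerance
    )
    return correct / len(true_pos)


def parameter_grid(options):
    """Yield every combination of parameter values as a dict."""
    names = sorted(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def benchmark(run_mapper, true_pos, options):
    """Score each parameter combination and return (score, params) pairs, best first."""
    results = [
        (fraction_correct(true_pos, run_mapper(params)), params)
        for params in parameter_grid(options)
    ]
    results.sort(key=lambda result: result[0], reverse=True)
    return results


# Hypothetical simulated data: read id -> true position on the reference.
true_pos = {"r1": 100, "r2": 250, "r3": 900}


def toy_mapper(params):
    """Stand-in for a real mapper: pretends that long seeds fail to place read r3."""
    reported = {"r1": 100, "r2": 252}
    if params["seed_length"] <= 20:
        reported["r3"] = 900
    return reported


# Hypothetical parameter names and values, not taken from any real mapper.
options = {"seed_length": [20, 28], "max_mismatches": [0, 1]}
ranked = benchmark(toy_mapper, true_pos, options)
for score, params in ranked:
    print(f"{score:.2f}", params)
```

With real data the scoring step stays the same, but `run_mapper` would have to launch the mapping program and parse its output, which is where most of the work (and the run time) goes.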

Impressively, you can run Teaser either on the web or as a standalone program. The web output displays results graphically for many different test datasets (shown on the x-axis).

The paper concludes by asking the community to submit optimal parameter combinations to the Teaser GitHub repository:

Teaser is easy to use and at the same time extendable to other methods and parameters combinations. Future work will include the incorporation of benchmarking RNA-Seq mappers and variant calling methods. We furthermore encourage the scientific community to contribute the optimal parameter combinations they detected to our github repository (available at github.com/Cibiv/Teaser) for their particular organism of interest. This will help others to quickly select the optimal combination of mapper and parameter values using Teaser.

I can't wait for the companion program Firecat!

 

2015-10-26 11.05: Updated to remove specific references to software versions of mapping tools.


Help us do science! I’ve teamed up with researcher Paige Brown Jarreau to create a survey of ACGT readers. By participating, you’ll be helping me improve ACGT and contributing to the SCIENCE on blog readership. You will also get FREE science art from Paige's Photography for participating, as well as a chance to win a t-shirt and other perks! It should only take 10–15 minutes to complete.

You can find the survey here: http://bit.ly/mysciblogreaders

ORCID: binding the (academic) galaxy together

Adapted from picture by flickr user Jim & Rachel McArthur

I am a supporter of ORCID's goal of establishing unique identifiers for researchers. Such identifiers can then be used to connect a researcher with all of the inputs and outputs of their career. Most fundamentally, these inputs and outputs are grants and papers, but there is the potential for ORCID identifiers to link a person to much more, e.g. the organisations that they work for, manuscript reviews, code repositories, published slides, even blog posts.
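As an aside on what such an identifier looks like in practice: an ORCID iD is 16 characters, usually written in four hyphen-separated blocks, and its last character is a check digit computed with the ISO 7064 MOD 11-2 algorithm. The sketch below validates that check digit. The function names are my own, and it only checks that an iD is well formed, not that it is registered to anyone.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed ORCID iD with a correct check digit."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if not all(ch in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the iD ORCID uses for its fictional example researcher.
print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this catches most typing errors before an iD is stored alongside a grant or paper record.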

For ORCID to succeed it has to be global and connect all parts of the academic network, a network that spans national boundaries. On this point, I am very impressed by the effort that ORCID makes to ensure that their excellent outreach materials are not only available in English. ORCID's 'Distinguish yourself' flyer is available in 9 different languages. Other material is also available in Russian, Greek, Turkish, and Danish. If your desired language is not available, they welcome volunteers to help translate their message into more languages. Email community@orcid.org if you want to help.

Welcome to the JABBA menagerie: a collection of animal-themed, bogus bioinformatics names…that have nothing to do with animals!

Bioinformaticians make the worst zookeepers:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Other suggestions welcome! The only requirements are that:

  1. The name is bogus, i.e. not a straightforward acronym and worthy of a JABBA award
  2. The acronym is named after an animal (or animal grouping)
  3. The software/tool has nothing to do with the animal in question