ORCID: binding the (academic) galaxy together

Adapted from picture by flickr user Jim & Rachel McArthur

I support ORCID's goal of establishing unique identifiers for researchers. Such identifiers can help connect a researcher with all of the inputs and outputs of their career. Most fundamentally, these inputs and outputs are grants and papers, but ORCID identifiers have the potential to link a person to much more, e.g. the organisations that they work for, manuscript reviews, code repositories, published slides, even blog posts.

For ORCID to succeed it has to be global and connect all parts of the academic network, a network that spans national boundaries. On this point, I am very impressed by the effort that ORCID makes in ensuring that their excellent outreach materials are not only available in English. As shown below, ORCID's 'Distinguish yourself' flyer is available in 9 different languages. Other material is also available in Russian, Greek, Turkish, and Danish. If your desired language is not available, they welcome volunteers to help translate their message into more languages. Email community@orcid.org if you want to help.

Welcome to the JABBA menagerie: a collection of animal-themed, bogus bioinformatics names…that have nothing to do with animals!

Bioinformaticians make the worst zookeepers:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Other suggestions welcome! The only requirements are that:

  1. The name is bogus, i.e. not a straightforward acronym and worthy of a JABBA award
  2. The acronym is named after an animal (or animal grouping)
  3. The software/tool has nothing to do with the animal in question

Great Scott! Five fun facts about DNA sequencing from 1985

As everyone is celebrating a certain 2015-themed calendar event today, I thought we could instead go back to the future past of DNA sequencing.

1. Thirty years ago there were no automated sequencing machines. However, Sanger sequencing technology could still provide longer reads than most of Illumina's machines today, e.g. from this paper (A rapid procedure for DNA sequencing using transposon-promoted deletions in Escherichia coli):

The length of the sequence that could be read from each gel in a single run varied from 175 to 200 nt.

2. The idea of sequencing nuclear genomes was still largely a pipe dream, but smaller genomes were tractable. 1985 saw the addition of the Xenopus laevis mitochondrial genome to the tiny collection of organelle genome sequences. Figure 3 of this paper displayed the full sequence, spread over six pages that looked like this:

Including long DNA sequences in journal articles was a surprisingly common practice at this time.

3. There were two releases of GenBank in 1985. The second release saw the database grow to an astounding 5,700 sequences, totalling 5,204,420 bp. For comparison, 1985 also saw the release of the Commodore 128 home computer, which came with 128 KB of RAM. The first 3.5" hard drives were only a couple of years old and could store 10 MB (enough to hold the DNA sequences in GenBank, but possibly not the associated annotation).
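To put the GenBank numbers in perspective, here is a rough back-of-the-envelope sketch of the storage arithmetic. The base count, drive size, and RAM size come from the text above; everything else is an assumption made for illustration: one byte per base for plain text, two bits per base for a packed encoding (which only works if the sequences contain nothing but A, C, G, and T), decimal megabytes for the drive, and no allowance at all for annotation or file system overhead.

```python
# Figures taken from the post
GENBANK_BP = 5_204_420        # total bases in the second 1985 GenBank release

# Assumptions for illustration only
DRIVE_BYTES = 10 * 10**6      # 10 MB drive, assuming decimal megabytes
C128_RAM_BYTES = 128 * 1024   # 128 KB of RAM, assuming 1 KB = 1024 bytes


def storage_bytes(n_bases, bits_per_base):
    """Bytes needed to hold n_bases at a fixed number of bits per base, rounded up."""
    return (n_bases * bits_per_base + 7) // 8


ascii_bytes = storage_bytes(GENBANK_BP, 8)   # one text character per base
packed_bytes = storage_bytes(GENBANK_BP, 2)  # A/C/G/T packed into 2 bits each

for label, size in (("plain text", ascii_bytes), ("2-bit packed", packed_bytes)):
    print(
        f"{label}: {size:,} bytes, "
        f"{size / DRIVE_BYTES:.0%} of the drive, "
        f"{size / C128_RAM_BYTES:.1f}x the Commodore 128's RAM"
    )
```

Under these assumptions the sequences alone take up roughly half of the drive as plain text, which is why there may not have been room left for the annotation.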

4. The SEQ-ED program was published, allowing the handling of 'long DNA sequences' that were 'up to 200 Kbp'.

5. Somewhat amazingly, people were writing bioinformatics software for Apple computers. The journal CABIOS included this paper:

But how did people distribute software in the days when there was no GitHub, SourceForge, or indeed…no world wide web?

For both code and source of PEGASE, please send two blank 5" diskettes and indicate precisely your system configuration (there is a slight difference between the Apple II+ and the Apple IIe version which depends on the availability of lower case characters).