Sowing the seeds of bad bioinformatics names

Here are two simple pieces of advice for people who are looking for a name for their latest bioinformatics tool/database/resource:

  1. Avoid common words which might cause people searching for your tool to find something else instead.
  2. Choose a name that hasn't been used before by the bioinformatics community.

Having said that, let's look at a new paper in the journal Bioinformatics:

Seed: a user-friendly tool for exploring and visualizing microbial community data

The name 'Seed' is a not-too-offensive acronym for Simple Exploration of Ecological Data. So what's my beef with it?

The problem is that words like seed are going to appear all over the Internet. My standard test for the 'searchability' of a bioinformatics tool is to search for the tool name followed by the word 'bioinformatics'. Your resource's website or publication should hopefully be the number one result (or somewhere on the first page). However, that is not what happens here.
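For anyone who wants to run this test on a shortlist of candidate names, here is a minimal Python sketch. It only builds the query and checks a list of result URLs that you have copied from the first page of results by hand; it does not scrape any search engine. The function names, the Google query URL used as the default, and the example.com domain are my own illustrative assumptions.

```python
from urllib.parse import quote_plus, urlparse


def searchability_query(tool_name: str) -> str:
    """Return the test query: the tool name followed by 'bioinformatics'."""
    return f"{tool_name} bioinformatics"


def search_url(tool_name: str,
               base: str = "https://www.google.com/search?q=") -> str:
    """Build a URL you can open in a browser (the base URL is an assumption)."""
    return base + quote_plus(searchability_query(tool_name))


def first_page_rank(result_urls, tool_domain, page_size=10):
    """Return the 1-based rank of the first result hosted on tool_domain.

    result_urls is a hand-collected list of result links, in order.
    Returns None if the tool's site is not on the first page.
    """
    for rank, url in enumerate(result_urls[:page_size], start=1):
        host = urlparse(url).netloc.lower()
        if host == tool_domain or host.endswith("." + tool_domain):
            return rank
    return None


if __name__ == "__main__":
    print(search_url("Seed"))
    # Hypothetical results pasted in from a browser:
    results = ["https://example.org/other-seed", "https://www.example.com/seed"]
    print(first_page_rank(results, "example.com"))
```

A rank of 1 is what you are hoping for; a result of None means the name fails the test.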

And searching for 'seed bioinformatics' raises more problems by clashing with my second piece of advice. For example, here are a couple of papers that were in my first page of Google results:

2010: Accessing the SEED Genome Databases via Web Services API: Tools for Programmers

2011: SEED: efficient clustering of next-generation sequences

So what happens if you include 'microbial' in your search terms? Won't that help?

Nope. Turns out that the SEED (not an acronym, as far as I can tell) is an annotation environment for microbial genomes that has been around for a decade and has spawned many papers, e.g.:

2014: The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

All of which means that people looking for the newly published Seed tool are not going to have much luck when using search engines.