Another hard-to-pronounce bioinformatics software name

This was from a few months ago, published in the journal Nucleic Acids Research:

So how do you pronounce 'FunFHMMer'? I can imagine several possibilities:

  1. Fun-eff-aitch-em-em-er
  2. Fun-eff-aitch-em-mer
  3. Fun-eff-hammer
  4. Fünf-hammer

Reading the manuscript suggests that 'FunF' stems from 'FunFam(s)' which in turn is derived from 'functional families'. This would suggest that options 1 or 3 above might be the correct way to pronounce this software's name.

The fully expanded description of this web server's name becomes a bit of a mouthful:

Class Architecture Topology Homologous Superfamily Functional Families Hidden Markov Model (maker?)

If you want your bioinformatics software to have a memorable name, it helps if the name is pronounceable

Image from she is geeky blog

There is a new paper in the journal Bioinformatics:

The paper describes a new method for implementing a Principal Components Analysis (PCA) of data. That new method has a name. That name has just seven characters. How hard can it be to pronounce?

  • S4VDPCA: ess-four-vee-dee-pee-cee-ay

It doesn't exactly trip off the tongue, and having four 'ee-sounding' letters together (VDPC) doesn't make it easy to remember. When I first came across this paper, I skimmed the article, waited an hour, and then tried to remember the name. I could remember that it included '4', 'V', and 'D', but couldn't remember the order (or that it also included an 'S').

It is by no means essential that bioinformatics tools have easily pronounceable names, but this will help people remember the name of your software. In turn, this makes it easier for people to tell others about your software. I don't imagine that bioinformatics software developers ever want to overhear the following type of conversation:

Bob: "You should use that tool"

Sue: "What tool?"

Bob: "Umm, you know that PCA thingy. The S…something, something…PCA tool"

Sue: "The what?"

Bob: "Run a Google search for Bioinformatics PCA tools, it's probably the top hit."

Sue: <- facepalm ->

Unpronounceable bioinformatics database names

First, a quick reminder that an acronym is something that is meant to be pronounced as an entire word (e.g. NATO, AIDS). Sometimes these end up becoming regular, non-capitalized words (e.g. radar, laser).

In contrast, an initialism is something where the component letters are read out individually (e.g. BBC, CPU). In bioinformatics, there are also names which are part acronym and part initialism (e.g. GWAS…which I have only ever heard pronounced as gee-was).

Most initialisms that we use in everyday life tend to be short (2–4 letters) because this makes them easier to read and to pronounce. As you move past 4 letters, you run the risk of making your initialism unpronounceable and unmemorable.

So here are some recently published bioinformatics tools with names that are a bit cumbersome to repeat. For each one I include how someone might try to pronounce them. Try repeating these names quickly and for an added test, see how many of these names you can remember 5 minutes after you read this:

5 characters

6 characters

7 characters

And the winner goes to…

Conclusions

If you want people to actually use your bioinformatics tools, then you should aim to give them names that are memorable and pronounceable.

How would you pronounce the name of this bioinformatics tool?

From the latest issue of Bioinformatics we have a new tool that is an R package for the analysis of GWAS studies. Rather than name the tool, I want you all to first see it exactly as it appears in the journal:

The first character in the name of this software is a character which can often be hard to identify, particularly when certain fonts make it look like it could be the letters L or I, or even the number 1.

This is not a name that is worthy of a JABBA-award, but it does fall into my category of posts which I call almost JABBA, for software names that have various other issues. The particular issue in this case is that the name is hard to read and therefore hard to pronounce. I feel that the use of lower-case characters makes it more likely that the reader will attempt to pronounce this as a word, rather than read it as an initialism. For example, maybe you saw this name and read it as 'Lurgpurr', or 'Ergpurr'.

The reason behind the name is not explained in the article, but when you go to the linked software page, all is revealed:

It's a bit odd that one of the five words that appear in this name ('Gaussian') doesn't get mentioned anywhere in the paper. But more importantly, why did they feel the need for using lower-case characters? 'LRGPR' would have been much easier to read and comprehend than the font-dependent 'lrgpr'.

Unpronounceable — why can't people give bioinformatics tools sensible names?

Okay, so many of you know that I have a bit of an issue with bioinformatics tools with names that are formed from very tenuous acronyms or initialisms. I've handed out many JABBA awards for cases of 'Just Another Bogus Bioinformatics Acronym'. But now there is another blight on the landscape of bioinformatics nomenclature…that of unpronounceable names.

If you develop bioinformatics tools, you would hopefully want to promote those tools to others. This could be in a formal publication, or at a conference presentation, or even over a cup of coffee with a colleague. In all of these situations, you would want the name of your bioinformatics tool to be memorable. One way of making it memorable is to make it pronounceable. Surely, that's not asking that much? And yet…

There is a lot of bioinformatics software in this world. If you choose to add to this ever-growing software catalog, then it will be in your interest to make your software easy to discover and easy to promote. For your own sake, and for the sake of any potential users of your software, I strongly urge you to ask yourself the following five questions:

  1. Is the name memorable?
  2. Does the name have one obvious pronunciation?
  3. Could I easily spell the name out to a journalist over the phone?
  4. Is the name of my database tool free from any needless mixed capitalization?
  5. Have I considered whether my software name is based on such a tenuous acronym or initialism that it will probably end up receiving a JABBA award?
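
A few of these questions can be turned into crude automated checks. The Python sketch below is only an illustration: the function name `name_warnings`, the four-character threshold (borrowed from the 2–4 letter observation about initialisms), and the three rules themselves are assumptions made for this example, not an established test of pronounceability. Questions 1, 3, and 5 still need a human.

```python
import re

# Characters singled out above as easy to confuse in some fonts.
AMBIGUOUS_CHARS = {"l", "I", "1"}

# Assumed threshold: initialisms seem to get harder to say past 4 letters.
MAX_UNVOICED_RUN = 4


def name_warnings(name):
    """Return a list of heuristic warnings for a proposed software name."""
    warnings = []

    # Longest run of characters containing no vowel, i.e. characters that
    # a reader would probably have to spell out one at a time.
    runs = re.findall(r"[^aeiouAEIOU]+", name)
    longest = max((len(run) for run in runs), default=0)
    if longest > MAX_UNVOICED_RUN:
        warnings.append(f"{longest} characters in a row with no vowel")

    # An upper-case letter directly after a lower-case one is treated
    # as a sign of mixed capitalization.
    if re.search(r"[a-z][A-Z]", name):
        warnings.append("mixed capitalization")

    # Flag a first character that certain fonts render ambiguously.
    if name and name[0] in AMBIGUOUS_CHARS:
        warnings.append(f"ambiguous first character {name[0]!r}")

    return warnings


for candidate in ["FunFHMMer", "S4VDPCA", "lrgpr", "NATO"]:
    print(candidate, name_warnings(candidate))
```

With these rules, 'NATO' passes cleanly while the three tool names discussed in these posts each trigger at least one warning. A name that passes is not necessarily a good one; the checks only catch the most mechanical problems.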