Another hard-to-pronounce bioinformatics software name

This was from a few months ago, published in the journal Nucleic Acids Research:

So how do you pronounce 'FunFHMMer'? I can imagine several possibilities:

  1. Fun-eff-aitch-em-em-er
  2. Fun-eff-aitch-em-mer
  3. Fun-eff-hammer
  4. Fünf-hammer

Reading the manuscript suggests that 'FunF' stems from 'FunFam(s)' which in turn is derived from 'functional families'. This would suggest that options 1 or 3 above might be the correct way to pronounce this software's name.

The fully expanded description of this web server's name becomes a bit of a mouthful:

Class Architecture Topology Homologous Superfamily Functional Families Hidden Markov Model (maker?)

If you want your bioinformatics software to have a memorable name, it helps if the name is pronounceable

Image from the "she is geeky" blog

There is a new paper in the journal Bioinformatics:

The paper describes a new method for implementing a Principal Component Analysis (PCA) of data. That new method has a name. That name has just seven characters. How hard can it be to pronounce?

  • S4VDPCA: ess-four-vee-dee-pee-cee-ay

It doesn't exactly trip off the tongue, and having four 'ee-sounding' letters together (VDPC) doesn't make it easy to remember. When I first came across this paper, I skimmed the article, waited an hour, and then tried to remember the name. I could remember that it included '4', 'V', and 'D', but couldn't remember the order (or that it also included an 'S').

It is by no means essential that bioinformatics tools have easily pronounceable names, but this will help people remember the name of your software. In turn, this makes it easier for people to tell others about your software. I don't imagine that bioinformatics software developers ever want to overhear the following type of conversation:

Bob: "You should use that tool"

Sue: "What tool?"

Bob: "Umm, you know that PCA thingy. The S…something, something…PCA tool"

Sue: "The what?"

Bob: "Run a Google search for Bioinformatics PCA tools, it's probably the top hit."

Sue: <- facepalm ->

Unpronounceable bioinformatics database names

First, a quick reminder that an acronym is something that is meant to be pronounced as an entire word (e.g. NATO, AIDS). Sometimes these end up becoming regular, non-capitalized words (e.g. radar, laser).

In contrast, an initialism is something where the component letters are read out individually (e.g. BBC, CPU). In bioinformatics, there are also names which are part acronym and part initialism (e.g. GWAS, which I have only ever heard pronounced as gee-was).

Most initialisms that we use in everyday life tend to be short (2–4 letters) because this makes them easier to read and to pronounce. As you move past 4 letters, you run the risk of making your initialism unpronounceable and unmemorable.
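To make the length point concrete, here is a rough Python sketch that spells a name out character by character and flags anything longer than four characters. The spoken spellings in the table, the function names, and the `max_len=4` cutoff are all my own assumptions for illustration; this is not a real measure of pronounceability.

```python
# Hypothetical spoken spellings for letters and digits (informal, not a standard).
SPOKEN = {
    "A": "ay", "B": "bee", "C": "cee", "D": "dee", "E": "ee", "F": "eff",
    "G": "gee", "H": "aitch", "I": "eye", "J": "jay", "K": "kay", "L": "el",
    "M": "em", "N": "en", "O": "oh", "P": "pee", "Q": "cue", "R": "ar",
    "S": "ess", "T": "tee", "U": "you", "V": "vee", "W": "double-you",
    "X": "ex", "Y": "why", "Z": "zed",
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def spell_out(name: str) -> str:
    """Read a name aloud as an initialism, one character at a time."""
    # Characters missing from the table are passed through unchanged.
    return "-".join(SPOKEN.get(ch, ch) for ch in name.upper())


def is_long_initialism(name: str, max_len: int = 4) -> bool:
    """Flag names longer than the assumed 'comfortable' length of 4 characters."""
    return len(name) > max_len


if __name__ == "__main__":
    for tool in ["BBC", "CPU", "S4VDPCA"]:
        print(tool, "->", spell_out(tool), "| too long?", is_long_initialism(tool))
```

Running it on 'BBC' and 'CPU' gives three short syllables each, while 'S4VDPCA' comes out as seven and gets flagged.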

So here are some recently published bioinformatics tools with names that are a bit cumbersome to repeat. For each one I include how someone might try to pronounce them. Try repeating these names quickly and for an added test, see how many of these names you can remember 5 minutes after you read this:

5 characters

6 characters

7 characters

And the winner goes to…

Conclusions

If you want people to actually use your bioinformatics tools, then you should aim to give them names that are memorable and pronounceable.