Learn my Linux Bootcamp…all from within a web browser window

I awoke yesterday to see a lot of Twitter notifications on my phone. Sometimes this happens when I've written a post on this blog, but I hadn't added anything for over a week. It turns out that the activity was triggered by this tweet by Richard Smith-Unna (@blahah404 on Twitter):

As the screenshot below indicates, Richard has worked some amazing black magic to enable a single browser window to contain a fully interactive terminal as well as a file viewer/navigator; all alongside a (slightly modified) version of my original Linux bootcamp material.


This new interactive command-line bootcamp is a wonderful resource and means that the only barrier to learning some simple, but powerful, Linux/Unix commands is the availability of a web browser.

Richard explains a little about how he put all of this together:

The Infrastructure, including adventure-time and docker-browser-server, was built by @maxogden and @mafintosh. The setup of this app was based on the get-dat adventure.

Command-line bootcamp: learn the basics of Unix

Here is another contribution that I made to a UC Davis Bioinformatics Workshop that I helped teach last week. Adapting from some of our much longer Unix & Perl Primer for Biologists, I made a short bootcamp that aims to teach the basics of the Unix/Linux command-line.

Unlike the Primer material, which was written from the point of view of someone using a Mac, the new bootcamp course is written from the viewpoint of someone using Ubuntu Linux. Also, no example files are needed. The course is entirely self-contained and should take 1–3 hours to work through (depending on your familiarity with Unix).

Download the PDF, view the HTML version, or work with the underlying Markdown file.

Current version: v1.01 — 2015-06-24

Regarding the current state of bioinformatics training

Todd Harris (@tharris) is on a bit of a roll at the moment. Last month I linked to his excellent blog post regarding community annotation, and today I find myself linking to his latest blog post:

Todd makes a convincing argument that bioinformatics education has largely failed, and he lists three reasons for this, the last of which is as follows:

Finally, the nature of much bioinformatics training is too rarefied. It doesn’t spend enough time on core skills like basic scripting and data processing. For example, algorithm development has no place in a bioinformatics overview course, more so if that is the only exposure to the field the student will have.

I particularly empathize with this point. There should be a much greater emphasis on core data processing skills in bioinformatics training, but ideally students should be getting access to some of these skills at an even earlier age. Efforts such as the Hour of Code initiative are helping raise awareness regarding the need to teach coding skills — and it's good to see the President join in with this — but it would be so much better if coding was part of the curriculum everywhere. As Steve Jobs once said:

"I think everybody in this country should learn … a computer language because it teaches you how to think … I view computer science as a liberal art. It should be something that everybody learns, takes a year in their life, one of the courses they take is learn how to program" — Steve Jobs, 1995.

Taken from 'Steve Jobs: The Lost Interview'

Maybe this is still a pipe dream, but if we can't teach useful coding skills to everyone, we should at least be doing this for everyone who is considering any sort of career in the biological sciences. During my time at UC Davis, I've helped teach some basic Unix and Perl skills to many graduate students, but frustratingly this teaching has often come at the end of their first year in grad school. By this point in their graduate training, they have often already encountered many data management problems and have not been equipped with the necessary skills to help them deal with those problems.

I think that part of the problem is that we still use the label 'bioinformatics training' and this reinforces the distinction from a more generic 'biological training'. It may once have been the case that bioinformatics was its own specialized field, but today I find that bioinformatics mostly just describes a useful set of data processing skills…skills which will be needed by anybody working in the life sciences.

Maybe we need to rebrand 'bioinformatics training', and use a name which better describes the general importance of these skills ('Essential data training for biologists'?). Whatever we decide to call it, it is clear that we need it more than ever. Todd ends his post with a great piece of advice for any current graduate students in the biosciences:

You should be receiving bioinformatics training as part of your core curriculum. If you aren’t, your program is failing you and you should seek out this training independently. You should also ask your program leaders and department chairs why training in this field isn’t being made available to you.