Posts tagged with publishing

Bioinformatics : Sometimes you have to write, and not computer code either

March 2nd, 2007

You love making programs that reveal you as a programming god. You produce results that demonstrate you are, quite obviously, the greatest scientist of our generation. You show these to everyone in your office/corridor/family, and they all appreciate your magnificence. After a few talks at conferences comes the bit you’ve been avoiding - writing everything up.

I think I’m fair in saying that you went into bioinformatics for something other than writing reports and papers. At some point though, you have to.

Set a daily goal
When I have to write, I set a goal of 500 words a day, even if they are complete nonsense, which they usually are. Writing, like everything else, gets better with practice, and mine usually becomes more coherent the more I do. Once you’ve got everything written, even if it’s rubbish, you’ll feel much better. You can start editing for clarity later.

Throw away as much as possible
You’ve written huge amounts of text, so why bin all the effort? Because you’re writing for other people. Everyone has to expend energy to read your work. Think about how dull it is for you to read badly written text. It’s a boring activity that you have to force yourself to do. Now what if the document had instead been composed with the reader in mind? The point is put across in clear and simple terms. Every paragraph, sentence and word is there because it has to be.

Learn how to write well
Writing, like presenting, is a skill that doesn’t receive as much attention as it should. This is particularly true in science where, quite rightly, the emphasis is on scientific method and ability. But once you have your results, you need to communicate them. I’ve found that becoming better at writing makes you more confident, which makes writing more enjoyable and less arduous.
To this end, I recommend Strunk and White, “The Elements of Style”. Fortunately, this book is also available online.
Also check out this page for a practical introduction to writing.

Finally, if you’re taking one thing away from this post, please take this.

Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid all detail and treat his subjects only in outline, but that every word tell.
William Strunk Jr., The Elements of Style

Forty-two words were thrown away in revising this post. Coincidence?

The right graph, at the right time

February 28th, 2007

I think everyone would agree that the most important thing in science is results. The best scientists produce the most relevant and important results. Of course, the best results won’t matter if no one knows about them. Which is why we publish and give presentations.

Sometimes I see results in papers and presentations illustrated poorly, with graphs that don’t demonstrate the point to the reader or audience in the best possible way. Here I give examples of how data can be presented in different contexts, based on two of my favorite resources. The first is the R language for statistics, the other is Garr Reynolds’ Presentation Zen ideology.

A bad example
Here’s an extreme case, but not completely uncommon in presentations. It plots two continuous variables - the oxidation of ammonia to nitric acid, and air flow. The chart was produced in NeoOffice using the default options.
[Figure: Office example]
My initial complaint is the inappropriate x axis - the first half of the plot isn’t being used. The axis should begin around 40, where the data starts.

Next, the unattractive grey background and horizontal black lines. I personally find this style unpleasant, and would recommend that these always be removed.

Finally, the trend line. The magenta color is not particularly nice, and why is it so thick? The wide line makes the chart look clunky and inelegant. If you’re making a chart, you want people to look at it, and appreciate the data. You’ve spent months slaving away to produce a set of results, so why not put the extra effort into presenting them well?

Producing a graph for a paper
Here is the same data produced using the default plot function in R.
[Figure: R example]

What strikes me about R plots is how clean they appear. You could argue that it looks rather spartan, but the chart shows the data and nothing else. There are no frills, but then you want to illustrate your results efficiently. If the results aren’t that good, then no amount of fluffing will make them better. On the other hand, if the results are good, extra decoration distracts from the main point.
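For anyone who wants to try this, here is a minimal sketch of such a plot. It assumes the data are R's built-in stackloss data set, which matches the description above (an ammonia-to-nitric-acid oxidation plant, with an air flow column). Which columns went on which axis in the original figure is a guess, as are the axis labels.

```r
# Assumption: the built-in `stackloss` data set stands in for the data
# shown in the figures; the choice of columns and labels is illustrative.
data(stackloss)

# Default scatter plot: white background, no grid lines,
# axes scaled to the range of the data.
plot(stack.loss ~ Air.Flow, data = stackloss,
     xlab = "Air flow", ylab = "Stack loss")

# Least-squares trend line, drawn at the default (thin) width.
fit <- lm(stack.loss ~ Air.Flow, data = stackloss)
abline(fit, lwd = 1)
```

Everything the NeoOffice chart got wrong (unused axis range, grey background, thick trend line) is avoided here simply by leaving the defaults alone.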

Producing a graph for a presentation
Controversial, but I say don’t. If you can use a simpler way to show the result, do it. When looking at a chart in a paper, the reader has time to read the legend and think about what point it illustrates. I look at all the figures in a paper at least twice.

On the other hand, when presenting, you’ve usually got a limited time to get your point across. When you show a chart in a presentation, the audience has to look at many things: the axes, the points, the trend lines. This could distract from you, and your message.

What do you want to do in the time you have? You want to show your work as exciting and interesting to as many people as possible. How many times have you been in a presentation where there has been slide after slide of graphs? You can imagine that audience attention drops dramatically with each new plot. Here’s an example slide to illustrate how I would show the above data.

[Figure: Presentation example]

This shows the point succinctly, no distractions. Remember that you’ll be talking at the same time as well. If the audience wants more information, they can find you afterwards. You can direct them to the great figures that you included in your paper!

Of course, you’ll need to include a plot to demonstrate controversial and important results. The fewer plots you have prior to these, the more impact they, and therefore your point, will have. Garr Reynolds has some tips (point 6) on producing graphs for presentations.

Finally
I’d like to end this post by quoting the R help page on the subject of pie charts:

Pie charts are a very bad way of displaying information. The eye is good at judging linear measures and bad at judging relative areas. A bar chart or dot chart is a preferable way of displaying this type of data.

Cleveland (1985), page 264: “Data that can be shown by pie charts always can be shown by a dot chart. This means that judgements of position along a common scale can be made instead of the less accurate angle judgements.” This statement is based on the empirical investigations of Cleveland and McGill as well as investigations by perceptual psychologists.
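To make the alternative concrete, here is a small sketch of a dot chart using R's dotchart function. The category names and counts are invented purely for illustration and do not come from any real data set.

```r
# Hypothetical counts, made up for this example only.
counts <- c(Kinases = 42, Transporters = 35, Transcription = 28, Other = 15)

# Sort the values so that position along the common scale
# also gives the ranking at a glance.
sorted <- sort(counts)

# One dot per category on a shared horizontal axis.
dotchart(sorted, xlab = "Number of genes", pch = 19)
```

The same vector passed to pie() would ask the reader to compare angles, which is the harder judgement the help page warns about.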

10 simple rules for getting published

February 14th, 2007

A post at Nodal point discusses a PLoS Computational Biology article on the problems of being a post doc. The post also contains another interesting PLoS reference - 10 simple rules for getting published. The main points of the article are to read papers regularly and to be objective about research, especially your own.

UPDATE : There’s also a discussion of this at Biocurious


The dark side of bioinformatics data mining

February 13th, 2007

An anti-post to my previous post on data mining. The point is to illustrate how data mining can quickly become unrewarding and, worse, demotivating.

I spend much of my day analysing yeast high-throughput data, recently produced in the laboratory. On one hand I’m very lucky to have access to fresh data at many cellular levels. On the other hand, with all this information, I’m easily swept away by the number of variables I have access to - the dark side of data mining.
