Reinventing the wheel, badly

February 19th, 2007

I spent several hours today implementing a sequence analysis method from a paper I had read earlier. I created a database, downloaded yeast coding sequences, then coded the whole method up in Java. Shortly afterwards, a Google search showed that not only had a tool already been published to perform this analysis, but that the original method I implemented was flawed.

I think the lesson here is to always check what the goal of the research is before touching the keyboard. Do you sometimes find that it’s easy to get bogged down in the individual details of implementation, such as coding, rather than the higher scientific question? Coding is enjoyable and is one of the reasons why I like bioinformatics. But what I have to keep telling myself is that in the end it’s about the science rather than the details of implementation. If someone else has already created a tool, or even better produced the results I’m after, then I can skip this step and start the intended analysis. Plus my implementation is probably a lot worse than that of someone who has given the problem a lot more consideration.

3 responses

  1. John Major comments:

    Very true!

    I try to drive this point to death with the students who take our programming and informatics tools workshops. Always spend time hunting for existing solutions.

  2. Mike comments:

    Thanks John.
    I wish I could say that this is the last time I’ll make this mistake, but probably not.
    One way to get around this might be to do a Google search each time I plan on creating some code.
    The open-bio libraries are usually a good source of pre-created functions too.

  3. Bioinformatics Zen » How to avoid errors when processing CSV files pings back:

    [...] Importantly, by using a third-party library you follow another programming best practice: don’t reinvent the wheel. [...]
