Ad-hoc work in bioinformatics can involve an immense number of operations and tasks that need to be performed to achieve a certain goal. Individually, these are often regarded as rather “standard” or “routine”. Despite this, it is quite hard to find an authoritative set of “recipes” for how to do them.
Thus I started to think that there needs to be a collection of bioinformatics “recipes”: a sort of “cookbook” for common bioinformatics tasks.
As we are - according to some expert opinions - living in the Century of Biology, I found it interesting to reflect on Go’s usage within the field.
Go has some great features that make it really well suited for biology, such as:
A relatively simple language that can be learned in a short time, even by people without a CS background. This is a super important aspect for biologists.
Fantastic support for cross-compilation to all major computer architectures and operating systems, as static, self-sufficient executables, making it extremely simple to deploy tools. That is something that can’t be said about Python, currently the most popular language in bioinformatics.
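To give a feel for how little code a typical small bio tool needs in Go, here is a minimal sketch of a reverse-complement function. The function name `ReverseComplement`, the choice to map unknown characters to `N`, and the example sequence are my own assumptions for illustration, not taken from any particular library.

```go
package main

import (
	"fmt"
	"strings"
)

// complement maps each DNA base to its Watson-Crick partner.
var complement = map[rune]rune{'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C', 'N': 'N'}

// ReverseComplement returns the reverse complement of a DNA sequence.
// Input is upper-cased first; any character outside A/C/G/T/N becomes 'N'
// (an assumption made for this sketch).
func ReverseComplement(seq string) string {
	bases := []rune(strings.ToUpper(seq))
	out := make([]rune, len(bases))
	for i, base := range bases {
		c, ok := complement[base]
		if !ok {
			c = 'N'
		}
		// Write the complemented base at the mirrored position.
		out[len(bases)-1-i] = c
	}
	return string(out)
}

func main() {
	fmt.Println(ReverseComplement("ATGCGT")) // prints "ACGCAT"
}
```

Since this uses only the standard library, it can be cross-compiled by setting the `GOOS` and `GOARCH` environment variables before `go build`, for example `GOOS=linux GOARCH=amd64 go build`, which produces a single executable that can be copied to the target machine.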
I read this excellent article with practical recommendations on how to organize a computational project, in terms of directory structure.
Directory structure matters
The importance of a good directory structure often seems to be overlooked in teaching about computational biology, but it can be the difference between a successful project and one where every change or re-run of some part of a workflow requires days of manual fiddling to get hold of the right data, in the right format, in the right place, with the right version of the workflow and the right parameters, and then manage to run it without errors.
It turned out I didn’t have the time and strength to blog every day at the NGS Bioinformatics Intro course, so here comes a wrap-up with some random notes and tidbits from the last days, including some concluding remarks!
These days we started working on a more realistic NGS pipeline, for analysing re-sequencing samples (slides, tutorial).
First some outcome from this tutorial
What do I mean by “outcome”?
Today was the second day of the introductory course in NGS bioinformatics that I’m taking as part of my PhD studies.
For me it started with a substantial oversleep, probably due to a combination of an annoying cold and the ~2 hour commute from south Stockholm to Uppsala and BMC. Thus I missed some really interesting material (and tutorial) on file types in NGS analysis, but I will make sure to go through that in my free time during the week.
Just finished day 1 of the introductory course on bioinformatics for next-generation sequencing data at SciLifeLab Uppsala. Attaching a photo from one of the hands-on tutorial sessions, with the tutorial leaders standing to the right.
Today’s content was mostly introductions to the Linux command line in general, and the UPPMAX HPC environment in particular, an area I’m already very familiar with after two years as a sysadmin at UPPMAX. Thus, today I mostly got to help out the other students a bit.
Right now I’m sitting on the train, trying to get my head around some of the pre-course materials.