2011-10-21

FASTQ must die! Long live SAM/BAM!


I think it is time to retire the FASTQ file format in favour of storing unaligned reads in SAM/BAM format. I will try to explain, as this may not immediately strike everyone as logical: SAM/BAM is primarily a sequence alignment/mapping format, while for "raw" reads FASTQ is near-ubiquitous in Next Generation Sequencing (NGS), more sensibly known as High Throughput Sequencing (HTS).
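To make the idea concrete, here is a rough sketch of how a single FASTQ record could be written as an unaligned SAM line. The function name is my own invention for illustration, and it assumes a four-line FASTQ record with Sanger (Phred+33) qualities, which is the encoding SAM uses for its QUAL field. An unmapped read gets FLAG 4, with `*` or `0` in the alignment columns.

```python
def fastq_record_to_unaligned_sam(lines):
    """Turn one four-line FASTQ record into an unaligned SAM line.

    Hypothetical helper for illustration; assumes Phred+33 qualities
    and no line wrapping inside the record.
    """
    name_line, seq, plus_line, qual = (line.rstrip("\n") for line in lines)
    if not name_line.startswith("@") or not plus_line.startswith("+"):
        raise ValueError("Not a four-line FASTQ record")
    if len(seq) != len(qual):
        raise ValueError("Sequence and quality lengths differ")
    # Keep only the identifier, dropping any description after whitespace.
    read_name = name_line[1:].split()[0]
    fields = [
        read_name,  # QNAME
        "4",        # FLAG: 0x4 means the read is unmapped
        "*",        # RNAME: no reference
        "0",        # POS
        "0",        # MAPQ
        "*",        # CIGAR
        "*",        # RNEXT
        "0",        # PNEXT
        "0",        # TLEN
        seq,        # SEQ
        qual,       # QUAL (Phred+33, same as Sanger FASTQ)
    ]
    return "\t".join(fields)


record = ["@read1 example description\n", "ACGT\n", "+\n", "IIII\n"]
print(fastq_record_to_unaligned_sam(record))
```

This ignores paired reads, which would need extra FLAG bits, and it drops the FASTQ description rather than storing it in an optional tag.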

2011-10-03

SAM/BAM without gapped reference

In my last post I talked about SAM/BAM with a gapped reference, and how this makes it much easier to work with inserted bases relative to the reference/consensus - especially for visualisation.

I should point out that some viewers do actually manage to show the inserts as columns even with the traditional ungapped/unpadded reference sequence - notably Gap5, Bambino, and the text-based samtools tview, as shown in these tview screenshots. In tview, press the "i" key to toggle this insert display, or "?" for help.
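As a rough sketch of what showing inserts as columns involves, the code below builds a gapped reference from an ungapped one by looking at the insertions (CIGAR `I` operations) in the reads. This is my own illustration, not how any of the viewers above actually do it: the function names are made up, the alignments are assumed to be simple (POS, CIGAR) pairs, and using `*` for the pad character is an assumption.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def insertions(pos, cigar):
    """Yield (ref_index, length) for each insertion in one alignment.

    ref_index is the 0-based reference position of the base that the
    inserted bases sit immediately before.
    """
    ref_index = pos - 1  # SAM POS is 1-based
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op == "I":
            yield ref_index, n
        elif op in "MDN=X":
            ref_index += n  # these operations consume reference bases
        # S, H and P do not consume the reference


def gapped_reference(reference, alignments, pad="*"):
    """Insert pad columns wide enough for the longest insert at each site."""
    widest = defaultdict(int)
    for pos, cigar in alignments:
        for ref_index, n in insertions(pos, cigar):
            widest[ref_index] = max(widest[ref_index], n)
    pieces = []
    for i, base in enumerate(reference):
        pieces.append(pad * widest[i])
        pieces.append(base)
    pieces.append(pad * widest[len(reference)])  # inserts after the last base
    return "".join(pieces)


reads = [(1, "4M2I4M"), (3, "2M1I3M")]
print(gapped_reference("ACGTACGT", reads))
```

Both example reads have an insertion between the fourth and fifth reference bases, so the gapped reference gets two pad columns there, enough for the longer insert.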