8.0 years ago
John
FASTA/FASTQ files have 1 entry per sequenced read, so a paired-end sequenced fragment will have two entries (usually in different files).
SAM/BAM files have 1 entry per read alignment, such that a paired-end sequenced fragment can have 1, 2, or 3+ entries.
Is there a format that has just 1 entry per fragment? It may or may not contain alignment info.
Thanks very much :)
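For the SAM side of the question, the 1, 2, or 3+ alignment records of a fragment all share the same QNAME (column 1), so they can at least be grouped back into one entry per fragment after the fact. A minimal sketch, assuming plain-text SAM lines held in memory; the function name `group_sam_by_fragment` and the toy records are made up for illustration:

```python
from collections import defaultdict

def group_sam_by_fragment(sam_lines):
    """Group SAM alignment lines by QNAME (column 1), one list per fragment."""
    fragments = defaultdict(list)
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        fragments[fields[0]].append(fields)
    return dict(fragments)

# Toy input (hypothetical records): one paired fragment and one single-end read.
sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "frag1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "frag1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tTTGA\tIIII",
    "frag2\t0\tchr1\t300\t60\t4M\t*\t0\t0\tGGCC\tIIII",
]
groups = group_sam_by_fragment(sam)
for name, records in groups.items():
    print(name, len(records))
```

This holds everything in a dictionary, so it only suits small files; for a name-sorted file one could stream consecutive records with the same QNAME instead.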
It is possible to get 4 entries per sequenced fragment if one separates the two tag reads into independent files (possible to do with Illumina data).
Where are we heading with this question by the way?
PS: Has the thesis been submitted?
Are we counting fastq files that have mates one after another in the same file? In that case, it's basically 8 lines per fragment.
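To make the "8 lines per fragment" point concrete, here is a rough sketch that walks an interleaved FASTQ 8 lines at a time and yields one record per fragment. It assumes strict 4-line FASTQ records (no wrapped sequence lines) with mates stored back to back; `read_interleaved_fastq` and the example reads are hypothetical:

```python
import io
from itertools import islice

def read_interleaved_fastq(handle):
    """Yield (read1, read2) per fragment; each read is (name, seq, qual)."""
    lines = iter(handle)
    while True:
        chunk = [line.rstrip("\n") for line in islice(lines, 8)]
        if not chunk:
            return
        if len(chunk) != 8:
            raise ValueError("truncated file: expected 8 lines per fragment")
        # Lines 0-3 are mate 1, lines 4-7 are mate 2; drop the leading "@".
        read1 = (chunk[0][1:], chunk[1], chunk[3])
        read2 = (chunk[4][1:], chunk[5], chunk[7])
        yield read1, read2

# Toy interleaved file with two fragments (made-up reads).
text = (
    "@frag1/1\nACGT\n+\nIIII\n"
    "@frag1/2\nTTGA\n+\nIIII\n"
    "@frag2/1\nGGCC\n+\nIIII\n"
    "@frag2/2\nCCAA\n+\nIIII\n"
)
fragments = list(read_interleaved_fastq(io.StringIO(text)))
for read1, read2 in fragments:
    print(read1[0], read2[0])
```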
And what genomax2 said regarding the thesis.
genomax - right, right, but I was looking for something where 1 DNA fragment goes into the sequencing machine and 1 row appears in the output file. I'm only curious because I was asked today at work if such a file format existed and I had no idea :-/
Devon - Like an interleaved FASTQ? Yeah, that would certainly work, but I was hoping for a format that I could point to and say "this format has 1 entry per sequenced fragment". They just asked out of curiosity, and after a brief search I couldn't find anything :(
Single-end sequencing would get you one record per fragment :)
If it is possible to merge the two reads (by overlap or local assembly, as BBMerge is able to do), then you would also be able to get one record per fragment (for those pairs that can be merged).
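As a toy illustration of the merge-by-overlap idea, the sketch below joins a pair on the longest exact overlap between read 1 and the reverse complement of read 2. This is not how BBMerge works internally (real mergers tolerate mismatches and use base qualities); it assumes the usual forward/reverse paired-end orientation, and `merge_pair`, `min_overlap`, and the sequences are invented for the example:

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G, T, N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def merge_pair(read1, read2, min_overlap=10):
    """Merge a read pair on the longest exact overlap, or return None."""
    r2 = reverse_complement(read2)
    # Try the longest possible overlap first, down to min_overlap.
    for olap in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        if read1[-olap:] == r2[:olap]:
            return read1 + r2[olap:]
    return None

# Simulate a 25 bp fragment sequenced with 18 bp reads from each end.
fragment = "ACGTTGCATGCCGATAGCTAGGCTA"
read1 = fragment[:18]
read2 = reverse_complement(fragment[-18:])
merged = merge_pair(read1, read2)
print(merged == fragment)
```

Pairs whose fragment is longer than the two reads combined have no overlap and come back as `None`, which is why merging only gives one record per fragment for some of the data.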