3.6 years ago
daewowo
I aligned to a genome using bowtie2 and saved the unaligned reads to a SAM file. That SAM file has no header. I now want to align this 'unaligned' SAM file to a genome, but I can't convert it to FASTQ using gatk SamToFastq because gatk needs a header. Is there any way to get around this?
reformat doesn't like the headerless SAM either:
java.lang.AssertionError: Missing field 1: GGCAGCACGGAGCCAGGCCAATGAGGGGACCCCACCTGGACGCCATCGCCACCCAGGGCCAGACCATGGGGCGGGCTGCAGGGTGTGGGCCAGGTGCTGGGAGGGGCAGGGGCAGGGGCAGAGGAGGAAGTGAGGTCCTGGCTCCAATCC
    at stream.SamLine.<init>(SamLine.java:491)
    at stream.SamReadInputStream.toReadList(SamReadInputStream.java:119)
    at stream.SamReadInputStream.fillBuffer(SamReadInputStream.java:90)
    at stream.SamReadInputStream.hasMore(SamReadInputStream.java:54)
    at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:667)
    at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:656)
The actual first line of the SAM file is @SRAnumber.n.n, and the next line is the base sequence shown in the error above.
Can you post the actual first 2-3 lines from the file? It sounds like it is not in SAM format.
It actually looks like FASTQ already; here is the first read
Correct. So no conversion needed.
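One way to check this before converting anything is to look at the first record. Below is a minimal sketch, assuming a POSIX shell with awk; the function name guess_format and the file name in the usage comment are made up for illustration. It treats a file as FASTQ when line 1 starts with '@' and line 3 starts with '+', and as SAM when line 1 is a SAM header line or has at least 11 tab-separated fields. (If the file was written by bowtie2's --un option, it should already be in the same format as the input reads, so FASTQ in, FASTQ out.)

```shell
# Guess whether a file is FASTQ or SAM from its first few lines.
# guess_format is a hypothetical helper, not part of any tool in this thread.
guess_format() {
    awk -F'\t' '
        NR == 1 { first = $0; nf = NF }   # remember line 1 and its tab-field count
        NR == 3 { third = $0 }            # FASTQ separator line should be here
        NR == 4 { exit }                  # no need to read further
        END {
            if (first ~ /^@(HD|SQ|RG|PG|CO)\t/)   print "sam"     # SAM header line
            else if (substr(first, 1, 1) == "@" && substr(third, 1, 1) == "+")
                                                  print "fastq"   # 4-line FASTQ record
            else if (nf >= 11)                    print "sam"     # headerless alignment line
            else                                  print "unknown"
        }' "$1"
}

# Usage (placeholder file name):
#   guess_format unaligned.sam
```

If this prints fastq, the file can go straight back into bowtie2 as read input, whatever its extension says.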
In any case, I think you could just add a dummy header to that SAM file to make something like samtools fastq work. AFAIK the header has basically no meaning in SAM-to-FASTQ conversion, beyond satisfying the sanity check that the tools run up front. (untested)
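For a file that really is headerless SAM, the dummy-header idea could be sketched roughly as follows. This is an untested-on-real-data illustration, not a confirmed recipe: the function name add_dummy_header and all file names are placeholders, and it assumes the reads are unmapped (reference name '*'), since a bare @HD line declares no reference sequences.

```shell
# Prepend a minimal @HD header line to a headerless SAM file.
# add_dummy_header is a hypothetical helper; $1 = input SAM, $2 = output SAM.
add_dummy_header() {
    {
        printf '@HD\tVN:1.6\tSO:unsorted\n'   # minimal header: version and sort order only
        cat "$1"                              # original alignment lines, unchanged
    } > "$2"
}

# Usage (placeholder file names):
#   add_dummy_header unaligned.sam with_header.sam
#   samtools fastq with_header.sam > unaligned.fastq
```

Whether gatk SamToFastq accepts such a minimal header is an assumption that would need checking against the actual file.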