Hi everyone,
I have been struggling with this for a while. I got some published data that appears to be in Bowtie's map format, and I converted it to a SAM file using the bowtie2sam.pl script from the samtools misc directory.
I then converted it to BAM, and everything seems to be fine, using:
samtools view -bT arabidopsis.fa in.sam > out.bam
Except that I get a message along the lines of: Char1 treated as '*'
However, when I load the BAM file into IGV to view the reads along the genome, it loads but nothing shows up at any zoom level.
Loading the SAM file itself into IGV works fine; only the BAM file generated from it shows nothing.
I tried to validate the file with both samtools and other published tools, and I get:
22774604 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
22774604 + 0 mapped (100.00%:nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (nan%:nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (nan%:nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
The only thing I really care about is getting total read counts over a specified region, but I haven't got to that yet. So any ideas on how to extract read counts, and on this BAM issue, would be great.
Thanks
Alex
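Check your sequence names/IDs; I think you have them wrong. A message like "treated as '*'" suggests the reference name used in the SAM records is not one of the sequence names in arabidopsis.fa.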
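One way to check the names, as a minimal Python sketch rather than a definitive tool: it collects the sequence names from the FASTA headers (the first word after ">") and the reference names from column 3 of the SAM records, then reports any SAM names missing from the FASTA. The function names and the toy data are made up for illustration; on real data you would pass open file handles for arabidopsis.fa and in.sam instead.

```python
import io


def fasta_names(handle):
    """Return the set of sequence names (first word after '>') in a FASTA stream."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def sam_reference_names(handle):
    """Return the set of reference names (column 3, RNAME) used by SAM records."""
    names = set()
    for line in handle:
        # Header lines start with '@'; read names cannot start with '@'.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2:
            names.add(fields[2])
    # '*' means "no reference", so it is not a name to look up.
    names.discard("*")
    return names


def unmatched_names(fasta_handle, sam_handle):
    """Reference names that appear in the SAM records but not in the FASTA."""
    return sam_reference_names(sam_handle) - fasta_names(fasta_handle)


# Toy data (hypothetical): the FASTA has Chr1 and Chr2, one SAM record uses "Char1".
toy_fasta = ">Chr1 some description\nACGT\n>Chr2\nACGT\n"
toy_sam = (
    "@HD\tVN:1.0\n"
    "r1\t0\tChr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t0\tChar1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
)
print(sorted(unmatched_names(io.StringIO(toy_fasta), io.StringIO(toy_sam))))
```

If this reports any names on the real files, those records are the ones that cannot be placed on the reference.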
As everyone else in this thread said, make sure you use the same reference for mapping and for generating the BAM.
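On the read-count part of the question: since the SAM file itself looks fine, a rough count over a region can be taken directly from it. The following is a minimal Python sketch under some assumptions: coordinates are 1-based and inclusive, a read counts if its alignment overlaps the region at all, and the alignment length on the reference is taken from the CIGAR string. The function names, region, and toy records are hypothetical.

```python
import io
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases covered by an alignment, from its CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def count_reads_in_region(handle, chrom, start, end):
    """Count mapped SAM records on `chrom` overlapping [start, end] (1-based, inclusive)."""
    count = 0
    for line in handle:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[2] != chrom:
            continue
        if int(fields[1]) & 0x4:
            continue  # FLAG bit 0x4: read is unmapped
        pos = int(fields[3])
        # If the CIGAR is '*' (unavailable), treat the read as covering one base.
        read_end = pos + (reference_span(fields[5]) or 1) - 1
        if pos <= end and read_end >= start:
            count += 1
    return count


# Toy records (hypothetical): two reads on Chr1 inside 100-200, one outside, one on Chr2.
toy_sam = (
    "@HD\tVN:1.0\n"
    "r1\t0\tChr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t0\tChr1\t198\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r3\t0\tChr1\t500\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r4\t0\tChr2\t150\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
)
print(count_reads_in_region(io.StringIO(toy_sam), "Chr1", 100, 200))  # prints 2
```

Once the BAM is sorted and indexed against the matching reference, samtools view -c with a region argument (for example Chr1:100-200) should give the same kind of count without any scripting.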