Entering edit mode
3.2 years ago
Umer
▴
130
Hey, I have a sorted bam file which i got by mapping with reference. I want to identify two things from this bam file.
- How many reads mapped to each gene/contig in my bam file.
- As this was paired-end data, i want to know how many genes/contigs just got a single read out of a pair mapped. like Forward read mapped to contig but reverse didn't.
I'm trying to find the answers using Pyhton library Pysam. My Bam file headers are
@HD VN:1.5 SO:coordinate @SQ SN:TRINITY_DN30590_c0_g2 LN:223 @SQ SN:TRINITY_DN30590_c0_g3 LN:331
and bam file looks like this.
HWI-1KL178:42:c39jdacxx:3:2308:16762:5814/1 16 TRINITY_DN30590_c0_g2 1 9 19S55M27S * 0 0AGGTTTAATTGCAGAAGCTGAGCCTTTCATCTCCATCTTGGGGGTTTCCAGTTCTGCTAGCTGTGCCTCCAAAACACTGGCCTTACTGTATAATTCATCAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFIIFIIIIIIIFIIIIFFFFFFFFFFBBB NM:i:0 ms:i:110 AS:i:110 nn:i:0 tp:A:P cm:i:9 s1:i:46 s2:i:0 de:f:0 rl:i:0
I'm trying this Python code:
import pysam
file = pysam.AlignmentFile("AT165.bam", "rb")
for read in file.fetch():
print(read)
file.close()
Any help will be great. Thank you.
That's not paired end data.
i'm pretty sure i used both R1 and R2 for mapping part.
https://broadinstitute.github.io/picard/explain-flags.html
your read has flag 16. So the flag 1 (paired-end-data) is not checked, it's not paired-end data.