5.3 years ago
swidler
I'm reading BAM files with pysam. I couldn't find a single file for the whole genome, so I'm iterating over per-chromosome files. When I fetch a region by chromosome, only the chromosome that the file actually covers returns results (i.e. if I fetch "21" from the chromosome 1 file, I shouldn't get any results). Is there a way to tell at fetch time that there's no data, so I can move on? Thanks.
I found count(region), which does more or less what I want (i.e. it tells me whether the region returns nothing), but running it on a non-empty region takes a while, and I'm hoping there's a better way.
The simplest way would be to name the files so that you can tell which chromosome each one contains.
Another way would be to use pysam's get_index_statistics(), which gives you the number of mapped, unmapped and total reads per contig.
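A minimal sketch of the get_index_statistics() approach follows. It assumes the BAM file has an index and that each returned entry exposes `contig` and `mapped` fields. The helper names `contigs_with_reads` and `fetch_if_present`, and the file name in the usage comment, are made up for illustration.

```python
def contigs_with_reads(bam):
    """Return the set of contig names that the index reports mapped reads for.

    `bam` is expected to be an open, indexed pysam.AlignmentFile.
    """
    return {stat.contig for stat in bam.get_index_statistics() if stat.mapped > 0}


def fetch_if_present(bam, contig):
    """Yield reads for `contig`, or nothing if the index reports no mapped reads."""
    if contig not in contigs_with_reads(bam):
        return  # nothing to fetch, so the caller can move on immediately
    yield from bam.fetch(contig)


# Usage (hypothetical file name; requires pysam and a .bai index next to the BAM):
#
#   import pysam
#   with pysam.AlignmentFile("chr1.bam", "rb") as bam:
#       for read in fetch_if_present(bam, "21"):
#           print(read.query_name)
```

If you query many contigs on the same file, calling contigs_with_reads() once and reusing the set avoids recomputing the statistics on every fetch.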