8.4 years ago
rubic
Hi,
I have a BAM file in which ~50% of the reads are ambiguously mapped (the library is from small RNAs). For each feature (e.g., a gene) in my GTF file, I'd like to add a count of 1 if a read from that BAM file maps uniquely to that feature, and 1/n if the read maps to n locations (that feature being one of them).
Is there any software, or preferably an R package, that can be used for that?
I know this gives a read count for every feature, and one can use either uniquely or ambiguously mapped reads but that still doesn't give exactly what I need.
Thanks a lot
I think counting non-uniquely mapped reads is still a challenge, because a read may have a very high number of possible copies in the genome. Most aligners at best report positions up to some maximum number per read (some only report a mapping quality of zero, so we don't know n).
You might want to look at this paper, http://www.ncbi.nlm.nih.gov/pubmed/25012247, where the authors discuss "fractional counts", which I think is probably what you want, though they were targeting longer repetitive elements.
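To make the 1/n idea concrete, here is a minimal Python sketch of fractional counting on toy data. It is not taken from the paper or from any package: the function name `fractional_counts`, the tuple layouts, and the coordinates are all made up for illustration. It assumes every reported alignment of a read is present in the input, so n is simply the number of alignments seen for that read name. With an aligner that caps the number of reported positions, n would be underestimated.

```python
from collections import defaultdict


def fractional_counts(alignments, features):
    """Toy fractional counting (hypothetical helper, not a library API).

    alignments: (read_name, chrom, start, end), one tuple per reported
                alignment, 0-based half-open coordinates.
    features:   (feature_id, chrom, start, end), same coordinate convention.
    """
    alignments = list(alignments)
    features = list(features)

    # n = number of reported locations for each read
    n_hits = defaultdict(int)
    for read_name, _chrom, _start, _end in alignments:
        n_hits[read_name] += 1

    counts = defaultdict(float)
    for read_name, chrom, start, end in alignments:
        weight = 1.0 / n_hits[read_name]  # 1 for unique reads, 1/n otherwise
        for feature_id, f_chrom, f_start, f_end in features:
            # Two half-open intervals overlap if each starts before the other ends
            if chrom == f_chrom and start < f_end and f_start < end:
                counts[feature_id] += weight
    return dict(counts)


# Made-up example: r1 is unique, r2 maps to 2 places, r3 maps to 3 places
alignments = [
    ("r1", "chr1", 100, 122),
    ("r2", "chr1", 105, 127),
    ("r2", "chr2", 500, 522),
    ("r3", "chr2", 510, 532),
    ("r3", "chr1", 110, 132),
    ("r3", "chr3", 10, 32),
]
features = [
    ("geneA", "chr1", 90, 200),
    ("geneB", "chr2", 450, 600),
]
counts = fractional_counts(alignments, features)
```

Here geneA gets 1 + 1/2 + 1/3 and geneB gets 1/2 + 1/3. Note that in this sketch an alignment overlapping two features adds its weight to both, and the nested loop is only sensible for small inputs. For real data you would want an interval index and a proper BAM reader.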
I ran into a similar issue some time ago and found Rcount. However, it does more than just give fractional counts, because it assigns multi-mappers based on unique local coverage.
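For intuition only, here is a rough Python sketch of what weighting multi-mappers by unique coverage can look like. This is not Rcount's actual implementation: the function name `coverage_weighted_counts` and the data are hypothetical, and it uses the per-feature count of unique reads as a crude stand-in for local coverage. When none of a read's candidate features has any unique reads, it falls back to an equal 1/n split, which is my own assumption.

```python
def coverage_weighted_counts(unique_counts, multi_reads):
    """Split each multi-mapped read among its candidate features in
    proportion to the unique-read counts of those features.

    unique_counts: dict feature_id -> number of uniquely mapped reads
    multi_reads:   list of candidate feature lists, one per multi-mapped read
    """
    counts = {feature: float(c) for feature, c in unique_counts.items()}
    for candidates in multi_reads:
        total = sum(unique_counts.get(feature, 0) for feature in candidates)
        for feature in candidates:
            if total > 0:
                weight = unique_counts.get(feature, 0) / total
            else:
                # No unique evidence anywhere: assumed fallback to equal split
                weight = 1.0 / len(candidates)
            counts[feature] = counts.get(feature, 0.0) + weight
    return counts


# Made-up numbers: geneA has 30 unique reads, geneB has 10, geneC has none
unique_counts = {"geneA": 30, "geneB": 10}
multi_reads = [
    ["geneA", "geneB"],
    ["geneA", "geneB"],
    ["geneB", "geneC"],
]
weighted = coverage_weighted_counts(unique_counts, multi_reads)
```

With these numbers each geneA/geneB read is split 0.75/0.25, and the geneB/geneC read goes entirely to geneB because geneC has no unique support. That is the main difference from plain 1/n counting, where geneC would receive 0.5.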