Entering edit mode
15 days ago
Juls
•
0
Hi everyone,
I want to extract fragment lengths mapping to genomic regions of interest and then get some custom stats.
I found some examples of code doing similar stuff -getting the fragments lengths- but I believe that fragments are being count twice as both read1 and read2 are being considered. Am I missing anything here?
- https://github.com/denniepatton/TritonNP/blob/main/scripts/GenerateFFTFeatures.py
- https://github.com/GavinHaLab/CRPCSubtypingPaper/blob/main/FragmentAnalysis/GenerateFragmentFeatures.py
segment_reads = bam.fetch(bed_tokens[0], start_pos, stop_pos)
for read in segment_reads:
fragment_length = read.template_length
if frag_range[0] <= np.abs(fragment_length) <= frag_range[1] and read.is_paired and read. \
mapping_quality >= map_q and not read.is_duplicate and not read.is_qcfail:
fragment_lengths.append(abs(read.template_length))
Maybe modifying above code to include read.is_read1 and read.is_properly_paired as condition to avoid counting both reads?
Thanks!!