Question

Getting sequences fragments using pysam

0

Entering edit mode

3 months ago

Juls • 0

Hi everyone,

I want to extract fragment lengths mapping to genomic regions of interest and then get some custom stats.

I found some examples of code doing similar stuff -getting the fragments lengths- but I believe that fragments are being count twice as both read1 and read2 are being considered. Am I missing anything here?

    segment_reads = bam.fetch(bed_tokens[0], start_pos, stop_pos)
    for read in segment_reads:
        fragment_length = read.template_length
        if frag_range[0] <= np.abs(fragment_length) <= frag_range[1] and read.is_paired and read. \
                mapping_quality >= map_q and not read.is_duplicate and not read.is_qcfail:
            fragment_lengths.append(abs(read.template_length))

Maybe modifying above code to include read.is_read1 and read.is_properly_paired as condition to avoid counting both reads?

Thanks!!

sequencing pysam reads wgs • 382 views

ADD COMMENT • link updated 3 months ago by i.sudbery 21k • written 3 months ago by Juls • 0

score 1 · Answer 1 · 2025-04-08

1

Entering edit mode

3 months ago

i.sudbery 21k

Yes, including a check for read.is_read1 should be sufficient.

ADD COMMENT • link 3 months ago by i.sudbery 21k