How to extract uniquely aligned paired-end reads obtained from freebayes using different parameter combinations of samtools?
4.4 years ago

I have 10x reads and need uniquely aligned reads for SNP calling. I am using samtools to extract them, but I am not satisfied with the results. It would be great if someone could share the exact way to filter these reads. I tried many combinations and checked many sources online but could not find a satisfactory answer. Here are the parameters I have already tried, each followed by the resulting flagstat summary.

     # Step 01: initial stats after alignment using freebayes
        172387804 + 0 in total (QC-passed reads + QC-failed reads)
        9715270 + 0 secondary
        0 + 0 supplementary
        22559490 + 0 duplicates
        165148332 + 0 mapped (95.80% : N/A)
        162672534 + 0 paired in sequencing
        81336267 + 0 read1
        81336267 + 0 read2
        109869698 + 0 properly paired (67.54% : N/A)
        152047167 + 0 with itself and mate mapped
        3385895 + 0 singletons (2.08% : N/A)
        25207784 + 0 with mate mapped to a different chr
        20769368 + 0 with mate mapped to a different chr (mapQ>=5)

     # Step 02: Picard tools was used to mark and remove duplicates
     146765957 + 0 in total (QC-passed reads + QC-failed reads)
     9715270 + 0 secondary
     0 + 0 supplementary
     0 + 0 duplicates
     139526485 + 0 mapped (95.07% : N/A)
     137050687 + 0 paired in sequencing
     68390482 + 0 read1
     68660205 + 0 read2
     91962549 + 0 properly paired (67.10% : N/A)
     128054904 + 0 with itself and mate mapped
     1756311 + 0 singletons (1.28% : N/A)
     21673676 + 0 with mate mapped to a different chr
     18001625 + 0 with mate mapped to a different chr (mapQ>=5)

        # 1st try (parameters 1), to get uniquely aligned reads: -h -q 20 -F 256 test.bam
        107311384 + 0 in total (QC-passed reads + QC-failed reads)
        0 + 0 secondary
        0 + 0 supplementary
        0 + 0 duplicates
        107311384 + 0 mapped (100.00% : N/A)
        107311384 + 0 paired in sequencing
        53437167 + 0 read1
        53874217 + 0 read2
        80455227 + 0 properly paired (74.97% : N/A)
        105892555 + 0 with itself and mate mapped
        1418829 + 0 singletons (1.32% : N/A)
        15343533 + 0 with mate mapped to a different chr
        15343533 + 0 with mate mapped to a different chr (mapQ>=5)

        # 2nd try (parameters 2): -q 20 -f 0x02
        82937879 + 0 in total (QC-passed reads + QC-failed reads)
        2482652 + 0 secondary
        0 + 0 supplementary
        0 + 0 duplicates
        82937879 + 0 mapped (100.00% : N/A)
        80455227 + 0 paired in sequencing
        40153047 + 0 read1
        40302180 + 0 read2
        80455227 + 0 properly paired (100.00% : N/A)
        80455226 + 0 with itself and mate mapped
        1 + 0 singletons (0.00% : N/A)
        0 + 0 with mate mapped to a different chr
        0 + 0 with mate mapped to a different chr (mapQ>=5)

        # 3rd try: -f 0x02 -bq 1 (properly paired-end reads)
        93036316 + 0 in total (QC-passed reads + QC-failed reads)
        2817262 + 0 secondary
        0 + 0 supplementary
        0 + 0 duplicates
        93036316 + 0 mapped (100.00% : N/A)
        90219054 + 0 paired in sequencing
        45085486 + 0 read1
        45133568 + 0 read2
        90219054 + 0 properly paired (100.00% : N/A)
        90219033 + 0 with itself and mate mapped
        21 + 0 singletons (0.00% : N/A)
        0 + 0 with mate mapped to a different chr
        0 + 0 with mate mapped to a different chr (mapQ>=5)

        # 4th try: -F 3852
        105892555 + 0 in total (QC-passed reads + QC-failed reads)
        0 + 0 secondary
        0 + 0 supplementary
        0 + 0 duplicates
        105892555 + 0 mapped (100.00% : N/A)
        105892555 + 0 paired in sequencing
        52606133 + 0 read1
        53286422 + 0 read2
        80455226 + 0 properly paired (75.98% : N/A)
        105892555 + 0 with itself and mate mapped
        0 + 0 singletons (0.00% : N/A)
        15343533 + 0 with mate mapped to a different chr
        15343533 + 0 with mate mapped to a different chr (mapQ>=5)

        # 5th try: -q 20 -F 268
        105892555 + 0 in total (QC-passed reads + QC-failed reads)
        0 + 0 secondary
        0 + 0 supplementary
        0 + 0 duplicates
        105892555 + 0 mapped (100.00% : N/A)
        105892555 + 0 paired in sequencing
        52606133 + 0 read1
        53286422 + 0 read2
        80455226 + 0 properly paired (75.98% : N/A)
        105892555 + 0 with itself and mate mapped
        0 + 0 singletons (0.00% : N/A)
        15343533 + 0 with mate mapped to a different chr
        15343533 + 0 with mate mapped to a different chr (mapQ>=5)
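
For reference, the numeric values passed to -f and -F are bit masks over the SAM FLAG field. The following is a minimal sketch, assuming bash and the bit names from the SAM specification; the helper name `decode_flag` is hypothetical and samtools itself is not required. It spells out which read categories each value used in these tries selects.

```shell
#!/usr/bin/env bash
# Decode a SAM FLAG value (decimal or 0x-prefixed hex) into the bit names
# defined in the SAM specification. Uses bash arithmetic only.
# decode_flag is a hypothetical helper name, not part of samtools.
decode_flag() {
    local value=$(( $1 ))
    local names=(PAIRED PROPER_PAIR UNMAP MUNMAP REVERSE MREVERSE
                 READ1 READ2 SECONDARY QCFAIL DUP SUPPLEMENTARY)
    local out=() i
    for i in "${!names[@]}"; do
        # Bit i of the value corresponds to names[i].
        if (( value & (1 << i) )); then
            out+=("${names[i]}")
        fi
    done
    local IFS=,
    echo "${out[*]}"
}

# The flag values used in the tries above.
for flag in 0x02 256 268 3852; do
    printf '%s\t%s\n' "$flag" "$(decode_flag "$flag")"
done
```

Read this way, -F 268 excludes unmapped reads, reads with an unmapped mate, and secondary alignments, while -F 3852 additionally excludes QC-failed, duplicate, and supplementary records. -F 256 only excludes secondary alignments, which matches the singletons that remain in that output.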

        # 6th try: -q 20 -F 0x100
        107311384 + 0 in total (QC-passed reads + QC-failed reads)
        0 + 0 secondary
        0 + 0 supplementary
        0 + 0 duplicates
        107311384 + 0 mapped (100.00% : N/A)
        107311384 + 0 paired in sequencing
        53437167 + 0 read1
        53874217 + 0 read2
        80455227 + 0 properly paired (74.97% : N/A)
        105892555 + 0 with itself and mate mapped
        1418829 + 0 singletons (1.32% : N/A)
        15343533 + 0 with mate mapped to a different chr
        15343533 + 0 with mate mapped to a different chr (mapQ>=5)
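
To put the read loss into numbers, here is a small sketch, assuming bash and awk, that expresses the "in total" count of each try as a percentage of the 146765957 reads left after duplicate removal in step 02. The helper name `retained_pct` is hypothetical, and the counts are copied from the outputs above.

```shell
#!/usr/bin/env bash
# Percentage of reads kept by each filter, relative to the reads remaining
# after duplicate removal (step 02). retained_pct is a hypothetical helper.
retained_pct() {
    # $1 = reads after filtering, $2 = reads before filtering
    awk -v kept="$1" -v total="$2" 'BEGIN { printf "%.1f", 100 * kept / total }'
}

dedup_total=146765957

# Parameter sets and their "in total" counts from the flagstat outputs.
labels=("-q 20 -F 256" "-q 20 -f 0x02" "-f 0x02 -bq 1" "-F 3852" "-q 20 -F 268")
totals=(107311384 82937879 93036316 105892555 105892555)

for i in "${!labels[@]}"; do
    printf '%-16s %10d  %s%%\n' "${labels[i]}" "${totals[i]}" \
        "$(retained_pct "${totals[i]}" "$dedup_total")"
done
```

By this measure every parameter set keeps between about 56.5% and 73.1% of the deduplicated reads. Note that the 2nd and 3rd totals still include secondary alignments.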

Can anyone suggest the right parameters out of these tries?
I am losing a lot of reads. Is there a better approach or parameter set to solve this issue?
Thanks
NGS read filtering Samtools freebayes • 887 views