pysamstats produces 'inaccurate' variant frequencies?
4.8 years ago
Lhl ▴ 760

Hi there,

Have you ever compared variant frequencies computed with pysamstats against those computed with a simple Perl regex? I recently found a discrepancy between the two.

In the example below, the Perl regex counts 128 deletions in the pileup string, but pysamstats reports 5. Can anyone explain why? Does pysamstats apply some filtering?

Thanks a lot.

+++++++++++++++++++++++++++ ===============Pileup String ============= +++++++++++++++++++++++++++

...-1G....-1G.........+1G...........................-1G........-1G...........................-1G..............-1G.-1G.....-1G........-1G................-1G............C..C.............-1G...........-1G..................................-1G..................................-1G............-1G..-1G...................-1G.............-1G............................................................................*.......-1G.-1G.....-1G..............................-1G........................+1G.-1G.-1G....................................-1G......-1G..-1G.......+1G...........-1G.+1G......-1G....................+1G.........................-1G...........-1G..........-1G....................-1G........................-1G...........................................-1G..............-1G...-1G............................-1G.......-1G........-1G......................................................-1G.............-1G...-1G.-1G..................-1G......-1G.-1G...................+1G.................................-1G...-1G...................................................-1G......................+1G.................-1G...................-1G.................-1G.............-1G.........-1G............-1G........-1G....................................-1G..............-1G..-1G..-1G......+1G.............-1G......-1G..........................-1G..-1G......-1G.....-1G.......-1G................-1G....-1G..-1G................-1G.........................+1G....................................+1G...................+1G................-1G........-1G....................................-1G.-1G.....-1G......-1G.....-1G........-1G............................C..............-1G.......-1G......-1G..........-1G...-1G..............+1G....+1G...........................-1G.......-1G......-1G..........+1G.-1G.............-1G.-1G........-1G....-1G......................-1G.....-1G.....-1G.-1G..-1G.-1G...........-1G................-1G.....+1G...............-1G...............+1G....-1G.......-1G................
....+1G........-1G...........-1G..+1G..............................+1G...........................-1G.................-1G..-1G.........+1G...............-1G...-1G.........-1G........................................-1G....-1G..............................................-1G...+1G...........................-1G...-1G..........-1G..+1G.-1G.-1G...-1G...................................................-1G............-1G.....-1G........................-1G........................................+1G............................-1G.-1G......-1G...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................^].^].^".^].^].^].^3.^].^'.^2.^].^].^].^].^].^].^].^].^].^D.^].^].^7.^".^].^].^;.^].^5.^N.^].^].^].^].^].^].^].^C.^].^].^].^].^].^].^].^].^].^].^].^
].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^V.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^6.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^A.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^=.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^R.^].^].^].^].^C.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^>.^N.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^Y.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^H.^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].^].

SNP • 929 views

Please add representative code. Anecdotal descriptions are rarely (never) sufficient to reproduce things.


Hi ATpoint, thanks for reminding me to post the commands.

Copy the pileup string into a file, String.txt, then run the Perl regex:

perl -ne 'chomp; @a = $_ =~ /\-\d+[AGCTN]/g; @b =  $_ =~ /\+\d+[AGCTN]/g; print scalar (@a),"\t",scalar (@b),"\n";' String.txt
128     23
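
For comparison, here is a minimal Python sketch of the same count that tokenizes the pileup string instead of pattern-matching it. The function name `count_pileup_events` and the result keys are made up for illustration. It skips the `^` read-start marker together with its mapping-quality character, consumes the full indel sequence after `-N` or `+N`, accepts lowercase (reverse-strand) bases, and tallies `*` placeholders separately from the `-1G` style markers.

```python
import re

# Sign and length of an indel marker such as "-1G" or "+2AT".
_INDEL = re.compile(r"([+-])(\d+)")


def count_pileup_events(bases: str) -> dict:
    """Tally indel markers and '*' placeholders in one pileup base string."""
    counts = {"deletion_starts": 0, "insertions": 0, "deleted_bases": 0}
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # '^' is followed by one mapping-quality character; skip both so
            # a quality character of '+' or '-' is not mistaken for an indel.
            i += 2
            continue
        match = _INDEL.match(bases, i) if ch in "+-" else None
        if match:
            key = "insertions" if match.group(1) == "+" else "deletion_starts"
            counts[key] += 1
            # Jump over the digits and the inserted/deleted bases themselves.
            i = match.end() + int(match.group(2))
            continue
        if ch == "*":
            counts["deleted_bases"] += 1
        i += 1
    return counts


if __name__ == "__main__":
    with open("String.txt") as handle:
        print(count_pileup_events(handle.read().replace("\n", "")))
```

Note that the Perl pattern only matches uppercase bases, so it would miss indels on reverse-strand reads; that does not matter for this site, where reads_rev is 0.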

pysamstats command:

pysamstats --type variation_strand --fasta ref.fasta in.bam

chrom	pos	ref	reads_all	reads_fwd	reads_rev	reads_pp	reads_pp_fwd	reads_pp_rev	matches	matches_fwd	matches_rev	matches_pp	matches_pp_fwd	matches_pp_rev	mismatches	mismatches_fwd	mismatches_rev	mismatches_pp	mismatches_pp_fwd	mismatches_pp_rev	deletions	deletions_fwd	deletions_rev	deletions_pp	deletions_pp_fwd	deletions_pp_rev	insertions	insertions_fwd	insertions_rev	insertions_pp	insertions_pp_fwd	insertions_pp_rev	A	A_fwd	A_rev	A_pp	A_pp_fwd	A_pp_rev	C	C_fwd	C_rev	C_pp	C_pp_fwd	C_pp_rev	T	T_fwd	T_rev	T_pp	T_pp_fwd	T_pp_rev	G	G_fwd	G_rev	G_pp	G_pp_fwd	G_pp_rev	N	N_fwd	N_rev	N_pp	N_pp_fwd	N_pp_rev
contig_2244	40	T	3789	3789	0	0	0	0	3781	3781	0	0	0	0	3	3	0	0	0	0	5	5	0	0	0	0	23	23	0	0	0	0	0	0	0	0	0	0	3	3	0	0	0	0	3781	3781	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0

So the number of deletions is 5 and the number of insertions is 23.
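
One way to cross-check both numbers is to walk the pileup column directly with pysam. This is only a sketch, assuming pysam is installed, in.bam is indexed, and the pos column above is 1-based. The helper names `tally_column` and `tally_position` are hypothetical. It counts reads with `is_del` set (the read has a deleted base at this column, shown as `*` in samtools pileup) separately from reads whose `indel` is negative (a deletion starts right after this column, shown as `-1G`). Which of the two pysamstats reports in its deletions column is not confirmed in this thread, so treat the comparison as a diagnostic, not an explanation.

```python
def tally_column(pileup_reads):
    """Tally deletion/insertion signals over the reads of one pileup column.

    Each item needs `is_del` and `indel` attributes, as pysam.PileupRead has.
    """
    counts = {"is_del": 0, "deletion_next": 0, "insertion_next": 0}
    for read in pileup_reads:
        if read.is_del:
            counts["is_del"] += 1          # deleted base at this column ('*')
        if read.indel < 0:
            counts["deletion_next"] += 1   # deletion follows this column ('-N...')
        elif read.indel > 0:
            counts["insertion_next"] += 1  # insertion follows this column ('+N...')
    return counts


def tally_position(bam_path, contig, pos_1based):
    """Open an indexed BAM and tally a single column (hypothetical helper)."""
    import pysam  # imported here so tally_column is usable without pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        columns = bam.pileup(
            contig,
            pos_1based - 1,          # pysam coordinates are 0-based, half-open
            pos_1based,
            truncate=True,           # only report the requested column
            max_depth=1_000_000,     # assumption: high enough for this depth
            min_base_quality=0,      # do not drop reads by base quality
        )
        for column in columns:
            return tally_column(column.pileups)
    return tally_column([])


if __name__ == "__main__":
    print(tally_position("in.bam", "contig_2244", 40))
```

Running the same tally for position 41 as well would show whether the 128 deletions announced by `-1G` at position 40 are being attributed to the following base.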
