N datasets using DiscoSNP++
9.7 years ago
ayang • 0

Hello,

I am looking into DiscoSNP++ and would love to use it for a set of 9 samples. I just had a few questions before using this tool:

  1. Does DiscoSNP++ support PE reads now? I am just wondering, because I have PE reads and would love to use the paired-ends to remove duplicate reads. I know that DiscoSNP did not support PE reads.
  2. If I run all 9 samples together, will I be able to know which sample has which allele from the output file?
  3. Is DiscoSNP++ able to output VCF files, or is there a way to do that?

Thanks!

Ashley

discosnp reference-free-snp-call • 2.3k views
9.7 years ago

Hello Ashley,

Here are some answers for your questions:

  1. discoSnp++ still does not support PE reads. To be more precise: if you have 9 read sets, I guess that in practice you have 18 files. Currently you can run discoSnp++ on the 18 files together. You will then have to sum the coverage values of each pair of PE read sets to retrieve the result you are looking for. We will soon propose a release that does this automatically.

    I'm not sure I understand how and why you'd like to remove duplicate reads.

  2. Yes (see doc). Moreover, the current version (2.0.6) provides genotype results.

  3. The next release (probably next week or the one after) will contain a VCF output. You will be able to provide a reference genome (closely related or not) in order to localize the predicted variants. If you don't have a reference genome, the VCF will have the same format, but '.' will replace the mapping information.
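
The coverage summing described in point 1 could be sketched as follows. This is only a minimal illustration, assuming you have already extracted, for one variant, a list of per-file coverage values in which the two mate files of each sample are adjacent (sample1_R1, sample1_R2, sample2_R1, ...). The function name and the input layout are hypothetical and are not part of discoSnp++.

```python
def sum_mate_coverages(per_file_coverage):
    """Collapse per-file coverages into per-sample coverages.

    Assumption: the list holds one value per read file, ordered so that
    the two mate files of each sample are next to each other.
    """
    if len(per_file_coverage) % 2 != 0:
        raise ValueError("expected an even number of files (two per sample)")
    # Add the coverage of each R1 file to the coverage of its R2 mate.
    return [
        per_file_coverage[i] + per_file_coverage[i + 1]
        for i in range(0, len(per_file_coverage), 2)
    ]
```

With 18 files this returns 9 values, one per sample. If your files were passed in a different order, reorder the list first so that mates are adjacent.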

Best,
Pierre

9.7 years ago

Hello Ashley

Good news: the new discoSnp++ version should answer most of your questions:

  1. discoSnp++ now handles pairs of read sets (see this thread).
  2. discoSnp++ genotypes per allele and per read set. Results are stored both in the fasta file and in the output VCF (see next point).
  3. discoSnp++ now automatically generates a VCF.
    • If no reference genome is provided, the chromosome and position fields are empty ('.').
    • If a reference genome is provided, a mapping is done using bwa, and discoSnp++ generates a VCF from the mapping results.
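
Since the chromosome and position fields are '.' when a variant has no mapping information, downstream scripts may want to separate such records from mapped ones. Here is a minimal sketch that relies only on the standard VCF layout (tab-separated, CHROM in column 1, POS in column 2, header lines starting with '#'). The function name is hypothetical, and nothing is assumed about the other fields that discoSnp++ writes.

```python
def split_by_mapping(vcf_lines):
    """Split VCF records into (mapped, unmapped) lists of field lists.

    A record counts as unmapped when CHROM or POS is '.'.
    """
    mapped, unmapped = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and VCF header/meta lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], fields[1]
        if chrom == "." or pos == ".":
            unmapped.append(fields)
        else:
            mapped.append(fields)
    return mapped, unmapped
```

It can be called on an open file handle, for example `split_by_mapping(open("variants.vcf"))`, where the file name is a placeholder.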

Best,
Pierre
