6.5 years ago
de.mecquenem.ninon
Hello! I have contig data in FASTA format that I downloaded from the PATRIC database, and I would like to find SNPs/indels in those sequences. I am quite new to this field and don't really know which tool or protocol to use, so if you have any advice... :)
Thank you for your help!
My apologies, this is not an answer, but I'm curious and would love to know the available solutions as well. The best a quick search yielded is Heng Li's fermikit. Maybe you can skip the assembly stage and feed it the assemblies directly?
Fermikit works very well. It is still not made for contig/supercontig data, but it looks like a serious tool and it gives me output in proper VCF format! Thank you for your help!
For now I am trying BWA-MEM (to get BAM files) followed by GATK (HaplotypeCaller). But it does not work very well (I get error messages while running HaplotypeCaller), and I am not even sure it is the best protocol in this case. I'll have a look at fermikit, thank you very much!
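For reference, the BWA-MEM then HaplotypeCaller route described above could be laid out roughly as below. This is only a sketch: the actual error message is not quoted in the thread, so the extra steps are a guess at common prerequisites (GATK expects a `.fai` index and a `.dict` sequence dictionary for the reference, and a read group with a sample name in the BAM). The function name `build_pipeline`, the file names, and the sample name are hypothetical. It only builds the command lines and does not run anything, and it does not settle whether this protocol suits contig input.

```python
def build_pipeline(reference, contigs, sample="sample1"):
    """Return the argv list for each step, in order; nothing is executed."""
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # bwa expects a literal backslash-t between read-group fields.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    return [
        # Index the reference for BWA, samtools and GATK.
        ["bwa", "index", reference],
        ["samtools", "faidx", reference],
        ["gatk", "CreateSequenceDictionary", "-R", reference],
        # Align; SAM goes to stdout and is meant to be piped into the sort.
        ["bwa", "mem", "-R", read_group, reference, contigs],
        ["samtools", "sort", "-o", bam, "-"],
        ["samtools", "index", bam],
        # Call variants on the sorted, indexed BAM.
        ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", vcf],
    ]


# Hypothetical file names, for illustration only.
for cmd in build_pipeline("reference.fasta", "contigs.fasta"):
    print(" ".join(cmd))
```

The `bwa mem` and `samtools sort` entries are intended to be connected with a pipe when actually run.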
Hello,
I don't know whether the restriction mentioned in this post still exists. So be careful.
fin swimmer
Hi de.mecquenem.ninon,
This reply is better suited as a comment on Carambakaracho's answer. Answers should ONLY be used to respond to the original question at the top of this page. I moved your post to a comment. But as you see this is not perfect.
fin swimmer
Deeply sorry, it was my first post and I hadn't noticed the different ways of answering... Thank you for your answer!
Variant calling can be done without much effort using FreeBayes, although you need to filter out false-positive calls (the vcffilter step). However, as a start, it is best to go through the Best Practices offered by GATK, since pre-processing makes a huge difference.
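To illustrate what the filtering step does, here is a minimal pure-Python sketch of a quality filter, roughly the effect of running something like `vcffilter -f "QUAL > 20"` on FreeBayes output. The function name `filter_vcf_lines`, the threshold of 20, and the example records are all assumptions made up for illustration; a suitable cutoff depends on the data, and a real pipeline would normally use vcffilter itself.

```python
def filter_vcf_lines(lines, min_qual=20.0):
    """Keep header lines and records whose QUAL is above min_qual."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Meta-information and column header lines pass through.
            kept.append(line)
            continue
        qual = line.split("\t")[5]  # QUAL is the sixth VCF column
        # Records with a missing QUAL (".") are dropped.
        if qual != "." and float(qual) > min_qual:
            kept.append(line)
    return kept


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig1\t101\t.\tA\tG\t55.3\t.\tDP=30",
    "contig1\t250\t.\tC\tT\t3.1\t.\tDP=4",
    "contig2\t17\t.\tGA\tG\t.\t.\tDP=12",
]
for record in filter_vcf_lines(example):
    print(record)
```

With these example records, only the two header lines and the first variant survive the filter.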