Variant calling in E. coli
4.7 years ago
lima80in • 0

Hi, I'm learning variant calling and doing an exercise on E. coli. I'm an amateur and need help. I started with 6 FASTQ files for E. coli from a study in ENA, and I also downloaded the genome FASTA file for that particular strain. I did the following, in this order: QC on the FASTQ runs, Trimmomatic, QC on the trimmed data, bwa index on the FASTA file, samtools faidx, samtools dict, alignment of the trimmed data to the reference with bwa mem (which gives a SAM file), then convert, sort, and index with samtools. Next I used GATK to mark duplicates (MarkDuplicates) and GATK to add or replace read groups (AddOrReplaceReadGroups).

Next I'm supposed to run GATK BaseRecalibrator, for which I need a known-sites reference of polymorphisms in E. coli. This is a VCF file.

How am I supposed to get this file, or otherwise get past this step?

The strain I'm looking at is E. coli REL606.

SNP Assembly • 1.4k views

With bacteria you can use a simple variant calling procedure, without base recalibration. As an alternative to GATK, use callvariants.sh from the BBMap suite.
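As a rough illustration of that suggestion, here is a minimal Python sketch that only builds a callvariants.sh command line and prints it; nothing is executed. The file names are hypothetical placeholders, and the `in=`, `ref=`, `vcf=` and `ploidy=` parameters are written from memory of the BBMap usage text, so check them against the script's built-in help before relying on them. It also assumes your BBMap install can read the sorted BAM directly; if not, pass the SAM file instead.

```python
import shlex


def callvariants_command(alignment, reference, vcf_out, ploidy=1):
    """Build a BBMap callvariants.sh command line (nothing is run here).

    ploidy defaults to 1 because E. coli is haploid.
    All file names passed in are placeholders chosen by the caller.
    """
    return [
        "callvariants.sh",
        f"in={alignment}",
        f"ref={reference}",
        f"vcf={vcf_out}",
        f"ploidy={ploidy}",
    ]


# Hypothetical file names for the REL606 exercise.
cmd = callvariants_command("sample.sorted.bam", "rel606.fasta", "sample.vcf")
print(shlex.join(cmd))
```

To actually run it, you could pass `cmd` to `subprocess.run(cmd, check=True)` once BBMap is on your PATH.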


There are lots of pipelines and software for this, but as @genomax says, GATK is overly complex and not well suited to bacterial data.

Your pipeline sounds good so far. From here I'd recommend freebayes or samtools SNP calling, or perhaps snippy as a complete pipeline.
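For orientation, a hedged sketch of what those three options could look like, again as Python that only assembles the command lines. The file names are hypothetical, the samtools route is shown in its current form (bcftools mpileup piped into bcftools call), and the haploid settings (`-p 1`, `--ploidy 1`) are my assumption for E. coli rather than something specified above. Check each tool's help output before running anything.

```python
import shlex


def freebayes_command(reference, bam, vcf_out):
    """freebayes on a sorted, indexed BAM; -p 1 sets haploid calling."""
    return ["freebayes", "-f", reference, "-p", "1", "-v", vcf_out, bam]


def bcftools_commands(reference, bam, vcf_out):
    """Two stages meant to be piped: mpileup output feeds into call."""
    mpileup = ["bcftools", "mpileup", "-f", reference, bam]
    # -m: multiallelic caller, -v: output variant sites only
    call = ["bcftools", "call", "-m", "-v", "--ploidy", "1", "-o", vcf_out]
    return mpileup, call


def snippy_command(reference, reads_1, reads_2, outdir):
    """snippy starts from the reads and does its own mapping and calling."""
    return ["snippy", "--outdir", outdir, "--ref", reference,
            "--R1", reads_1, "--R2", reads_2]


# Hypothetical file names for the REL606 exercise.
print(shlex.join(freebayes_command("rel606.fasta", "sample.sorted.bam", "fb.vcf")))
mpileup, call = bcftools_commands("rel606.fasta", "sample.sorted.bam", "bcf.vcf")
print(shlex.join(mpileup) + " | " + shlex.join(call))
print(shlex.join(snippy_command("rel606.fasta", "s_1.fastq.gz", "s_2.fastq.gz", "snippy_out")))
```

Note that freebayes and bcftools pick up from the sorted BAM you already have, while snippy takes the FASTQ files and the reference, so it would replace the earlier alignment steps rather than follow them.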


Thank you, genomax! I'll explore BBMap. The reason I was using GATK is that it was taught in class for human samples, and I was trying the same approach with E. coli.
