Sometimes we have many strains in the same FASTQ file under the same ID.
How can I detect whether there is a mix of strains in my FASTQ or VCF file?
Thank you
Do you mean you have a bacterial isolate which you sequenced and now think is actually contaminated, i.e. a mix of two bacteria?
If you have long reads (ONT, PacBio) you might be able to find haplotypes.
Otherwise (short reads), just call all SNVs as usual. You shouldn't have any heterozygous SNVs if this is haploid, right? So look at the "hets" in detail. It could be multiple clonal lineages (a neighbouring microbiology group sees this a lot). Sometimes you'll miss minor clones: we once missed a variant supported by 3 reads out of 38 in a lasR gene, which turned out to be a sublineage with a different phenotype. 3/38 does not look like a good "heterozygote" to an SNV caller.
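To look at the "hets" without relying on the caller's genotype, you can compute the minor-allele read fraction per site yourself. Below is a rough sketch, not a validated method: it assumes a single-sample, uncompressed VCF whose FORMAT column carries an `AD` (allelic depths) field, and the function name and the `min_depth` / `min_fraction` thresholds are my own arbitrary choices that you would need to tune.

```python
def minor_allele_fractions(vcf_lines, min_depth=10, min_fraction=0.05):
    """Yield (chrom, pos, depth, minor_fraction) for sites where reads
    support more than one allele. Thresholds are illustrative only."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample column
        keys = fields[8].split(":")
        values = fields[9].split(":")
        if "AD" not in keys or keys.index("AD") >= len(values):
            continue  # no allelic depths recorded for this site
        try:
            depths = [int(x) for x in values[keys.index("AD")].split(",")]
        except ValueError:
            continue  # missing value such as "."
        total = sum(depths)
        if total < min_depth:
            continue  # too shallow to say anything
        # Reads that do not support the most common allele at this site
        fraction = (total - max(depths)) / total
        if fraction >= min_fraction:
            yield (fields[0], int(fields[1]), total, round(fraction, 3))


# Made-up records; the second one mimics the 3-out-of-38 case above.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/0:40,0",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:AD\t0/0:35,3",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT:AD\t0/1:2,1",
]
for hit in minor_allele_fractions(example):
    print(hit)
```

Many sites with a similar minor fraction spread across the genome would point to a second strain, while a handful of isolated sites is more likely noise or mapping artefacts. Note that a low-frequency variant only shows up here if the caller emitted the site at all.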
You can use the strain-profiling tool PStrain. Map the FASTQ reads to a reference with BWA and call SNPs, then run single_species.py from PStrain: it profiles the sample under different strain numbers and chooses a reasonable one. That way you can know whether there is a mix of strains. Hope it helps.
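The mapping and SNP-calling part of the steps above could look roughly like the sketch below. It only builds and prints the shell commands rather than running them. The file names and the `build_commands` helper are hypothetical, the use of samtools and bcftools for sorting and calling is my assumption (the answer only names BWA), and the PStrain step is left out on purpose because I don't want to guess its arguments; check its README for those.

```python
import shlex


def build_commands(reference, reads, prefix):
    """Return shell command strings for mapping and SNP calling.
    Nothing is executed here; paths are quoted for safe pasting."""
    ref = shlex.quote(reference)
    fq = " ".join(shlex.quote(r) for r in reads)
    bam = shlex.quote(f"{prefix}.sorted.bam")
    vcf = shlex.quote(f"{prefix}.vcf")
    return [
        # Index the reference once
        f"bwa index {ref}",
        # Map reads and sort the alignments
        f"bwa mem {ref} {fq} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # Pile up with per-allele depths (AD), then call variant sites
        f"bcftools mpileup -f {ref} -a AD {bam} | bcftools call -m -v -o {vcf}",
    ]


# Hypothetical file names, for illustration only
commands = build_commands("reference.fa", ["sample_R1.fastq", "sample_R2.fastq"], "sample")
for cmd in commands:
    print(cmd)
```

Calling with the default diploid model is deliberate here, so that mixed sites can show up as "hets" instead of being forced to a single allele.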
What do you mean? You want to identify species or strains from metagenomic samples? Or is it a different problem? Could you please provide (a lot) more details?
If the strains are of the same species there is little chance you are going to be able to tell the reads belonging to them apart.