Variants in untargeted genes identified after targeted exome sequencing analysis
3.2 years ago
vinayjrao ▴ 250

Hi,

I recently analyzed some targeted exome sequencing samples provided by our collaborators, for which I do not have the target gene list. After the analysis, I was informed that some of the genes in which variants were identified are not in the target gene list. Has anyone faced this issue, or does anyone have an idea why I might be observing these variants?

If it helps, I found duplicate entries (identical name and sequence) in some of the raw FASTQ files, so I removed them using seqkit rmdup. Since I don't know whether the variants in untargeted genes come exclusively from these files, I can't tell whether removing the duplicate entries is affecting the alignment and/or variant calling.
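To make the duplicate-entry check concrete, here is a minimal sketch of how such entries could be listed per file before removing them. It assumes plain four-line FASTQ records; the function names `open_fastq` and `find_duplicate_reads` are hypothetical helpers, not part of seqkit.

```python
import gzip
from collections import defaultdict


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_duplicate_reads(path):
    """Return {read_name: [sequences]} for names seen more than once.

    Assumes strict four-line FASTQ records (header, sequence, '+', quality).
    The read name is the header up to the first whitespace, without '@'.
    """
    seqs_by_name = defaultdict(list)
    with open_fastq(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # skip stray blank lines between records
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            name = header[1:].split()[0]
            seqs_by_name[name].append(seq)
    return {name: seqs for name, seqs in seqs_by_name.items() if len(seqs) > 1}
```

Running this on each raw file gives the list of affected files and read names, which could then be compared with the samples that carry variants in untargeted genes.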

The pipeline used was:

1. FastQC
2. Trim Galore (keeping only paired reads, not singletons)
3. seqkit rmdup -n (to remove duplicate entries based on name)
4. bwa mem with hg38 as the reference
5. Picard to sort the SAM file, then mark and remove PCR duplicates
6. Variant calling with GATK4
7. BQSR with GATK4
8. Applying BQSR to the BAM file with GATK4
9. Variant calling with GATK4 on the recalibrated BAM file from the previous step
10. Annotation with wANNOVAR using hg38 as the reference
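Nothing in this pipeline restricts variant calling to the capture targets, so any off-target reads that align can produce calls. If the collaborators can share the target regions as a BED file (an assumption, since I don't have it), a rough sketch for flagging calls as on- or off-target by position could look like this. `load_bed` and `is_on_target` are hypothetical helper names, and `padding` is an illustrative parameter for allowing flanking bases.

```python
from bisect import bisect_right
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into merged, sorted intervals per chromosome.

    Returns {chrom: (starts, ends)} with 0-based, half-open coordinates.
    """
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))

    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def is_on_target(targets, chrom, pos, padding=0):
    """True if a 1-based VCF position lies within a target (plus padding)."""
    if chrom not in targets:
        return False
    starts, ends = targets[chrom]
    zero_based = pos - 1  # VCF POS is 1-based, BED is 0-based
    # Last interval whose padded start is at or before the position.
    i = bisect_right(starts, zero_based + padding) - 1
    if i < 0:
        return False
    return zero_based < ends[i] + padding
```

For example, `is_on_target(load_bed(open("targets.bed")), "chr1", 12345, padding=100)` would report whether a call at chr1:12345 falls within 100 bp of a target; the file name and position here are placeholders.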

Thanks in advance.

snp exome • 521 views