2.3 years ago
Nemo
Hi,
I have a dataset from a group of people diagnosed with different types of COVID, plus some normal samples. I aligned the reads to the Wuhan reference genome with hisat2, then called variants separately with GATK HC and LoFreq. However, when I compare the final calls against the known Delta and Omicron variants, some normal samples carry a high number of these variants.
Why does this happen, and how can I avoid getting COVID variants in normal samples?
Was the sequencing done specifically only to look for COVID? When you say you have "normal" samples do you simply mean that these are people who did not show any COVID symptoms at the time of sample collection? Is every "normal" sample showing this phenomenon? If so there may be some kind of a mix-up in the lab?
Thank you @genomax. Yes, I guess so. The normal cases are those without any COVID symptoms. Is there any way to control this in the alignment phase, such as a specific flag or parameter?
I need to classify the patients based on my results. With this in mind, what do you suggest?
One should see more or less no COVID reads in "normal" samples, let alone both variants, correct? I don't know how long the virus sticks around (does it?) after a person gets over COVID, in case these "normals" are people who had a past infection and have since recovered.
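One quick way to see how many reads in each sample actually map to the viral reference is to look at the output of `samtools idxstats`, which prints one tab-separated line per reference sequence (name, length, mapped reads, unmapped reads). The sketch below only parses that text; the function name and the example numbers are made up for illustration and are not from the original poster's data.

```python
def mapped_read_counts(idxstats_text):
    """Summarise `samtools idxstats` output.

    Each line is: reference name, sequence length, mapped reads,
    unmapped reads (tab-separated). The final '*' line holds reads
    that are unmapped and have no coordinates.
    Returns (mapped, total, mapped_fraction).
    """
    mapped = 0
    unmapped = 0
    for line in idxstats_text.strip().splitlines():
        _name, _length, n_mapped, n_unmapped = line.split("\t")
        mapped += int(n_mapped)
        unmapped += int(n_unmapped)
    total = mapped + unmapped
    fraction = mapped / total if total else 0.0
    return mapped, total, fraction


# Hypothetical idxstats output for one "normal" sample.
example = "NC_045512.2\t29903\t15\t0\n*\t0\t0\t99985\n"
mapped, total, fraction = mapped_read_counts(example)
print(f"{mapped} of {total} reads mapped ({fraction:.4%})")
```

A "normal" sample with only a handful of mapped reads tells a very different story from one with thousands, so it is worth tabulating this per sample before looking at the variant calls.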
The normal cases are those who didn't have COVID at the time of sample collection (and had no COVID before that either). Is there any way?
Sorry, but what way are you referring to? If you are getting real hits to COVID in your "normals" (are you sure the hits are real? I guess there must be enough reads if you are able to call SNPs), then this is something you should discuss with the experimental people right away.
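To judge whether the calls in the "normals" are backed by enough reads, one option is to keep only variants above a minimum depth and allele frequency. This is a minimal sketch, assuming a LoFreq-style VCF where `DP` and `AF` sit in the INFO column; the function names and the thresholds (`min_depth=100`, `min_af=0.05`) are illustrative assumptions, not recommended values, and the example records are made up.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=500;AF=0.98;INDEL' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
        else:
            info[item] = True  # flag without a value
    return info


def supported_variants(vcf_text, min_depth=100, min_af=0.05):
    """Return (pos, ref, alt, depth, af) for calls passing both thresholds.

    Assumes DP and AF are single values in the INFO column (as LoFreq
    writes them). Records where they are missing or not a single number
    (for example multi-allelic sites with comma-separated AF) are skipped.
    """
    kept = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        pos, ref, alt = fields[1], fields[3], fields[4]
        info = parse_info(fields[7])
        try:
            depth = int(info["DP"])
            af = float(info["AF"])
        except (KeyError, ValueError):
            continue
        if depth >= min_depth and af >= min_af:
            kept.append((int(pos), ref, alt, depth, af))
    return kept


# Hypothetical records: one well supported, one low depth, one low frequency.
example_vcf = "\n".join([
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "NC_045512.2\t23403\t.\tA\tG\t4900\tPASS\tDP=500;AF=0.98",
    "NC_045512.2\t241\t.\tC\tT\t60\tPASS\tDP=8;AF=0.5",
    "NC_045512.2\t3037\t.\tC\tT\t70\tPASS\tDP=1000;AF=0.01",
])
for variant in supported_variants(example_vcf):
    print(variant)
```

If most Delta or Omicron positions in a "normal" sample disappear after such a filter, the calls were probably noise from a few stray reads; if they survive with high depth, that points to real viral material in the sample, which is the case to raise with the lab.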
Got it. Yeah, it might be related to the data. Thanks.