Hi everyone,
I am new to working with exome data. The bioinformaticians have already given me analysed .tsv files. These files contain the annotation of each variant and its clinical significance according to ClinVar. The problem comes when I try to look at the pathogenic variants: I find several outputs for the same variant, depending on whether it is a SNP, a deletion or an insertion. This may be because different records exist in ClinVar for the same position. So which output is the right one, and how do you deal with this problem?
Thank you for your time. Suggestions and advice are welcome.
Can you provide an example or two of what you are finding confusing?
Of course. Attached is an image of a result for pathogenic mutations in the APC gene. As can be seen, several annotations are recorded for the same position and the same change. In this case, some annotations correspond to SNPs and others to deletions. How do I know which is the right one? How do you usually work with this kind of data?
Thank you for your time
Your file seems to be post-processed in some way. If I try to look this variant up in ClinVar, I am seeing entries that have multiple genes in the row submissions. Perhaps your file contains just that position? There is something wrong with that file, or maybe the columns not shown here will shed more light. Do you happen to have a transcript ID column that you're not looking at, perhaps?
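If it helps, here is a rough sketch of how you could check which columns actually differ between rows that look like the same variant. It only uses the Python standard library. The column names (`chrom`, `pos`, `ref`, `alt`, `transcript_id`, `clinvar_significance`) and the toy data are made up for illustration, so you would need to swap in the real header names from your .tsv file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; adjust them to the real header of your .tsv file.
KEY_COLUMNS = ("chrom", "pos", "ref", "alt")


def group_rows_by_variant(tsv_handle, key_columns=KEY_COLUMNS):
    """Group annotation rows that share the same chrom/pos/ref/alt."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        key = tuple(row[col] for col in key_columns)
        groups[key].append(row)
    return groups


def differing_columns(rows):
    """Return the columns whose values are not identical across the rows."""
    return sorted(
        col for col in rows[0]
        if len({row[col] for row in rows}) > 1
    )


# Made-up toy data, only to show the shape of the input.
EXAMPLE_TSV = (
    "chrom\tpos\tref\talt\ttranscript_id\tclinvar_significance\n"
    "chr1\t100\tA\tG\tTX_A\tPathogenic\n"
    "chr1\t100\tA\tG\tTX_B\tPathogenic\n"
    "chr1\t200\tCT\tC\tTX_A\tBenign\n"
)

groups = group_rows_by_variant(io.StringIO(EXAMPLE_TSV))

# Keep only variants that appear on more than one row, and list what differs.
report = {
    key: differing_columns(rows)
    for key, rows in groups.items()
    if len(rows) > 1
}

for key, cols in report.items():
    print(key, "differs in:", cols)
```

To run it on a real file, replace `io.StringIO(EXAMPLE_TSV)` with `open("your_file.tsv", newline="")`. If the only column that differs is something like a transcript ID, the repeated rows are probably one variant annotated against several transcripts. If the ref/alt alleles themselves differ, the rows are different variants that happen to share a position.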