Greetings, I have just started using dedicated software to analyse NGS data and found an issue for which I haven't been able to find an answer, so I'm writing this post here hoping that someone can help me out.
Basically, the software in use reports data using hg19 as a reference for the human genome, which should correspond to the GRCh37 reference genome. The problem is that, for any given variant, the locus reported is inconsistent with the position on the reference genome, and I can't understand why.
I know I'm inquiring about the same position because the coding sequence for a certain variant has the same name on both the NGS software and database used (ClinVar), but the locus indicated is different. Also I'm sure ClinVar states the "correct" genomic location because I used that datum in other situations and was given correct answers.
The question is: why does the NGS software, which supposedly bases its output on the hg19 reference, report a different locus for a given variant compared with GRCh37? Thanks for reading.
Welcome. Unfortunately there's no way we can help without even knowing what software you're using. 'NGS software' isn't very specific. What software are you using and what arguments are you running? Can you provide an example of a locus which doesn't map to the reference genome (i.e. include the RSID and position)? Provide as much specific detail as you can and you're more likely to get a good answer :) Thanks.
same sentiments exactly galaxy!
Thank you for the reply. I'm using Thermo Fisher's Ion Torrent Oncomine kit to assess the presence of cancer-related variants in probands' genomes. I use Ion Reporter for the analysis and take advantage of a preset workflow dedicated to the Oncomine kit itself. Data is reviewed through Ion Reporter's interface, on the Analyses tab.
I clicked on the proband's data of interest and displayed the related information. For example, Ion Reporter states there's a mutation at the locus chr17:7578368, which corresponds to exon 5 of the TP53 gene (according to the software's analysis), and the mutation is c.524G>A (p.Arg175His) (rs28934578). If I check for this mutation in ClinVar (even just by clicking on ClinVar's link in the "variant details" window), the same mutation is located at a different position, namely 17:7578406 (GRCh37). As said before, Ion Reporter states it's using the hg19 version of the human genome assembly.
I entered the locus and mutation from both sources into the FATHMM website's tool cScape and obtained different results, as expected. When IonReporter's locus was used, the tool indicated an intronic mutation; when ClinVar's locus was used, the tool indicated an exonic mutation, so I assumed ClinVar's locus was reliable. I'm really not a bioinformaticist myself. I use sequencing as a diagnostic tool and need only a small part of the data output from a sequencing run, but I can't help asking myself what's wrong here. In other words, I apologise if I'm making a mess out of nothing.
Incorrect how? Please provide examples.
Is each variant off by one? Did you check whether the coordinates returned correspond to another build (e.g. hg38)?
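A quick way to answer the off-by-one question is to tabulate the difference between the reported and the ClinVar position for each variant. Below is a minimal Python sketch, assuming the position pairs are typed in by hand; the function names are hypothetical, and the only data point is the TP53 example from this thread.

```python
from collections import Counter


def offset_summary(pairs):
    """Count how often each offset (reference - reported) occurs.

    pairs: iterable of (reported_pos, reference_pos) on the same chromosome.
    """
    return Counter(ref - rep for rep, ref in pairs)


def describe(offset):
    """Give a rough, non-authoritative reading of a single offset."""
    if offset == 0:
        return "positions agree"
    if abs(offset) == 1:
        return "off by one: possibly 0-based vs 1-based coordinates"
    return f"shifted by {offset} bp: not a simple indexing difference"


# (Ion Reporter locus, ClinVar GRCh37 position) for rs28934578, from the post above.
pairs = [(7578368, 7578406)]

for offset, count in offset_summary(pairs).items():
    print(f"{count} variant(s): {describe(offset)}")
```

If the offset differs from variant to variant and stays small, the two tools are probably reporting different things (for example a region start versus the variant itself) rather than using different builds; that reading is a guess until more variants are compared.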
This is probably a really quick fix, but without knowing more and having some sample data to look at, it's hard to troubleshoot this in the abstract.