Using LiftOver without knowing the reference FASTA used for a VCF file
4 months ago
Lukas ▴ 130

Hi,

I found out that my VCF file only gives the name of its reference as hg38, and that it was produced for a gene panel study.

So I used a VCF file annotated with snpEff and this reference sequence. However, I noticed quite a difference between my data sequences and general databases like dbSNP, NCBI Gene and UniProt, which prevents me from getting meaningful information about the variants beyond what is in the annotated VCF file.

So I would like to ask about programs or procedures that would let me get meaningful information from the genetic data rather than just making guesses.

I found out that it might be useful to re-annotate the file using LiftOver from the UCSC browser. However, I am not sure whether it is OK to use it for mainly SNP data, and whether it can be used without knowing the reference sequence and still give information that is meaningful enough for the re-annotated data.

vcf • 545 views

I am quite desperate, because I finally realised that vcf data from 12.2.2021 cannot be considered valid in 2024. I feel stuck and without ideas. I would be thankful for any help.


"because I finally realised that vcf data from 12.2.2021 cannot be considered valid in 2024"

Can you explain why? GRCh38 was originally released in June 2014. Since then there have only been patch releases, which do not change the chromosome coordinates.


Even between patch releases I see multiple changes in my data: mainly changes in interpretation, but a few positions have moved by a few bp. Using LiftOver seemed like a more scientifically robust and stable method. Because I technically don't know the reference, I am not sure what to use, and the cross-references can differ significantly from one another. But if using a cross-reference is the regular way of doing it, I will do that using .p14. Thank you.


Your original post seemed to indicate that you only had the VCF, but if you do have the original FASTQ files or even the aligned BAM file (instead of FASTQ), then realigning the data to a reference/annotation package would be the safest option, as noted below.

AFAIK LiftOver only allows moves between major genome releases (not patches).

4 months ago

If you are really unsure, you can remap your reads from the FASTQs onwards and re-call the variants. You can also validate the VCF against the FASTA genome sequence you believe was used to create it: just check the reference base in your VCF against the actual FASTA base at that position in the reference.

I would re-align if I were you to be sure. Ideally you should be able to contact whoever made the VCF and get the exact same ref used.
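The reference-base check described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a replacement for established tools: the function names `read_fasta` and `check_ref_alleles` are made up for this example, the whole FASTA is loaded into memory (fine for a panel-sized subset, heavy for a full genome), and it assumes the contig names in the VCF and FASTA use the same convention (e.g. both `chr1`, not `chr1` vs `1`).

```python
import gzip
from collections import Counter


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_fasta(path):
    """Load a FASTA file into a dict {contig_name: uppercase sequence}.

    Hypothetical helper for this sketch; it keeps everything in memory.
    """
    seqs, name, chunks = {}, None, []
    with _open_text(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                # Contig name is the first word after '>'
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def check_ref_alleles(vcf_path, genome):
    """Compare each VCF REF allele with the FASTA bases at CHROM:POS.

    Returns (counts, mismatches), where counts tallies 'match', 'mismatch'
    and 'missing_contig', and mismatches lists (chrom, pos, vcf_ref, fasta_bases).
    """
    counts, mismatches = Counter(), []
    with _open_text(vcf_path) as fh:
        for line in fh:
            if line.startswith("#") or not line.strip():
                continue  # skip header and blank lines
            chrom, pos, _id, ref = line.rstrip("\n").split("\t")[:4]
            seq = genome.get(chrom)
            if seq is None:
                counts["missing_contig"] += 1
                continue
            start = int(pos) - 1  # VCF POS is 1-based, Python slices are 0-based
            observed = seq[start:start + len(ref)]
            if observed == ref.upper():
                counts["match"] += 1
            else:
                counts["mismatch"] += 1
                mismatches.append((chrom, int(pos), ref, observed))
    return counts, mismatches


# Example usage (file names are placeholders, not from the original post):
# genome = read_fasta("candidate_hg38.fa")
# counts, mismatches = check_ref_alleles("panel.vcf", genome)
# print(counts, mismatches[:10])
```

If nearly every record matches, the candidate FASTA is very likely consistent with the one used for calling. Many mismatches, or many records counted under `missing_contig`, suggest a different build or a different contig naming scheme.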


Thank you. I am going to try that.
