Create BED- or GVCF-type file for human genome
3.4 years ago
ariel ▴ 250

I'm trying to create something like a GVCF file for the human genome, but without the variants. It would be like:

CHROM POS BASE

I figured one way to start would be to align hg19 to itself to get a SAM file, but I'm not sure where to go from there.
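For what it's worth, a CHROM POS BASE listing does not need an alignment step, since the reference FASTA already holds every base in order. Below is a minimal sketch, assuming a plain-text FASTA and 1-based positions as in VCF. The function name `iter_fasta_bases` and the tiny in-memory demo sequence are made up for illustration; for hg19 you would pass an open file handle instead.

```python
import io

def iter_fasta_bases(handle):
    """Yield (chrom, pos, base) for every base in a FASTA stream (pos is 1-based)."""
    chrom = None
    pos = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep only the first word as the sequence name.
            parts = line[1:].split()
            chrom = parts[0] if parts else ""
            pos = 0
            continue
        for base in line:
            pos += 1
            yield chrom, pos, base

# Hypothetical two-sequence FASTA standing in for the real reference.
demo = io.StringIO(">chr1 demo\nACGT\nNN\n>chr2\nGG\n")
rows = list(iter_fasta_bases(demo))
for chrom, pos, base in rows:
    print(chrom, pos, base, sep="\t")
```

Note that a table like this for the whole genome has roughly three billion rows, so it is usually more practical to stream it or to restrict it to the regions of interest.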

I'm guessing there might be a GVCF file out there somewhere from some project like COSMIC or 1000 Genomes. But since those are focused on variants, will they have ALL bases included?

GRCh37 FASTA GVCF VCF hg19 • 1.2k views

This looks like an XY problem. May I ask why you're creating this file?

It is highly possible.

I have some VCFs for patients which were made from multiple panels (XGen, Illumina). These are VCFs, not GVCFs, so I only have the reference bases at coordinates where variants were found.

I calculated the GC ratio for the reference bases and alternate bases.

I'd also like to calculate the GC ratio for the entire regions covered by the panels (I have the bedfiles). Since the VCF files do not have the bases for all positions, I want to make a list using the reference genome.
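The panel-wide calculation could be sketched roughly as follows. This is only an illustration under a few assumptions: the reference fits in memory as a dict of sequences, the BED file uses standard 0-based half-open coordinates, N bases are excluded from the denominator, and soft-masked (lowercase) bases are counted. The names `read_fasta` and `gc_fraction_over_bed` and the demo data are hypothetical.

```python
import io

def read_fasta(handle):
    """Load a FASTA stream into a dict mapping sequence name to sequence."""
    seqs = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

def gc_fraction_over_bed(seqs, bed_handle):
    """Return (G+C) / (A+C+G+T) summed over all BED intervals; N is ignored."""
    gc = 0
    acgt = 0
    for line in bed_handle:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        # BED is 0-based, half-open, which matches Python slicing directly.
        region = seqs[chrom][int(start):int(end)].upper()
        gc += region.count("G") + region.count("C")
        acgt += sum(region.count(b) for b in "ACGT")
    return gc / acgt if acgt else float("nan")

# Hypothetical stand-ins for the reference FASTA and a panel BED file.
fasta = io.StringIO(">chr1\nACGTNNGGcc\n>chr2\nATATGC\n")
bed = io.StringIO("chr1\t0\t4\nchr1\t6\t10\nchr2\t0\t6\n")
frac = gc_fraction_over_bed(read_fasta(fasta), bed)
print(f"GC fraction over panel regions: {frac:.4f}")
```

Overlapping BED intervals would be counted twice here, so merging them first is advisable. For a real genome, an existing tool such as `bedtools nuc` (given the reference FASTA and the BED file) reports per-interval base composition without loading everything into Python.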

So you'd like the GC ratio for the reference genome, correct? That should already be available as a resource IMO. If not, search for "calculate GC ratio hg38" or something like that.

Sounds like a job for GATK; check out Mutect2.

Please elaborate on your "answer". For the moment, it is just a comment, as it does not really answer the top-level question.
