Calculate pairwise LD for two given genomic loci (not rsIDs)?
7.9 years ago
burcakotlu ▴ 40

Hi to all,

I would like to calculate pairwise LD for two given genomic loci (specified by position, not by rsID). Is this possible?

Thanks, Burçak

pairwise LD two given genomic loci • 2.2k views
7.9 years ago

If you don't have any data, you could download genotypes from the 1000 Genomes Project with tabix (http://www.internationalgenome.org/category/tabix/) and then use Haploview to calculate the LD.
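For reference, here is what "pairwise LD" usually means for two biallelic loci. This is only the textbook definition, not something specific to Haploview; the symbols are my own notation: p_A and p_B are the frequencies of one chosen allele at each locus, and p_AB is the frequency of the haplotype carrying both.

```latex
D = p_{AB} - p_A \, p_B
\qquad
r^2 = \frac{D^2}{p_A (1 - p_A) \, p_B (1 - p_B)}
```

Both quantities are estimated from a sample of haplotypes, so the result depends on which individuals (and therefore which population) the genotypes come from.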


Thank you for your reply.

I have read that vcftools calculates pairwise LD through the arguments below from this website (https://vcftools.github.io/documentation.html#ld).

./vcftools --vcf input_data.vcf --hap-r2 --ld-window-bp 50000 --out ld_window_50000

If I understand correctly, input_data.vcf contains the genomic coordinates of interest. But for which population does it calculate LD? And do I need to download any data for vcftools to use during the LD calculation?

I could not understand this part.

Any idea?

Burçak


Example

vcftools --gzvcf ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --chr 5 --from-bp 1000000 --to-bp 1100000 --out chr5_analysis --keep Samples.txt --hap-r2

input_data.vcf

You can use vcf files from the 1000 Genomes Project (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). However, these files contain all the subjects from the 1000 Genomes Project, so you need to create a list of the samples you want to use.

Samples

Choose your samples from the file integrated_call_samples_v3.20130502.ALL.panel available with the data.

Samples.txt

NA06984
NA06985
NA06986
NA06989
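If you want every sample from one population rather than a hand-picked list, something like the sketch below could build Samples.txt. It assumes the panel file is whitespace/tab-delimited with one header line and the columns sample, pop, super_pop, gender; please check your copy. make_sample_list is a hypothetical helper name, and CEU is only an illustrative population code.

```shell
# make_sample_list PANEL POP OUT  (hypothetical helper)
# Writes the IDs of all samples whose population code (column 2) equals POP,
# one per line, skipping the header line of the panel file.
make_sample_list() {
    panel=$1
    pop=$2
    out=$3
    awk -v pop="$pop" 'NR > 1 && $2 == pop { print $1 }' "$panel" > "$out"
}

# Example call (CEU chosen only for illustration):
# make_sample_list integrated_call_samples_v3.20130502.ALL.panel CEU Samples.txt
```

The resulting file can be passed to vcftools with --keep, as in the command above.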


Do you mean that vcftools will use the file "ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz" to calculate LD, and that it will calculate LD for all pairs of genomic coordinates within --chr 5 --from-bp 1000000 --to-bp 1100000?

If so, can I provide the genomic positions in a file instead of using "--chr 5 --from-bp 1000000 --to-bp 1100000"?

And is there a way to calculate LD from WES data for two parents and their child?

Or does the input have to contain many samples?

Thanks, Burçak


1- Input file : If you have your own data (which you should always specify when you ask a question), replace "ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz" with your vcf file.

2- LD : Yes, vcftools will calculate the LD for all coordinate pairs in the region. If you want the LD between two variants, create a vcf file with only those variants and it will work. The --chr, --from-bp and --to-bp options can be removed.

3- Parents and Child : No idea
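As an alternative to building a reduced vcf by hand, the vcftools manual describes a --positions <filename> option that keeps only the listed sites (one tab-separated chromosome and position per line). A minimal sketch, assuming that option behaves as documented in your version; write_positions is a hypothetical helper, and the two coordinates in the example are made up:

```shell
# write_positions OUT CHR POS [CHR POS ...]  (hypothetical helper)
# Writes one tab-separated "chromosome<TAB>position" line per locus.
write_positions() {
    out=$1
    shift
    : > "$out"                      # create or truncate the output file
    while [ "$#" -ge 2 ]; do
        printf '%s\t%s\n' "$1" "$2" >> "$out"
        shift 2
    done
}

# Example (coordinates are placeholders, not real variants of interest):
# write_positions positions.txt 5 1000123 5 1050456
# vcftools --gzvcf ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz \
#     --keep Samples.txt --positions positions.txt --hap-r2 --out two_loci
```

The chromosome names in the positions file have to match the ones used in the vcf (for example "5" rather than "chr5"), and both positions must actually be present as variant sites in the file, otherwise there is nothing to compute LD on.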


Yes, my main question right now is: "Can we give vcftools a vcf file containing the parents' and their child's data as input, and calculate LD for various genomic loci from that file?"

Thank you, Burçak
