Question

SNPs On Exome Sequencing?

0

12.1 years ago

xinhui.wang ▴ 570

I am working with NCBI2R. I have the SNPs generated bu plink from whole-genome sequencing; however, I would like to consider only the SNPs covered by the exome sequencing. I would greatly appreciate it if someone could helo me.

exome sequencing snp • 4.2k views

ADD COMMENT • link 12.1 years ago by xinhui.wang ▴ 570

score 2 · Answer 1 · 2013-06-11

2

12.1 years ago

DG 7.3k

Two options. One is the answer suggested by Tky, which would be to select only those SNPs that appear in both groups. The downside is that you would lose any SNP that appears in one group but that no member of the other group has, and that is still a relevant comparison.

The other option would be to take the targeted intervals from your exome sequencing and extract from both groups only the SNPs that fall within those regions. There are several ways to do this, and the right one depends on the format your data is currently in.

I don't know about working with plink-encoded files, but if you have VCFs of your SNP calls it is fairly trivial to extract only the segments of the VCF that fall within defined regions, using a combination of bedtools and perhaps IntervalTree from bx-python. I have code that does this for genomic regions of interest, but I have never tried it with an interval for every targeted exon.
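To make the idea concrete, here is a minimal sketch of the interval filter in plain Python. It is not the code mentioned above and it does not use bedtools or bx-python; it swaps in a sorted-interval lookup with `bisect` from the standard library. The function names (`load_bed`, `in_targets`, `filter_vcf`) are made up for illustration, and it assumes an uncompressed, tab-delimited VCF and a BED file of targeted exons.

```python
import bisect
from collections import defaultdict


def load_bed(lines):
    """Read BED lines into per-chromosome sorted, merged (start, end) lists."""
    raw = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, ivs in raw.items():
        ivs.sort()
        out = [ivs[0]]
        for start, end in ivs[1:]:
            if start <= out[-1][1]:
                # Overlapping or touching intervals are merged into one.
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def in_targets(targets, chrom, pos):
    """True if a 1-based VCF position lies in a 0-based, half-open BED interval."""
    ivs = targets.get(chrom)
    if not ivs:
        return False
    pos0 = pos - 1  # convert the VCF coordinate to BED's 0-based system
    # Index of the last interval whose start is <= pos0.
    i = bisect.bisect_right(ivs, (pos0, float("inf"))) - 1
    return i >= 0 and ivs[i][0] <= pos0 < ivs[i][1]


def filter_vcf(vcf_lines, targets):
    """Yield header lines unchanged and only the records inside the targets."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_targets(targets, chrom, int(pos)):
            yield line
```

With hypothetical file names, usage would look like `targets = load_bed(open("targets.bed"))` followed by writing `filter_vcf(open("wgs.vcf"), targets)` to a new file. Chromosome names must match between the two files (`chr1` versus `1`), otherwise nothing will be kept.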

ADD COMMENT • link 12.1 years ago by DG 7.3k

0

Thank you very much! The suggestions are really helpful. "Select only those SNPs that appear in both groups" does not seem to be the best choice, since some SNPs will appear in only one group and not in the other, and those are also interesting for us.

The other option would be to take the targeted intervals from your exome sequencing and extract from both groups only the SNPs that fall within those regions.

This seems to be the best choice. However, do you know of any fast tools that could find which region each known SNP belongs to? I have more than 100 thousand SNPs.

I have a VCF file, and I will read up on bedtools. Thanks a lot.

ADD REPLY • link 12.1 years ago by xinhui.wang ▴ 570

0

Probably the quickest is to write a small script that uses bedtools to parse the appropriate bed file. Use Tabix to index your VCF files and call tabix in a script to retrieve from the VCF file anything that overlaps your chromosomal regions, which will be the targeted exons.
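A rough sketch of the tabix route, assuming `tabix` is on your PATH and the VCF has already been compressed with `bgzip` and indexed with `tabix -p vcf`. The helper names (`bed_to_regions`, `tabix_commands`, `fetch_records`) and the batch size are illustrative choices, not part of any tool. The one detail that is easy to get wrong is the coordinate system: BED is 0-based and half-open, while tabix region strings are 1-based and inclusive.

```python
import subprocess


def bed_to_regions(bed_lines):
    """Convert BED lines (0-based, half-open) to tabix regions (1-based, inclusive)."""
    regions = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append(f"{chrom}:{int(start) + 1}-{int(end)}")
    return regions


def tabix_commands(vcf_gz, regions, batch_size=500):
    """Build one tabix argument list per batch, to keep command lines short."""
    return [
        ["tabix", vcf_gz] + regions[i:i + batch_size]
        for i in range(0, len(regions), batch_size)
    ]


def fetch_records(vcf_gz, regions, batch_size=500):
    """Run tabix for each batch and yield the matching VCF records (no header)."""
    for cmd in tabix_commands(vcf_gz, regions, batch_size):
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        for record in result.stdout.splitlines():
            yield record
```

If targeted exons overlap each other, the same record can come back more than once, so it may be worth merging the intervals first or de-duplicating the output.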

ADD REPLY • link 12.1 years ago by DG 7.3k

score 1 · Answer 2 · 2013-06-11

1

12.1 years ago

Tky ★ 1.0k

I guess your question is how to filter out common/annotated SNPs. If that is the case, you may use ANNOVAR, check here.

I recommend that you phrase your question carefully and watch out for typos (e.g. bu plink/helo me). You may not know this, but the moderators on our site are keen on closing questions :-)

ADD COMMENT • link 12.1 years ago by Tky ★ 1.0k

0

Thanks. Really sorry for the typos. I am using a new computer and am not used to its keyboard yet. I probably did not describe my question properly. I am trying to compare the SNPs that differ between two groups. Unfortunately, one of the two groups comes from whole-genome sequencing (group a) and the other from exome sequencing (group b). To make the two groups comparable, I want to compare only the SNPs within the exome-sequenced regions in both. My question is: is there an easy way to extract, from the whole-genome sequencing results (group a), only the SNPs that the exome sequencing covers?

ADD REPLY • link 12.1 years ago by xinhui.wang ▴ 570

1

Hi, do you mean selecting the SNPs that appear in both groups? You should be able to do this using plink.

ADD REPLY • link 12.1 years ago by Tky ★ 1.0k

0

Could I run plink on my PC? It has only 3 GB of RAM...

ADD REPLY • link 12.1 years ago by xinhui.wang ▴ 570

0

Yeah, that will be more than enough. You can get your data coded in plink binary format and check the markers in the bim file.
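For checking the markers, something along these lines might work. It is only a sketch: the function names are made up, and it assumes the standard six-column .bim layout (chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2). It matches markers by chromosome and position rather than by ID, on the assumption that IDs may differ between the two call sets, and it does not check that the alleles agree.

```python
def read_bim(lines):
    """Map (chromosome, base-pair position) -> variant ID from .bim lines."""
    markers = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, snp_id, _cm, bp = fields[:4]
        markers[(chrom, int(bp))] = snp_id
    return markers


def shared_markers(bim_a, bim_b):
    """Return variant IDs from A whose chromosome and position also occur in B."""
    a, b = read_bim(bim_a), read_bim(bim_b)
    return [a[key] for key in sorted(a.keys() & b.keys())]
```

Writing the returned IDs one per line to a file (say, a hypothetical `shared_snps.txt`) lets plink do the subsetting, e.g. `plink --bfile groupA --extract shared_snps.txt --make-bed --out groupA_shared`.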

ADD REPLY • link 12.1 years ago by Tky ★ 1.0k

0

Thank you very much! I will try it!

ADD REPLY • link 12.1 years ago by xinhui.wang ▴ 570

score 0 · Answer 3 · 2013-06-12

0

12.1 years ago

xinhui.wang ▴ 570

Now I am testing plink. I have two folders: one contains all the SNPs from the patients and the other contains those from the control group. I would like to compare these two groups. I tried plink --file mydata --assoc. However, what format does mydata need to be in?

ADD COMMENT • link 12.1 years ago by xinhui.wang ▴ 570