CNV frequency database for review
10.0 years ago
J.F.Jiang ▴ 930

Hi, all,

In my study, I want to get a list of known CNVs that includes frequency information and general annotation.

DGV may be the only resource I can find.

However, I still have several questions regarding this data:

  1. It seems that the DGV data is compiled from multiple sources, mainly the 1000G project, so can I assume that the DGV data represents common CNVs?
  2. There are three datasets offered on the website; which one should I use? The website says that the filtered one was manually curated, so I suppose this should be the best one?
  3. The data provides frequency information for each CNV by accession ID. Is there any other population-level information I can get, since CNVs should differ between populations?

If you have any information, please share it with me. Thanks!

Best

frequency CNV • 4.5k views

1. It seems that the DGV data is compiled from multiple sources, mainly the 1000G project, so can I assume that the DGV data represents common CNVs?

Yes, the 1000G data is included, but so are many other studies. Check the statistics section on the website. As a rule of thumb, the more studies that report a CNV in a target region, the more likely it is to be a common one.
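To make the rule of thumb concrete, here is a minimal Python sketch that counts how many distinct studies report a CNV overlapping a target region. The record layout and field names (`chrom`, `start`, `end`, `study`) are assumptions for illustration, not DGV's actual column names, so adapt them to whatever file you download.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def studies_supporting_region(variants, chrom, start, end):
    """Return the set of study names with at least one CNV overlapping the region."""
    return {
        v["study"]
        for v in variants
        if v["chrom"] == chrom and overlaps(v["start"], v["end"], start, end)
    }


# Toy records; field names are hypothetical, not DGV's real columns.
variants = [
    {"chrom": "1", "start": 100, "end": 500, "study": "StudyA"},
    {"chrom": "1", "start": 450, "end": 900, "study": "StudyB"},
    {"chrom": "1", "start": 2000, "end": 2500, "study": "StudyA"},
    {"chrom": "2", "start": 100, "end": 500, "study": "StudyC"},
]

support = studies_supporting_region(variants, "1", 400, 600)
print(sorted(support), len(support))
```

The number of supporting studies is only a rough proxy for how common a CNV is, since studies differ in sample size and platform.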

2. There are three datasets offered on the website; which one should I use? The website says that the filtered one was manually curated, so I suppose this should be the best one?

Can you attach links? I'm not sure what you mean.

3. The data provides frequency information for each CNV by accession ID. Is there any other population-level information I can get, since CNVs should differ between populations?

In DGV, some studies include such information, but some don't. For the studies that used HapMap samples, you can calculate the frequencies yourself.
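As a rough sketch of that calculation in Python: given sample-level CNV calls and a sample-to-population table, you can compute the fraction of samples in each population that carry a CNV overlapping a region. The tuple layout and the `sample_population` mapping are assumptions for illustration; you would have to build them yourself from the sample-level file and the HapMap sample annotation. Note this is a carrier frequency (fraction of samples), not an allele frequency.

```python
from collections import defaultdict


def population_frequencies(calls, sample_population, chrom, start, end):
    """Fraction of samples per population carrying a CNV that overlaps the region.

    calls: iterable of (sample_id, chrom, start, end) sample-level CNV calls.
    sample_population: dict mapping every genotyped sample_id to a population label.
    """
    carriers = defaultdict(set)
    for sample, c, s, e in calls:
        # Count a sample once, even if it has several overlapping calls.
        if c == chrom and s <= end and start <= e and sample in sample_population:
            carriers[sample_population[sample]].add(sample)

    totals = defaultdict(int)
    for pop in sample_population.values():
        totals[pop] += 1

    return {pop: len(carriers[pop]) / n for pop, n in totals.items()}


# Toy data; sample IDs and calls are made up for illustration.
sample_population = {
    "S1": "CEU", "S2": "CEU",
    "S3": "YRI", "S4": "YRI", "S5": "YRI", "S6": "YRI",
}
calls = [
    ("S1", "1", 100, 500),
    ("S1", "1", 300, 600),
    ("S3", "1", 450, 900),
    ("S4", "1", 5000, 6000),
]

freqs = population_frequencies(calls, sample_population, "1", 400, 550)
print(freqs)
```

The denominator must be all samples that were genotyped in the study, not only those with at least one call, otherwise the frequencies are inflated.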


Thanks,

In the DGV database, apart from the 1000G studies, the collected CNVs mainly come from GWAS data profiled on SNP arrays.

For question 2, here is the link. Three categories are offered: the DGV Variants, the Supporting Variants, and the Filtered Variants.


From DGV: the "Supporting Variants" section contains the sample-level and supporting variants that are displayed in our Supporting Variants track

This means the file includes all samples used in the included studies and their individual CNVs.

From DGV: DGV variants represents the data that is displayed in our primary DGV structural variants track.

These are CNVs per study ("study level"). Regardless of how many samples a CNV was found in, the calls are combined into one variant.

Filtered variants: These are variants that have been removed from the database following our curation process

In other words, simply poor-quality CNVs.

Additionally, you can display the supporting and the DGV variants in the browser. Perhaps this helps in understanding the files. Which file you use depends on the goal you want to achieve.
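To illustrate the difference between the sample-level and the study-level files, here is a deliberately simplified Python sketch that collapses sample-level calls into one variant per study and keeps the number of samples that carried it. It only groups calls with identical coordinates and type, which is my simplification and not DGV's actual merging procedure; the field names are hypothetical as well.

```python
from collections import defaultdict


def collapse_to_study_level(sample_calls):
    """Group sample-level calls that share study, coordinates and type."""
    grouped = defaultdict(set)
    for call in sample_calls:
        key = (call["study"], call["chrom"], call["start"], call["end"], call["type"])
        grouped[key].add(call["sample"])

    # One output record per group, with the number of distinct carrier samples.
    return [
        {
            "study": key[0], "chrom": key[1], "start": key[2],
            "end": key[3], "type": key[4], "n_samples": len(grouped[key]),
        }
        for key in sorted(grouped)
    ]


# Toy sample-level calls; all values are made up for illustration.
sample_calls = [
    {"study": "StudyA", "sample": "S1", "chrom": "1", "start": 100, "end": 500, "type": "loss"},
    {"study": "StudyA", "sample": "S2", "chrom": "1", "start": 100, "end": 500, "type": "loss"},
    {"study": "StudyA", "sample": "S3", "chrom": "1", "start": 2000, "end": 2500, "type": "gain"},
]

study_level = collapse_to_study_level(sample_calls)
for variant in study_level:
    print(variant)
```

If you need frequencies, the sample-level file is the one to start from, because the sample counts are what you divide by the study's sample size.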
