Human Exome Capture Library Coordinates Download
5
14
13.8 years ago
Biomed 5.0k

Does anyone know whether, and where, I can download the files containing the coordinates targeted by the Agilent SureSelect exome capture kits? I am interested in both the older version and the newer 50 Mb version. It would also be nice to have the TruSeq coordinates too, but the Agilent ones are more important right now. Thanks

exome agilent next-gen sequencing • 31k views

biomed: Did you manage to generate or download a file corresponding to the Agilent SureSelect Exome 50 Mb version? I am looking for the same thing.


Yes, but not through a publicly available site. How can I contact you to send the file?


Would it be possible for you to send me the files corresponding to the Agilent SureSelect Exome kit, or to point me to where I can get them?

16
11.5 years ago
John St. John ★ 1.2k

I was recently told by Agilent to download the data from here: https://earray.chem.agilent.com/suredesign

Just create an account, log in, and select the "Find Designs" tab, then under that select "Agilent Catalog", then there should be a list of the different SureSelect related bed files and whatnot.

Also here is the info from the Agilent SureDesign help site on what the various files you will download from there actually mean:

BED files:

The three BED-format track files that SureDesign creates for each custom SureSelect design are described below. You can import these files into a compatible genome browser to graphically view the locations of the tracks in the genome. For detailed information on the tracks and how they can help you analyze your design, see Design analysis using tracks.

[design ID]_Regions.bed - This BED file contains a single track of the target regions of interest that SureDesign used to select the probes. You can use this track to see the exact regions that the program was attempting to cover when selecting the probes.

[design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

[design ID]_AllTracks.bed - This multitrack BED file includes the following tracks:

  • The Target Regions track is identical to the track in the Regions BED file.
  • The Covered probes track is identical to the track in the Covered BED file.
  • The Missed Regions track contains any regions from the Target Regions track that are not included in the Covered probes track.
  • The Probes track contains the regions of all probes in the design.
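For anyone who wants to work with the AllTracks file outside a genome browser, here is a minimal Python sketch that splits a multitrack BED file into its individual tracks. It assumes each track starts with a standard `track name="..."` header line and that the records are tab-delimited; the track names used in the usage note are taken from the description above and may not match the exact `name=` values in a real download. The function names `split_tracks` and `total_bases` are my own, not part of any Agilent tool.

```python
import shlex
from collections import defaultdict


def split_tracks(lines):
    """Group BED records by the name on the preceding 'track' header line."""
    tracks = defaultdict(list)
    current = "untitled"  # used for records that appear before any track line
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "browser")):
            continue
        if line.startswith("track"):
            # shlex handles quoted values such as name="Target Regions"
            attrs = dict(
                token.split("=", 1)
                for token in shlex.split(line)[1:]
                if "=" in token
            )
            current = attrs.get("name", "untitled")
            continue
        chrom, start, end = line.split("\t")[:3]
        tracks[current].append((chrom, int(start), int(end)))
    return dict(tracks)


def total_bases(intervals):
    """Sum interval lengths (BED is 0-based, half-open, so length = end - start).

    Only meaningful as a footprint if the intervals do not overlap.
    """
    return sum(end - start for _chrom, start, end in intervals)
```

With a real file you would call something like `split_tracks(open("AllTracks.bed"))` (file name hypothetical) and then compare `total_bases` for the target and covered tracks to get a rough idea of how much of the target the design covers.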

Text files:

The three text files for a custom SureSelect design are described below. You can view these files in any text editor program (e.g. NotePad) or spreadsheet program (e.g. Excel). Any tables embedded in the text files are tab-delimited and contain column headers. Lines of text that start with a # character are comment lines.

[design ID]_Targets.txt - This file contains a list of the target identifiers that you entered when creating the design.

[design ID]_Probes.txt - This file is a list of the probes in the design, with specific information about each probe, including its probe ID, sequence, genomic coordinates, and the target it is intended to capture.

Note that a probe may be listed in the Probes text file multiple times if it covers multiple targets. This can occur if the target identifiers you entered map to overlapping regions or are synonyms for the same gene (e.g. HER2 and ERBB2). Although these probes are listed multiple times in the file, they are not replicated in the design.

[design ID]_Report.txt - This file contains summary information on the design, the probes, the targets, and the parameters used to create the design.
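Since the Probes text file can list the same probe more than once, a small sketch for reading it and keeping one row per probe may be useful. It relies only on what the help text states (tab-delimited tables with a header row, `#` comment lines). The column name `ProbeID` is an assumption on my part; check the header of your own file and pass the real name as `id_column`.

```python
import csv


def unique_probes(lines, id_column="ProbeID"):
    """Return one row per probe ID, keeping the first occurrence.

    `id_column` is a hypothetical header name; adjust it to the actual file.
    """
    # Drop comment lines (starting with '#') and blank lines before parsing.
    data_lines = (l for l in lines if l.strip() and not l.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    first_seen = {}
    for row in reader:
        first_seen.setdefault(row[id_column], row)
    return list(first_seen.values())
```

Keeping the first occurrence is an arbitrary choice; if you care about which targets a probe covers, you would want to collect all target identifiers per probe instead of discarding the later rows.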


Thanks, this worked well for me.

3
13.8 years ago

A manifest would be best, but you could always build it yourself, since the data comes from:

  • coding exons annotated by the GENCODE project (http://www.sanger.ac.uk/gencode/)
  • all exons annotated in the consensus CDS (CCDS – March 2009) database, as well as 10 base pairs of flanking sequence
  • small non-coding RNAs from miRBase (v.13)
  • Rfam

From: http://www.genomics.agilent.com/CollectionSubpage.aspx?PageType=Product&SubPageType=ProductData&PageID=2318

(the 50mb one)

You might be able to get it through eArray, which is Agilent's microarray and targeted design tool.


Thanks for pointing to the GENCODE project. Could you point to a specific file on the FTP site ftp://ftp.sanger.ac.uk/pub/gencode to use in lieu of the Agilent SureSelect data?


A good starting point would be their release 5 GTF-formatted file: ftp://ftp.sanger.ac.uk/pub/gencode/release_5/gencode.v5.annotation.gtf.gz
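As a rough illustration of how a GTF annotation could be turned into BED-style regions, here is a minimal Python sketch. It keeps only `CDS` features and pads each by 10 bp, echoing the flanking size mentioned in the answer above. This is an approximation built from the annotation, not Agilent's actual target list, and the function name `gtf_to_bed` and its defaults are my own choices. The one detail that is easy to get wrong is the coordinate system: GTF is 1-based and inclusive, while BED is 0-based and half-open.

```python
def gtf_to_bed(lines, feature="CDS", flank=10):
    """Yield (chrom, start, end) BED intervals for one GTF feature type.

    GTF columns: seqname, source, feature, start, end, score, strand,
    frame, attributes. Starts are shifted by -1 for BED, and the flank
    is clipped at zero so coordinates never go negative.
    """
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        yield chrom, max(0, start - 1 - flank), end + flank
```

To run it on the downloaded file, you could open it with `gzip.open(path, "rt")` and pass the handle straight to `gtf_to_bed`. The resulting intervals will overlap wherever transcripts share exons, so they still need to be merged before counting bases.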

3
12.7 years ago
Travis ★ 2.8k

You will need to access Agilent's eArray system (earray).

If you are new to eArray, you will first need to register; after that, it is free to use.

In the top right corner of the screen, choose the application type "SureSelect Target Enrichment".

From there choose “Libraries” and then “Browse Libraries”.

Choose the catalogue kit that you are interested in and click download.

A list of the available annotation files will then appear. Choose the file type you require and click download.

2
13.5 years ago
Felix ▴ 90

Before you try re-creating the entire region list, please use the files at ftp://ftp.sanger.ac.uk/pub/fsk/exome/ e.g. exome_B_NCBI36.bed.

These are the regions the Agilent design was targeted at, created by merging the coding regions of Havana and Ensembl genes (not just CCDS genes), adding miRNAs, and adding the flanks as mentioned. If you need the positions of the final targets, though, please contact Agilent.
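For readers who do end up rebuilding a region list, the merging step described above can be sketched in a few lines of Python. This is a generic interval merge, not the script that produced the files on the FTP site; the function name `merge_intervals` is hypothetical. Intervals are assumed to be BED-style (0-based, half-open) tuples of chromosome, start, and end.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent intervals per chromosome.

    Input: iterable of (chrom, start, end) tuples in any order.
    Output: sorted list of non-overlapping (chrom, start, end) tuples.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged
```

Note that chromosomes are sorted as plain strings here (so `chr10` comes before `chr2`), which is fine for merging but may not be the order a downstream tool expects.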


Are you sure that these are the targets they were going for with their probe set? I just got off the phone with Agilent, and they wouldn't tell me how to get a .bed file of the specific gene regions they were targeting, although that would be really useful information to have. What they do provide is a .bed file of the probe coordinates they chose. Also, the link you provided here appears to be broken. Do you know which version of the Agilent SureSelect kit was targeting this annotation? I am looking for info on v2.


Did you find the files? I am looking for the same here.

2
13.1 years ago
Kevin ▴ 640

I was told by the very helpful Agilent staff that you can log in to eArray to get the BED file. I found the link once, but I have since lost it again.

eArray is here, but I have forgotten which hoops I had to jump through (after registering) to get the file: https://earray.chem.agilent.com/earray/
