How Does ENCODE Data Change Design Of NGS Experiments?
3
12.2 years ago

Now that we've had a week or so to digest the ENCODE publications (nice summary here), this is a question for those groups engaged in next-gen sequencing projects for gene discovery in human disorders. Most of you have probably focused on whole exome.

What elements of the ENCODE data set are ready or near-ready to include in future experiments that capture the "exome-plus"? Are groups designing targets for some of these regions for capture? Which ones? Enhancers? Promoters? Other long-range functional elements? Or do you suspect it's more efficient to just sequence the whole genome, so that data can be re-analyzed as functional annotations of the non-coding regions continue to improve? Interested in your responses.
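To make the "exome-plus" idea concrete, here is a rough sketch of how a combined target list could be built by taking the union of exon intervals and a few regulatory annotations. The coordinates are made up for illustration, and the function names (`merge_targets`, `total_bases`) are hypothetical helpers, not part of any ENCODE or capture-vendor tool. Intervals are assumed to be half-open `(chrom, start, end)` tuples, as in BED files.

```python
from collections import defaultdict


def merge_targets(*interval_sets):
    """Union several lists of (chrom, start, end) intervals.

    Overlapping or directly adjacent intervals on the same chromosome
    are collapsed into one target region.
    """
    by_chrom = defaultdict(list)
    for intervals in interval_sets:
        for chrom, start, end in intervals:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_end is None:
                cur_start, cur_end = start, end
            elif start > cur_end:
                # Gap before this interval: close the current region.
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
            else:
                # Overlap or adjacency: extend the current region.
                cur_end = max(cur_end, end)
        merged.append((chrom, cur_start, cur_end))
    return merged


def total_bases(intervals):
    """Total number of bases covered by a list of half-open intervals."""
    return sum(end - start for _, start, end in intervals)


# Made-up coordinates, for illustration only.
exons = [("chr1", 100, 200), ("chr1", 500, 650)]
enhancers = [("chr1", 180, 260), ("chr2", 1000, 1200)]
promoters = [("chr1", 50, 100)]

targets = merge_targets(exons, enhancers, promoters)
print(targets)
print("exome only:", total_bases(exons), "bp")
print("exome-plus:", total_bases(targets), "bp")
```

Comparing the two totals gives a quick feel for how much extra capture footprint a given set of regulatory elements would add before committing to a probe design.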

encode • 3.7k views
ADD COMMENT
5
12.2 years ago

Allow me to demonstrate

[Image: hype cycle curve]

ADD COMMENT
0

so, are we going down from the big peak?

ADD REPLY
1

oh I think we are at the technology trigger state

ADD REPLY
0

:) Point taken. What're the values on the time axis? Minutes? Hours? Days?

ADD REPLY
0

I've noticed that even though your proposed time steps span orders of magnitude, they are all of lengths that a person could easily tolerate. ;-) I have no idea.

I do think that the closer to release, the more of a race it is to find the low-hanging fruit. I have already heard a few talks by people who are interested in reverse engineering the data to find patterns, with little concern for the origins or meaning of it all. It is all binding, baby!

ADD REPLY
0
12.2 years ago
JC 13k

I consider whole genome sequencing more reliable than exome capture techniques, because of bias and other missing factors. Besides, as you mentioned, it lets you reanalyze regions later.

ADD COMMENT
0

But how would you use the ENCODE data?

ADD REPLY
0

I probably wouldn't. Many of the sequences come from immortalized cell lines, and I don't have a direct application in mind for that. For my current projects I prefer the 1000 Genomes data set.

ADD REPLY
0

Thanks, JC. Sure, whole genome seq provides more consistent coverage of the exome, but at what trade-off? In my neighborhood, WGS of 1 sample is about 4x the cost of whole exome of 1 sample, so you can exome a whole trio for less than WGS of 1 sample. Unfortunately funding influences experimental design, especially when data analysis has traditionally focused on the coding regions. My question was more about what elements of the ENCODE data can be incorporated into current analysis workflows.
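The cost trade-off can be written out as a few lines of arithmetic. This is only a sketch in arbitrary units: the 4x ratio is the rough figure quoted above, and the variable names are made up for illustration.

```python
# Arbitrary cost units; only the ratio matters here.
exome_cost = 1.0
wgs_to_exome_ratio = 4.0  # "WGS of 1 sample is about 4x the cost of whole exome"
wgs_cost = wgs_to_exome_ratio * exome_cost

trio_exome_cost = 3 * exome_cost  # proband plus both parents

print("WGS, 1 sample:", wgs_cost)
print("Exome, trio:  ", trio_exome_cost)
print("Trio exome cheaper than single WGS:", trio_exome_cost < wgs_cost)

# Ratio below which a single WGS would cost less than a trio of exomes.
break_even_ratio = trio_exome_cost / exome_cost
print("Break-even WGS/exome ratio:", break_even_ratio)
```

Under these assumptions the exome trio stays cheaper than one genome until the WGS-to-exome price ratio drops below 3.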

ADD REPLY
0

I agree about the money limitation, but I don't think people will expand their exome capture probes with ENCODE data right now. The problem is how much it can cost to design and produce specific probes for your regions. At some point, whole genome sequencing will be cheaper.

ADD REPLY
0

True, changing the target capture can be expensive. So let's modify the question -- for those who focus on exome capture, at what point will the possible incorporation of ENCODE data into analysis justify the switch to WGS?

ADD REPLY
0
12.2 years ago

Wasn't ENCODE highly permissive in what they were labeling as biologically active? I think people will have to validate that this stuff is biologically relevant. And maybe some of it will be, but probably not all of it.

As someone on another blog pointed out, the percentage of non-coding DNA differs widely among species. If so much of our non-coding DNA were important, how are some species getting on with so much less of it?

For instance, there are two closely related onions, and one has a genome 5x as large as the other. Does it make sense to think that one onion really has 5x more going on in its genome than another onion in the same genus?

One group took a mouse and deleted a 1 Mb region of intergenic DNA, and the mouse was phenotypically indistinguishable from wild-type. So if there was active stuff in that region, it wasn't doing much, at least in a lab setting.

http://www.nrcresearchpress.com/doi/abs/10.1139/g05-017

http://www.ncbi.nlm.nih.gov/pubmed/15496924

ADD COMMENT
