Hello,
I was wondering if anyone knows how and where I could obtain the clinical data for the cell lines from the Cancer Cell Line Encyclopedia. Also, does anyone know if they label the race of the patients?
Thank you for answering my questions.
If by clinical information you mean the cancer type, then you can get this from the Cellosaurus. You can parse the XML version available at ftp://ftp.expasy.org/databases/cellosaurus and look for the cell lines in CCLE, which you can do either by taking all cell lines with:
<comment category="Part of"> Cancer Cell Line Encyclopedia (CCLE) project </comment>
or with an Xref to the CCLE as shown in this example:
<xref database="CCLE" category="Cell line databases/resources" accession="1321N1_CENTRAL_NERVOUS_SYSTEM">
<url><![CDATA[https://portals.broadinstitute.org/ccle/page?cell_line=1321N1_CENTRAL_NERVOUS_SYSTEM]]></url>
</xref>
and then you can get the cancer type using the disease list which is linked to the NCI Thesaurus disease ontology as in the example below:
<disease-list>
<cv-term terminology="NCIt" accession="C60781">Astrocytoma</cv-term>
</disease-list>
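As a rough illustration of the two steps above, here is a minimal Python sketch using only the standard library. It runs on a tiny inline sample assembled from the fragments quoted in this answer. The wrapping elements (`cell-line-list`, `cell-line`, `name-list`, `name`, `comment-list`, `xref-list`) and the second, non-CCLE entry are assumptions made for the example, so please check them against the actual Cellosaurus XML before relying on this.

```python
import xml.etree.ElementTree as ET

# Inline sample built from the fragments quoted above. The wrapper elements
# and the second (non-CCLE) entry are assumptions for illustration only.
SAMPLE = """<cell-line-list>
  <cell-line>
    <name-list><name>1321N1</name></name-list>
    <comment-list>
      <comment category="Part of"> Cancer Cell Line Encyclopedia (CCLE) project </comment>
    </comment-list>
    <disease-list>
      <cv-term terminology="NCIt" accession="C60781">Astrocytoma</cv-term>
    </disease-list>
    <xref-list>
      <xref database="CCLE" category="Cell line databases/resources" accession="1321N1_CENTRAL_NERVOUS_SYSTEM"/>
    </xref-list>
  </cell-line>
  <cell-line>
    <name-list><name>HYPOTHETICAL-NON-CCLE-LINE</name></name-list>
    <disease-list>
      <cv-term terminology="NCIt" accession="C60781">Astrocytoma</cv-term>
    </disease-list>
  </cell-line>
</cell-line-list>"""


def is_ccle(cell_line):
    """True if the entry has a CCLE 'Part of' comment or a CCLE xref."""
    for comment in cell_line.iter("comment"):
        if comment.get("category") == "Part of" and "CCLE" in (comment.text or ""):
            return True
    return any(x.get("database") == "CCLE" for x in cell_line.iter("xref"))


def ccle_diseases(root):
    """Map each CCLE cell line name to its (NCIt accession, label) pairs."""
    result = {}
    for cell_line in root.iter("cell-line"):
        if not is_ccle(cell_line):
            continue
        name = cell_line.findtext(".//name", default="").strip()
        # Only look inside <disease-list>, since cv-term may be used elsewhere.
        terms = [
            (term.get("accession"), (term.text or "").strip())
            for term in cell_line.findall("disease-list/cv-term")
            if term.get("terminology") == "NCIt"
        ]
        result[name] = terms
    return result


root = ET.fromstring(SAMPLE)
diseases = ccle_diseases(root)
print(diseases)
```

For the full download you would replace `ET.fromstring(SAMPLE)` with `ET.parse(path).getroot()` on your local copy of the XML file. The file is large, so a streaming parser such as `ET.iterparse` may be the better choice.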
In terms of "race", the next release of the Cellosaurus (release 30 in May 2019) will contain a new section called "Genome ancestry", which will contain the computed genome ancestry information from the ECLA resource, which just became available.
Figure 2 of this paper shows some ancestry predictions (for cell lines, which includes CCLE):
http://cancerres.aacrjournals.org/content/79/7/1263.figures-only
It doesn't look like they had any individuals with ambiguous ancestry (which seems odd to me), but I don't doubt the over-representation of European ancestry individuals.
However, I was also a little confused because it seems like the predicted ancestries are missing from Table S2 and/or I couldn't find an "Interactive" web-interface for the results. Perhaps somebody else could find what I may have overlooked?
Also, based upon a tweet response, I believe this is the web-interface: http://ecla.moffitt.org/#/home
We did assign ancestry categories for discussing distributions of the cell line collections. However, we chose not to include that information in the supplementary data and present only ancestry proportions. At an individual level, these categories may contradict self-perception, so we preferred to avoid presenting this information. However, if you have questions specific to particular cell line(s), please do not hesitate to contact me.
Yes, Table S2 only contains the cell line names, descriptions and Xrefs to the Cellosaurus. The genome ancestry values are not in these files. I got the file I am using to plug the info into the Cellosaurus from Julie Dutil directly. You should contact her. Alternatively, you will be able to get them programmatically from the Cellosaurus XML at the next release.
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question. This comment should have gone under @Charles' answer.
genomax - Thank you for noting this (and being able to respond before I could): I will create a link to this as a comment under my answer.
Amos - I would also say creating another comment under my answer (and deleting this comment) is an acceptable solution :)