If you navigate to MSigDB you can find many lists of genes, as you asked for. However, I would recommend that you never start with the question "where can I download gene lists." Instead, you must start with the question, "what do I want to understand about Ewing's Sarcoma?" THEN choose the gene list(s).
To your credit, you have done exactly this ... you wrote,
the word "related" is quite imprecise, and can be different ways of "relation" , so hopefully there should be some info on how [a given gene] is "related" [to Ewing's Sarcoma].
This is a crucially important insight that should dictate what gene "list" you select, what is included, and why. In fact, if we go back to the MSigDB website, but this time we click on "molecular signatures database", what do we see? 9 different collections of gene lists that represent different things.
The "gene lists" one can download online may represent more or less anything. They could be targets of a transcription factor in a specific cell type. They could be lists of genes found in a certain part of a chromosome. They could reflect infection by a certain type of pathogen, in an unspecified tissue type. Further, if the results were curated from bulk RNA-seq data, such a gene set might further include irrelevant results that appeared to be part of the same signature due to the mixture of relevant and irrelevant cell types together into one sequencing experiment ...
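To make the point concrete, here is a minimal sketch of reading a gene set file while keeping each set's description next to its genes, so you never lose track of what a list is supposed to represent. It assumes the tab-separated GMT layout (set name, description, then one gene per column); the set names and members below are made up for illustration and are not real MSigDB content, and `parse_gmt` is a hypothetical helper, not part of any library.

```python
import io

def parse_gmt(handle):
    """Parse GMT-formatted lines into {set_name: (description, set_of_genes)}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, description, *genes = fields
        gene_sets[name] = (description, {g for g in genes if g})
    return gene_sets

# Toy two-line example: made-up set names and memberships (assumption, not real data)
toy_gmt = io.StringIO(
    "TOY_TF_TARGETS\thypothetical targets of a transcription factor\tGENE1\tGENE2\tGENE3\n"
    "TOY_CYTOBAND\thypothetical genes in one cytoband\tGENE3\tGENE4\n"
)

gene_sets = parse_gmt(toy_gmt)
for name, (description, genes) in gene_sets.items():
    print(name, "|", description, "|", sorted(genes))
```

With a real download you would pass an open file handle instead of the `io.StringIO` object.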
What would the association of the list of genes in such a "gene set" to Ewing's mean? Well, it depends on the type of data you have in hand, the nature of the gene set you tested, and the nature of the association.
- For instance, if the gene set were "interferon gamma signature" and you had expression data, a downregulation might reflect competitive inhibition of Stat1 that depends on Stat3 activation, resulting in decreased expression of interferons.
- However, if the gene set were "genes found in 8p21", then, even if it is the same expression data and the same downregulation, you might interpret a loss of expression as evidence for a recurrent structural variant that is deleting the genes in 8p21 in a large enough percentage of your patients to detect the change in expression...
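The two cases above can produce exactly the same statistic. As a rough illustration, here is a minimal sketch of one common kind of test, an over-representation test using the hypergeometric upper tail. This is my choice of test for the example, not the only option, and the gene names and counts are toy values invented for illustration. Note that the number it returns says nothing about which of the two interpretations applies.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the gene set."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Toy inputs (assumptions for illustration only)
universe = {f"GENE{i}" for i in range(1, 21)}            # all genes measured
gene_set = {"GENE1", "GENE2", "GENE3", "GENE4", "GENE5"}  # the set being tested
downregulated = {"GENE1", "GENE2", "GENE3", "GENE10"}     # genes called downregulated

overlap = gene_set & downregulated
p_value = hypergeom_upper_tail(
    len(overlap), len(universe), len(gene_set), len(downregulated)
)
print(f"overlap={len(overlap)}, p={p_value:.4f}")
```

Whether a small p-value here points to signalling biology or to a deleted chromosomal region depends entirely on how the gene set was defined, which is why the choice of set has to come first.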
Anyway, when I was a young bioinformatician, once upon a time, I remember downloading in excess of 20,000 gene sets from GO, KEGG, WikiPathways, MSigDB, etc., so I could take an all-against-all approach if needed. I don't do this anymore. Truth be told, if one simply runs all pathways against a condition, probably more than half the tests are inappropriate, and another 25% are poor proxies for one reason or another.
Instead, I try to select the right gene sets to test. More often than you might expect, no good list exists, and I've ended up asking my collaborators to curate a gene set themselves based on their understanding of the literature, or based on scRNA-seq data in a given cell type, etc.
I hope this answer helps you. I am still struggling to master how to proceed based on it after all this time ...
Yes, here it is: https://en.wikipedia.org/wiki/Ewing%27s_sarcoma#Genetics
Yep, but that lists only a few genes. Papers contain many more; the question is about a systematic collection drawn from papers, from studies, or from wherever else.
"Most cases of Ewing sarcoma (about 85%) are the result of a defining genetic event; a reciprocal translocation between chromosomes 11 and 22, t(11,22), which fuses the Ewing's Sarcoma Breakpoint Region 1 (EWSR1) gene of chromosome 22 (which encodes the EWS protein) to the Friend Leukemia Virus Integration 1 (FLI1) gene (which encodes Friend Leukemia Integration 1 transcription factor (FLI1), a member of the ETS transcription factor family) of chromosome 11"
This means that most Ewing sarcomas arise from this gene fusion.
"Other translocations are at t(21;22) and t(7;22)." - I guess these account for most of the rest.
There will always be room for sarcomas without any such detectable translocation; however, I doubt any study has been large enough to discover them, since that requires a lot of statistical power.
I usually start with OMIM, which provides curated information on the genetic basis of particular diseases. If you are interested in a broader view and gene lists for heatmaps, there are various websites that make the data of big cancer consortium projects accessible. cBioPortal, for example, features two datasets (SetA, SetB) for Ewing sarcoma. To some extent, you can query and filter the data to include only the genes of interest.
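If you prefer to do the filtering offline, a minimal sketch might look like the following. It assumes a tab-separated table with a `gene` column followed by one column per sample; that layout, the sample names, and the numbers are all assumptions made up for illustration, so adapt the column names to whatever file you actually download.

```python
import csv
import io

# Toy expression table (made-up values; the layout is an assumption)
toy_table = io.StringIO(
    "gene\tsample_1\tsample_2\n"
    "EWSR1\t1.0\t2.0\n"
    "FLI1\t3.0\t4.0\n"
    "GENE_X\t5.0\t6.0\n"
)
genes_of_interest = {"EWSR1", "FLI1"}

reader = csv.DictReader(toy_table, delimiter="\t")
# Keep only rows for the genes of interest, as {gene: [values per sample]}
subset = {
    row["gene"]: [float(row[sample]) for sample in reader.fieldnames[1:]]
    for row in reader
    if row["gene"] in genes_of_interest
}

for gene, values in subset.items():
    print(gene, values)
```

The resulting dictionary can be handed to whatever plotting tool you use for the heatmap.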
Thank you, that is helpful!
Sure, but the targets of the transcription factor EWSR1-FLI1 are not fully known.
What have you tried so far?