Hi,
I have three species (mouse, rat and bovine) in which N=100, M=150 and O=130 viral integrations occurred, respectively. The virus is not exactly the same in the three species, but the variants are functionally very close.
Each species has a different number of chromosomes and therefore a different genomic organization (gene positions, etc.).
Question: how can I assess the non-randomness of viral integration using my set of 100+150+130 viral integrations?
The difficulty here is working with three different species. One idea was to remap all integration sites onto one species by looking for homologous regions, but not every integration site lies within a region that has a counterpart in the genomes of the other species...
Any ideas?
Thanks
That was something I also had in mind: take the closest genes upstream and downstream of each integration and check for enrichment of orthologous genes.
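To make the "closest genes up and downstream" idea concrete, here is a minimal sketch of how it could be done per species. Everything in it is an assumption for illustration: the gene names, coordinates, ortholog group IDs and the helper names `flanking_genes` and `ortholog_hits` are made up, "upstream/downstream" is taken in coordinate order (strand is ignored), and a real analysis would read an annotation file and an ortholog table instead.

```python
from bisect import bisect_left

def flanking_genes(genes, pos):
    """Return the nearest gene on each side of `pos`.

    `genes` is a list of (start, name) tuples for one chromosome,
    sorted by start. Sides are in coordinate order; strand is ignored.
    """
    starts = [start for start, _ in genes]
    i = bisect_left(starts, pos)
    left = genes[i - 1][1] if i > 0 else None
    right = genes[i][1] if i < len(genes) else None
    return left, right

def ortholog_hits(sites, genes_by_chrom, ortholog_of):
    """Collect the ortholog groups of all genes flanking the integration sites."""
    hits = set()
    for chrom, pos in sites:
        for gene in flanking_genes(genes_by_chrom.get(chrom, []), pos):
            if gene in ortholog_of:  # skips None and genes without an ortholog
                hits.add(ortholog_of[gene])
    return hits

# Toy data (hypothetical genes, coordinates and ortholog groups).
mouse_genes = {"chr15": [(1000, "geneA"), (5000, "geneB"), (9000, "geneC")]}
mouse_orth = {"geneA": "OG1", "geneB": "OG2"}
rat_genes = {"chr7": [(2000, "geneA"), (8000, "geneB")]}
rat_orth = {"geneA": "OG1", "geneB": "OG2"}

mouse_hits = ortholog_hits([("chr15", 3000)], mouse_genes, mouse_orth)
rat_hits = ortholog_hits([("chr7", 500)], rat_genes, rat_orth)

# Ortholog groups flanked by an integration in both species.
shared = mouse_hits & rat_hits
print(sorted(shared))
```

Working at the level of ortholog groups sidesteps the remapping problem: sites whose flanking genes have no ortholog simply drop out instead of needing a homologous region in the other genomes.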
How would you define the neighborhood? With GSEA you could compare orthologs ranked by distance. According to this article, though, integration depends on open chromatin: http://www.ncbi.nlm.nih.gov/pubmed/15168737
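One way to avoid fixing a neighborhood size is to rank every ortholog by its distance to the nearest integration, per species, and hand that ranking to a preranked enrichment tool. The sketch below only builds the ranking; the function names, the single-coordinate gene model and the toy positions are assumptions for illustration, not part of any GSEA package.

```python
from bisect import bisect_left

def distance_to_nearest(sorted_sites, pos):
    """Distance from `pos` to the closest value in a non-empty sorted list."""
    i = bisect_left(sorted_sites, pos)
    candidates = []
    if i > 0:
        candidates.append(pos - sorted_sites[i - 1])
    if i < len(sorted_sites):
        candidates.append(sorted_sites[i] - pos)
    return min(candidates)

def rank_orthologs(genes, sites):
    """Rank ortholog groups from closest to farthest from any integration.

    genes: {ortholog_group: (chrom, position)} for one species
    sites: {chrom: [integration positions]} for the same species
    Genes on chromosomes without any integration are left out of the ranking.
    """
    sorted_sites = {chrom: sorted(pos) for chrom, pos in sites.items()}
    dist = {}
    for group, (chrom, pos) in genes.items():
        if sorted_sites.get(chrom):
            dist[group] = distance_to_nearest(sorted_sites[chrom], pos)
    return sorted(dist, key=dist.get)

# Toy data (hypothetical ortholog groups and coordinates).
genes = {"OG1": ("chr1", 1000), "OG2": ("chr1", 50000), "OG3": ("chr2", 300)}
sites = {"chr1": [1200, 90000], "chr2": [10000]}

ranking = rank_orthologs(genes, sites)
print(ranking)  # ['OG1', 'OG3', 'OG2']
```

Because each species gets its own ranking over the same ortholog group IDs, the three rankings can be compared directly without remapping coordinates between genomes.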
Integration depends on open chromatin -> ok. But a proliferative advantage might depend on specific integration sites or regions. Thus "old / long-lived" clones might carry integrations in recurrent regions.
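Recurrence across species could be tested with a simple Monte Carlo null, sketched here under stated assumptions: each integration is reduced to the ortholog group it hits, the null draws the same number of hits per species at random from that species' set of ortholog groups, and optional weights (hypothetical, e.g. a proxy for open chromatin or gene length) bias the draw. The function names and the toy data are made up for illustration.

```python
import random
from collections import Counter

def shared_groups(hits_by_species, min_species=2):
    """Count ortholog groups hit in at least `min_species` species."""
    counts = Counter()
    for groups in hits_by_species.values():
        counts.update(set(groups))  # one vote per species
    return sum(1 for c in counts.values() if c >= min_species)

def recurrence_pvalue(observed, universe, weights=None, n_perm=1000, seed=0):
    """One-sided Monte Carlo p-value for the number of shared ortholog groups.

    observed: {species: [ortholog groups hit]}
    universe: {species: [all ortholog groups that could be hit]}
    weights:  optional {species: [sampling weight per group in universe]}
    """
    rng = random.Random(seed)
    weights = weights or {}
    obs = shared_groups(observed)
    at_least = 0
    for _ in range(n_perm):
        null = {
            sp: rng.choices(universe[sp], weights=weights.get(sp), k=len(hits))
            for sp, hits in observed.items()
        }
        if shared_groups(null) >= obs:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return obs, (at_least + 1) / (n_perm + 1)

# Toy data (hypothetical): every species hits the same 20 groups out of 5000.
all_groups = [f"OG{i}" for i in range(5000)]
universe = {sp: all_groups for sp in ("mouse", "rat", "bovine")}
observed = {sp: all_groups[:20] for sp in ("mouse", "rat", "bovine")}

obs, p = recurrence_pvalue(observed, universe, n_perm=200)
print(obs, p)
```

If open chromatin is what drives the baseline integration rate, passing it in as `weights` keeps that effect in the null, so that any remaining excess of shared groups points to selection of recurrent regions rather than to accessibility alone.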