Are there any Bioconductor packages for comparing RNA-seq and/or DNA methylation data between two species?
There are a few different things you could look at. Firstly, assuming you ran control samples from the dogs in addition to the cancer samples, the first thing to do would be a standard differential expression/methylation analysis. For DE, the edgeR, DESeq2 and limma packages are very good and are what you'll find everyone recommending. Note that I'm not sure how good the annotations are for the dog genome (I don't work on it), so you might need to use something like RSEM (or Trinity followed by RSEM) to get decent metrics, which means you'd be stuck with limma downstream (not that that's a bad thing; limma is an extremely powerful tool). For methylation, it depends on how you generated the data. For RRBS or similar datasets, BiSeq is OK. For methylation arrays, you can use packages like minfi or COHCAP.
One of the interesting things I would do is use GSEA to compare enrichment of groups of differentially expressed/methylated genes between the canine model and patients. You'll obviously need control patient data for this to be worthwhile. If you find any highly relevant pathways (there are a few Bioconductor packages for pathway analysis, though I think the commercial Ingenuity Pathway Analysis package is still better in this regard), then I'd pay particular attention to how the key players in them are affected in patients.
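As a rough first pass before a full GSEA, you could check whether the DE genes from the two species overlap more than expected by chance. Note that this is not GSEA, just a simpler hypergeometric overlap test, and it assumes you already have a table of one-to-one orthologs. The gene names, the set sizes and the `overlap_pvalue` helper below are all made up for illustration.

```python
from math import comb


def overlap_pvalue(universe_size, n_dog, n_human, n_shared):
    """Upper-tail hypergeometric probability P(overlap >= n_shared).

    universe_size: number of one-to-one orthologs tested in both species
    n_dog, n_human: number of DE orthologs in each species
    n_shared: number of orthologs that are DE in both
    """
    upper = min(n_dog, n_human)
    total = comb(universe_size, n_human)
    tail = sum(
        comb(n_dog, k) * comb(universe_size - n_dog, n_human - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / total


# Hypothetical toy data: 20 orthologs, with DE calls from each species.
orthologs = {f"GENE{i}" for i in range(1, 21)}
dog_de = {"GENE1", "GENE2", "GENE3", "GENE4", "GENE5"}
human_de = {"GENE3", "GENE4", "GENE5", "GENE6"}

# Only genes present in the shared ortholog universe are comparable.
dog_de &= orthologs
human_de &= orthologs
shared = dog_de & human_de

p = overlap_pvalue(len(orthologs), len(dog_de), len(human_de), len(shared))
print(f"{len(shared)} shared DE orthologs, overlap p-value = {p:.4f}")
```

A small p-value here only says the two DE lists overlap more than random lists of the same size would; it says nothing about direction of change, so you would still want to look at the shared genes themselves.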
That's a quick idea and a handful of Bioconductor packages to get you started. I could probably come up with things to look at all day; you have a really target-rich project :)
When people compare methylation in two species, they usually use the liftOver tool to convert coordinates from one species to the other and then compare. Using that approach I could just do a Spearman rank correlation of those DMRs. Is there a better way to do something similar? It was suggested below to get PHAST conservation scores. How do people generally use the PHAST conservation scores?
A rank correlation could work too, though I suspect you'll get more informative results by looking at subsets. This method would also only allow looking at two samples at a time, which will get annoying quickly. The benefit of looking at conservation scores is that changes in highly conserved regions are much more likely to be biologically significant (the ENCODE consortium got rightly criticized for not doing this).
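To make the "look at subsets" idea concrete, here is a minimal sketch that computes a Spearman correlation of methylation differences across lifted-over regions, first for all regions and then only for the highly conserved ones. It assumes you already have, for each region, the methylation difference in each species and a mean conservation score; the region table, the 0.8 cutoff and the helper names are all hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical table of lifted-over regions:
# (region id, human methylation difference, dog methylation difference,
#  mean conservation score for the region)
regions = [
    ("r1", 0.30, 0.25, 0.95),
    ("r2", -0.10, -0.20, 0.90),
    ("r3", 0.05, 0.10, 0.85),
    ("r4", 0.40, -0.30, 0.20),
    ("r5", -0.25, 0.35, 0.10),
    ("r6", 0.15, 0.20, 0.92),
]

MIN_CONSERVATION = 0.8  # arbitrary cutoff, chosen only for this example
conserved = [r for r in regions if r[3] >= MIN_CONSERVATION]

rho_all = spearman([r[1] for r in regions], [r[2] for r in regions])
rho_conserved = spearman([r[1] for r in conserved], [r[2] for r in conserved])

print(f"all regions:       rho = {rho_all:.3f} (n = {len(regions)})")
print(f"conserved regions: rho = {rho_conserved:.3f} (n = {len(conserved)})")
```

In the toy data the two poorly conserved regions disagree between species and drag the overall correlation down, which is the sort of thing a single genome-wide correlation would hide.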
I'm not sure about a Bioconductor package, but this is the way I would do it.
HTH
Define "comparing". I can think of a few different ways of comparing such datasets and it's quite possible that none of them are what you have in mind. Try telling us what your actual biological goal is and then you'll probably get some more useful advice.
What I meant to say is comparing orthologous regions with each other. The actual biological goal is to compare a cancer in canines with the same cancer in humans for a particular tissue and find similar patterns.
What other ways of comparing did you have in mind for going about it?
Without biological context, you could have just wanted general comparisons between methylation levels in the promoters of various gene classes and a comparison of TPM distributions (or something similar). That's why we usually ask for the experimental context within which you want to do something. I'll give some actual suggestions in an answer below.