RNAseq project ideas for young intern with MISO?
9.3 years ago
cindyy101 ▴ 10

Hi, I'm a young high school intern working with RNA-seq this summer. For the past few weeks I've been looking at a specific gene in neuronal, iPS-differentiated cells and finding differentially expressed isoforms. I also looked at a specific SNP-affected exon in one isoform. Now that I've finished that, I have around 2 weeks to explore with MISO, look at other interesting genes, etc.

I would like to do some genome-wide analyses as well, like finding what percent of alternative splicing events are differentially expressed (I was planning on doing percent of genes differentially expressed, but that's not possible with MISO's exon-centric approach to filtering events). What are some other interesting things I can do with the output data I have? I was also given a huge list of genes that are differentially expressed between neuronal development stages, with FDR and fold-change values, but I'm not sure what I can do with it.

Thanks.
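
As a rough illustration of the "percent of events" idea above, here is a minimal sketch in Python. It assumes a tab-separated comparison table with columns named `diff` and `bayes_factor`; those column names, and the cutoffs `min_bf` and `min_abs_diff`, are assumptions to check against your own MISO output, not a documented recipe.

```python
import csv

def percent_differential(lines, min_bf=10.0, min_abs_diff=0.2):
    """Percent of two-isoform events passing both cutoffs.

    `lines` is any iterable of tab-separated text lines with a header row.
    Column names "diff" and "bayes_factor" are assumed, not guaranteed.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    total = 0
    passed = 0
    for row in reader:
        try:
            bf = float(row["bayes_factor"])
            diff = float(row["diff"])
        except ValueError:
            # Skip rows whose values are not a single number
            # (e.g. comma-separated values for multi-isoform events).
            continue
        total += 1
        if bf >= min_bf and abs(diff) >= min_abs_diff:
            passed += 1
    return 100.0 * passed / total if total else 0.0
```

You could call it as `percent_differential(open("comparison.tsv"))`, where the file name is a placeholder for your own comparison file.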

RNA-Seq MISO SNP sashimi internship • 2.6k views
9.3 years ago
Kamil ★ 2.3k

To start understanding the huge list of differentially expressed genes, try performing a gene set enrichment analysis. This is a very common type of analysis for studies that discover large lists of genes. I think your time might be well-spent learning how genes are grouped into molecular pathways and how to discover those pathways.

You might want to try running and also reading about GOrilla. It uses a clever statistical method to get quick and accurate results, and the paper cites many other commonly used tools that you might want to read about such as GSEA.
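
To get a feel for what an enrichment test computes, here is a minimal sketch of a plain hypergeometric over-representation test in Python. This is a simpler stand-in for illustration, not the method GOrilla or GSEA actually implements, and the function names are made up for this example.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_in_set, n_de, n_overlap):
    """P(overlap >= n_overlap) when n_de genes are drawn at random,
    without replacement, from n_universe genes of which n_in_set
    belong to the pathway or gene set."""
    total = comb(n_universe, n_de)
    upper = min(n_in_set, n_de)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def enrichment(de_genes, gene_set, universe):
    """Convenience wrapper working on collections of gene names."""
    universe = set(universe)
    de = set(de_genes) & universe
    gs = set(gene_set) & universe
    return hypergeom_enrichment_p(len(universe), len(gs), len(de), len(de & gs))
```

A small p-value means the differentially expressed list overlaps the gene set more than random draws would. If you test many gene sets this way, the p-values need a multiple-testing correction.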

9.3 years ago

Using your list of genes DE at different stages (and pretending it's 100% correct) you can see which genes are expected to be higher or lower at a given time-point, and thereby age-date your samples. Translating gene data to exon data is the bioinformatic challenge.

Another fun project:

Try some statistical hypothesis falsification. You said you looked at a SNP-affected exon in one isoform. I suppose that means correlating expression with genotype. Biology is full of assumptions, and they can lead to errors.

Hoping you have the SNP-to-correlation part automated in some way, you would be able to try the same analysis at other exons and with other SNPs. Since the other SNPs are not expected to be associated, but will still show some kind of correlation, what does that mean for the false-positive rate? Could a random SNP appear to be correlated? This is a particular problem when looking at samples with few replicates and high cost, like iPS-differentiated neurons. If you don't have other SNP data, just make some up. Then it's guaranteed to be unrelated, and my point is easier to prove. Any arbitrary SNP can show correlation to exon usage, so take care with the statistical noise.
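
The "make some up" experiment could look something like this minimal Python sketch. It draws random genotypes (coded 0/1/2) for each sample and counts how often they correlate with exon usage as strongly as a chosen cutoff. The function names, the Pearson correlation, and the `r_cutoff` value are assumptions for illustration; swap in whatever statistic your own analysis uses.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def fraction_spurious(psi, n_fake_snps=10000, r_cutoff=0.8, seed=0):
    """Fraction of made-up SNPs whose |r| with exon usage >= r_cutoff.

    `psi` holds one exon-usage value per sample. The genotypes are
    random, so every "hit" is a false positive by construction.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_fake_snps):
        genotypes = [rng.choice((0, 1, 2)) for _ in psi]
        if abs(pearson(genotypes, psi)) >= r_cutoff:
            hits += 1
    return hits / n_fake_snps
```

Running `fraction_spurious` on the exon-usage values from a handful of samples, and then on a longer list, shows how quickly spurious correlations fade as the number of samples grows.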
