Question

RNA-seq expression analysis among several non-model species (next step?)

6.0 years ago

cmpolania • 0

I'm currently analyzing RNA-seq data from four species in one genus, and I would love a little help with deciding my next steps.

My eventual goal: find secreted proteins/secondary metabolites that are significantly expressed among 4 species of fungus in culture, either expressed in one species only or co-expressed in all 4. (This is a discovery-based project; there's no null hypothesis.)

Starting data: I started with RNA-seq reads, assembled genomes, a .gtf annotation for each genome, and functional annotation information (Swiss-Prot, SignalP, Pfam, etc.) for each genome. The functional annotation files hold protein_ids and corresponding descriptions.

What I have done so far: I've aligned the reads from each species to their respective genomes with HISAT2, then assembled transcripts and quantified expression with StringTie, supplying the .gtf annotations so that the gene_ids stay consistent with the reference.

What I have now: one StringTie output per species, each listing gene_id, transcript_id, and FPKM/TPM values.

The advice I need: What should my next step be? Since I'm not looking for differential expression, I'm assuming that my next analyses should be on individual species. How can I associate my protein_ids with my gene_ids? How can I go from FPKM values to deciding whether or not a gene is significantly expressed in a species? Are FPKM values enough, or is there some kind of normalization or transformation (e.g. a log transformation) that should still be done? Should gene clusters be identified, and why would that matter? Once I find (for example) a gene that produces an interesting secondary metabolite, how would I find out whether there are analogs in the other species?

I'm feeling a little lost when it comes to what to do next.

RNA-Seq alignment • 1.2k views

score 1 · Accepted Answer · 2018-11-28

6.0 years ago

Hussain Ather ▴ 990

Maybe you could try making a density plot of the FPKM values for each species? You could probably read a sensible expression cutoff off the plot.
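To make the density-plot idea concrete, here is a minimal Python sketch using only the standard library. Several things in it are assumptions, not part of the suggestion above: the log2(FPKM + 1) transform, the Gaussian kernel with a bandwidth of 0.5, and the heuristic of placing the cutoff at the lowest point between the two tallest peaks. All function names are hypothetical, and the FPKM values would need to be read from the StringTie output separately.

```python
import math


def log2_fpkm(fpkm_values, pseudocount=1.0):
    """Log-transform FPKM values; the pseudocount (an assumption) keeps zeros finite."""
    return [math.log2(v + pseudocount) for v in fpkm_values]


def make_grid(samples, bandwidth, n_points=200):
    """Evenly spaced grid, padded by 3 bandwidths so peaks at the edges stay interior."""
    lo = min(samples) - 3.0 * bandwidth
    hi = max(samples) + 3.0 * bandwidth
    step = (hi - lo) / (n_points - 1)
    return [lo + i * step for i in range(n_points)]


def gaussian_kde(samples, grid, bandwidth):
    """Plain Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


def valley_cutoff(grid, density):
    """Grid position of the lowest density between the two tallest peaks.

    Returns None when fewer than two peaks are found (no obvious cutoff).
    """
    peaks = [
        i for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] >= density[i + 1]
    ]
    if len(peaks) < 2:
        return None
    tallest_two = sorted(peaks, key=lambda i: density[i], reverse=True)[:2]
    left, right = sorted(tallest_two)
    lowest = min(range(left, right + 1), key=lambda i: density[i])
    return grid[lowest]


def expression_cutoff(fpkm_values, bandwidth=0.5):
    """Return (grid, density, cutoff), all on the log2(FPKM + 1) scale."""
    samples = log2_fpkm(fpkm_values)
    grid = make_grid(samples, bandwidth)
    density = gaussian_kde(samples, grid, bandwidth)
    return grid, density, valley_cutoff(grid, density)
```

The returned `grid` and `density` lists can be handed to any plotting library to draw the curve. A cutoff `c` on the log scale corresponds to `2 ** c - 1` in FPKM. If the cutoff comes back as `None`, the distribution has no clear second peak, and a threshold would need some other justification.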
