Dear all,
I have RNA-seq data for two species, A and B. The standard analysis is almost done and I have identified some key genes. Now I want to look at the expression values of these genes and eliminate the ones with low expression (such genes may not play an important role in the species' development).
Species A does not have a reference genome, so I used:
1. Trinity
2. align_and_estimate_abundance.pl
3. abundance_estimates_to_matrix.pl
Then I got a file called rsem-gene.isoform.TPM.not_cross_norm.
Can this file be treated as the expression values for the genes in species A? If yes, is there a cutoff below which expression is considered low?
Species B has a reference genome, so I used STAR, HTSeq, and DESeq2.
One step in DESeq2, normalized_counts <- counts(dds, normalized=TRUE), performs the normalization.
For species B, are these normalized counts the expression values? If yes, what should the cutoff be?
Thanks in advance!
Hi, sorry for the unclear statement. I don't need to compare expression values between species A and B; I just want to remove the low-expression genes within each species.
Hi, that's very vague and depends on how you define "low expression genes". For example, you can define them as the bottom X percent of genes, or as genes with fewer than X Transcripts Per Million (TPM); you define what X is.
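To make the two definitions concrete, here is a minimal sketch in Python (standard library only). The gene names and TPM values are made up for illustration, the function names are hypothetical, and the cutoffs of 1 TPM and 20 percent are arbitrary placeholders for X, not recommendations.

```python
def low_by_threshold(tpm, min_tpm):
    """Return gene IDs whose TPM is below a fixed cutoff."""
    return {gene for gene, value in tpm.items() if value < min_tpm}


def low_by_percentile(tpm, bottom_percent):
    """Return gene IDs in the bottom `bottom_percent` of genes ranked by TPM."""
    ranked = sorted(tpm, key=tpm.get)  # lowest expression first
    n_low = int(len(ranked) * bottom_percent / 100)
    return set(ranked[:n_low])


# Toy values for illustration only, not real data.
tpm = {"geneA": 0.0, "geneB": 0.4, "geneC": 3.2, "geneD": 57.0, "geneE": 812.5}

low_fixed = low_by_threshold(tpm, min_tpm=1.0)           # X = 1 TPM (placeholder)
low_ranked = low_by_percentile(tpm, bottom_percent=20)   # X = 20 percent (placeholder)
```

Note that the two definitions can disagree: a fixed cutoff depends on the absolute values, while a percentile always removes the same fraction of genes regardless of how they are expressed.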
Think about the biology of your species: do you expect many genes to be important for development, or only a few? Plotting a histogram of expression values may help. In any case, you'll need more information; otherwise you're just assigning arbitrary cutoffs.
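For the histogram, a rough sketch in Python (standard library only) could look like the following. It bins log2(TPM + 1) into unit-wide bins, since expression values usually span several orders of magnitude; the log transform, the bin width, and the toy values are all assumptions made for illustration.

```python
import math
from collections import Counter


def log2_histogram(values):
    """Count values per unit-wide bin of log2(value + 1); returns {bin_start: count}."""
    bins = Counter(math.floor(math.log2(v + 1)) for v in values)
    return dict(sorted(bins.items()))


# Toy TPM values for illustration only, not real data.
tpm_values = [0.0, 0.0, 0.4, 0.9, 3.2, 5.0, 57.0, 812.5]

hist = log2_histogram(tpm_values)
for start, count in hist.items():
    # One '#' per gene falling in the bin [start, start + 1) on the log2 scale.
    print(f"log2(TPM+1) in [{start}, {start + 1}): {'#' * count}")
```

With real data, the shape of this distribution (for example, a pile of genes near zero that is separate from the rest) can suggest where a cutoff is less arbitrary.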
Also, how many samples do you have? If you only have a single species B sample and you aren't doing between-sample comparisons, just use TPM -- no need for DESeq2.
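If you only have raw counts (for example from HTSeq) and want TPM for a single sample, the usual definition is to divide each gene's count by its length, then rescale so the values sum to one million. Below is a minimal sketch in Python; the function name is hypothetical, the counts and lengths are made up, and it assumes you can obtain a length per gene from your annotation. Quantification tools typically use an effective length rather than the plain gene length, so treat the result as an approximation.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM using per-gene lengths in base pairs."""
    # Reads per base for each gene (length-normalized rate).
    rate = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rate.values())
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Rescale so that the values sum to one million.
    return {gene: r / total * 1e6 for gene, r in rate.items()}


# Toy values for illustration only, not real data.
counts = {"geneA": 10, "geneB": 200, "geneC": 3000}
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 1500}

tpm = counts_to_tpm(counts, lengths_bp)
```

Because TPM is normalized within a sample, it is suited to ranking genes within that sample, which matches the within-species filtering described in the question.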