Advice on downstream analysis: data from RNA-Seq
4.6 years ago
marcelolaia ▴ 10

My scenario:

I ran featureCounts in two ways:

  1. Approach a: featureCounts -p -B -a Specie.transcript.fa.gtf -t exon -g gene_id -o A1.counts.txt -f A1.bbduk.bam
  2. Approach b: featureCounts -p -B -a Specie.gene_exons.gtf -t exon -g transcript_id -o A1.counts_transcript_id.txt -f A1.bbduk.bam

From 'a', I obtained a list of 1,714 differentially expressed genes (DEGs) using the NOISeq package. From 'b', I obtained a list of 3,067 DE exons.

I submitted both lists to Blast2GO and got BLASTX, InterPro and EC annotations for almost all sequences in each list.

I have also downloaded GeneSCF and will try it, too.

From here, I need help.

I would like to conduct a more refined analysis of this data. I tried to draw a heatmap (pheatmap package in R), but with this much data the plot is unintelligible. So I subset the data, keeping only features whose M value (the NOISeq log2 fold change) exceeds X in absolute value (|M| > X), which left 84 DE exons/genes, few enough to plot. However, looking at that plot, I suspect that subsetting the data in this manner isn't a good idea.

Have you ever been in a situation like this, with a large amount of data? How did you extract the most useful biological information from it?
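
For reference, the |M| cutoff described above can be sketched as follows. This is only an illustration in Python (not the R code actually used): the table, the column names `feature`, `M` and `prob`, and the threshold of 2.0 are all made-up assumptions, since the real value of X is not stated.

```python
import csv
import io

# Hypothetical NOISeq-style results table (tab-separated).
# Feature names, values and column names are illustrative assumptions.
EXAMPLE_TSV = """feature\tM\tprob
gene1\t3.2\t0.99
gene2\t-0.4\t0.97
gene3\t-2.7\t0.98
gene4\t1.1\t0.96
"""


def filter_by_abs_m(rows, threshold):
    """Keep rows whose absolute M value is strictly above `threshold`."""
    return [r for r in rows if abs(float(r["M"])) > threshold]


def top_n_by_abs_m(rows, n):
    """Alternative: keep the n rows with the largest absolute M value."""
    return sorted(rows, key=lambda r: abs(float(r["M"])), reverse=True)[:n]


rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))

# Threshold 2.0 is a placeholder for the unspecified X.
selected = filter_by_abs_m(rows, threshold=2.0)
print([r["feature"] for r in selected])
```

The second helper, `top_n_by_abs_m`, ranks features by |M| instead of applying a fixed cutoff. That gives a heatmap of a predictable size, but it is the same kind of selection and shares the same caveat.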

Any suggestion/advice is very welcome!

I have been a Debian user since Potato, but I am not a programmer.

If this is off topic, please don't hesitate to tell me and I will delete the post immediately.

Best

differentially-expressed-genes RNA-Seq • 810 views