10.6 years ago
Y Tb
I ran Cufflinks to assemble transcripts from mouse RNA-seq data (single-end reads) as follows:
cufflinks -G Mm_ensembl_37.gtf -o mouse_brain -p 8 mouse_brain/accepted_hits.bam
and I got an output folder containing:
1) transcripts.gtf
2) isoforms.fpkm_tracking
3) genes.fpkm_tracking
4) skipped.gtf
What is the important information in these files, and how can I summarize it?
Hi Vivek, I mean: what statistical information can I get from these files?
The online manual explains every column in those files, but if your goal is to look at gene expression you will most likely be looking at the FPKM values of genes. If you are analyzing just one sample, you would look at how strongly one gene is expressed relative to the rest. If you have multiple samples with replicates, running cuffdiff can additionally give you gene expression changes across samples/conditions.
A lot of what to look for depends on your experimental design and analysis objectives.
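As a rough illustration of summarizing the FPKM values mentioned above, here is a minimal Python sketch. It assumes the standard tab-separated layout of genes.fpkm_tracking with the columns tracking_id, gene_short_name, FPKM and FPKM_status. The function name summarize_fpkm and the cutoff of FPKM >= 1 for calling a gene "expressed" are illustrative choices, not something prescribed by Cufflinks.

```python
import csv
import statistics


def summarize_fpkm(handle, min_fpkm=1.0):
    """Summarize the FPKM column of a Cufflinks *.fpkm_tracking table.

    `handle` is any open text file or iterable of lines.
    `min_fpkm` is an arbitrary, illustrative cutoff for "expressed".
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        # Skip entries whose quantification did not succeed
        # (FPKM_status other than OK, e.g. LOWDATA, HIDATA, FAIL).
        if row.get("FPKM_status", "OK") != "OK":
            continue
        rows.append(
            (row["tracking_id"], row["gene_short_name"], float(row["FPKM"]))
        )

    values = [fpkm for _, _, fpkm in rows]
    expressed = [v for v in values if v >= min_fpkm]
    # Ten most highly expressed genes, highest FPKM first.
    top = sorted(rows, key=lambda r: r[2], reverse=True)[:10]

    return {
        "n_genes": len(values),
        "n_expressed": len(expressed),
        "median_fpkm_expressed": statistics.median(expressed) if expressed else 0.0,
        "top_genes": top,
    }


# Example usage (path taken from the command in the question):
# with open("mouse_brain/genes.fpkm_tracking", newline="") as handle:
#     print(summarize_fpkm(handle))
```

The same function should work on isoforms.fpkm_tracking, since it shares these columns, but the numbers are only descriptive for a single sample and do not replace a proper differential analysis.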
Hi Vivek, I have 4 mouse samples and another 4 human samples, all of them single-end, and I want to compare antisense transcription between them. I have finished the Cufflinks analysis for all samples, but I don't know what the next step should be. I don't know whether cuffdiff would help me or not, so I would appreciate any ideas you have.
You should look for a local bioinformatician who can help you with your complete study design and data analysis.
Of the 4 samples, do you have cases and controls? Are there replicates? If so, you should look at using Cuffdiff to compare the cases against the controls.
Hi Vivek, all of my samples are from normal tissues, so I don't have cases and controls in my study. The purpose is to compare antisense long non-coding RNAs between human and mouse.
Formulating an analysis design for this is likely a bit beyond my knowledge. I have just never used RNA-seq for a cross-species comparison.