Hi,
I am doing a course project in which I was asked to analyse RNA-seq data. For this analysis I picked two datasets from the GEO database. The difference between these datasets is their read length: Dataset A has a sequence length of 50 and Dataset B has a sequence length of 202. I obtained these values from FastQC.
So I would like to know:
- My aim is to identify differentially expressed genes. Would it be logical to compare genes across these datasets?
- Should I use other software to evaluate sequence length? Also, forgive my ignorance, but does sequence length mean read length?
Thank you for your time, Best,
Tunc.
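On checking sequence length without another tool: a minimal sketch that tallies read lengths directly from FASTQ lines, which can be compared against the "Sequence length" value FastQC reports. It assumes standard 4-line FASTQ records (sequence on the second line of each record, no wrapped sequences); the function name `read_length_counts` and the file name in the usage note are made up for illustration.

```python
from collections import Counter


def read_length_counts(lines):
    """Tally read lengths from an iterable of FASTQ lines.

    Assumes 4-line records: header, sequence, '+', qualities.
    """
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of each record is the sequence
            counts[len(line.strip())] += 1
    return counts


# Tiny in-memory example with two reads of length 4 and 6.
example = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "ACGTAC\n", "+\n", "IIIIII\n",
]
example_counts = read_length_counts(example)
```

For a real file, something like `with gzip.open("sample.fastq.gz", "rt") as fh: read_length_counts(fh)` should work (hypothetical file name, `import gzip` needed). If every read has the same length, the tally has a single key.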
Thank you for your response.
I am planning to use DESeq.
I will basically analyse Dataset A on its own and Dataset B on its own. Then I will look at how well the gene-level results correlate across the two datasets.
For example, if gene X is overexpressed in Dataset A, we would look for the same trend for gene X in the second dataset.
Also, these datasets are biologically related.
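One way the cross-dataset comparison could be sketched: take the per-gene log2 fold changes from each dataset's own analysis and compute a Pearson correlation over the genes present in both. This is only an illustration under assumptions: the inputs are plain dicts mapping gene ID to log2 fold change, and the function name and the values are invented, not taken from either dataset.

```python
import math


def pearson_on_shared_genes(lfc_a, lfc_b):
    """Pearson correlation of log2 fold changes over genes found in both dicts.

    Returns (r, number_of_shared_genes).
    """
    shared = sorted(set(lfc_a) & set(lfc_b))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared genes")
    xs = [lfc_a[g] for g in shared]
    ys = [lfc_b[g] for g in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("fold changes are constant in one dataset")
    return cov / math.sqrt(var_x * var_y), n


# Invented fold changes; g4 and g5 are each missing from one dataset.
dataset_a = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 0.5}
dataset_b = {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g5": 1.0}
r, n_shared = pearson_on_shared_genes(dataset_a, dataset_b)
```

Comparing fold changes rather than raw counts keeps each dataset's read length and library differences inside its own analysis.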
That seems like a reasonable approach. Make sure you use DESeq2 rather than DESeq. Performing the differential expression tests independently on each dataset, and then looking at the intersection of the two sets of results, means that you can be more confident in what you're seeing.
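As a rough sketch of the intersection step, assuming each dataset's DESeq2 results have been exported and loaded as a dict of gene ID to (`log2FoldChange`, `padj`), with `None` where DESeq2 reported `NA`. The function name, the 0.05 cutoff and the example values are illustrative assumptions.

```python
def concordant_hits(res_a, res_b, alpha=0.05):
    """Genes significant in both result sets with the same direction of change.

    res_a, res_b: dict of gene -> (log2_fold_change, adjusted_p_value or None).
    Returns dict of gene -> "up" or "down".
    """
    hits = {}
    for gene in res_a.keys() & res_b.keys():
        lfc_a, padj_a = res_a[gene]
        lfc_b, padj_b = res_b[gene]
        if padj_a is None or padj_b is None:
            continue  # no adjusted p-value available in one dataset
        same_direction = lfc_a * lfc_b > 0
        if padj_a < alpha and padj_b < alpha and same_direction:
            hits[gene] = "up" if lfc_a > 0 else "down"
    return hits


# Invented results: only X is significant and concordant in both.
results_a = {"X": (2.1, 0.001), "Y": (-1.5, 0.01), "Z": (1.0, 0.2), "W": (1.2, 0.01)}
results_b = {"X": (1.8, 0.003), "Y": (1.1, 0.02), "Z": (0.9, 0.01), "W": (0.7, None)}
shared_hits = concordant_hits(results_a, results_b)
```

Requiring the same sign of fold change, not just significance in both, avoids counting genes that change in opposite directions as agreement.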
Thank you very much for your help.