I am new to NGS. I have transcriptome data (NovaSeq) for two conditions, i.e. control and treated plant material. Some details of my run are:
Lane  Sample  Barcode  PF_Clusters   %_of_lane  %_Perfect_barcode  Yield_(Mb)  %_PF_Clusters  %_>=_Q30_bases  Mean_Quality
2     T_1     ATGTCA   12,24,90,463  6.88       97.38              24,743      100            87.83           34.76
2     N_2     CCGTCC   3,66,61,309   2.06       97.55              7,406       100            90.09           35.24
The amount of data we got is not equal: T_1 (Treated) has 24.7 Gb of data while N_2 (Normal) has 7.4 Gb, which is roughly 3:1. I need to normalize this data.
Thank you
What is the aim of the study? Differential expression, detection of isoforms, mutations, transcriptome assembly?
Differential gene expression analysis and detection of isoforms.
Purely as a matter of academic interest: you may want to down-sample (not normalize) the data in this case so that the datasets become equivalent. You can only down-sample the larger dataset.
You could do so using reformat.sh from the BBMap suite (look at the sampling options) or seqtk sample.
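To make the arithmetic concrete, here is a minimal Python sketch of how the sampling fraction could be worked out from the PF_Clusters column above, plus a toy seeded subsampler that shows the idea behind down-sampling paired files. It assumes PF_Clusters can be treated as the read (pair) count per sample; the function names parse_count, sampling_fraction and subsample_fastq are made up for illustration and are not part of BBMap or seqtk.

```python
import random


def parse_count(text):
    """Parse a count written with digit-group commas, e.g. '3,66,61,309'."""
    return int(text.replace(",", ""))


def sampling_fraction(larger, smaller):
    """Fraction of the larger library to keep so both have similar read counts."""
    return smaller / larger


def subsample_fastq(lines, fraction, seed=100):
    """Keep each 4-line FASTQ record with probability `fraction`.

    Hypothetical helper for illustration only. Running it with the same seed
    on R1 and R2 keeps mates paired, provided both files list the reads in
    the same order.
    """
    rng = random.Random(seed)
    kept = []
    n_complete = len(lines) - len(lines) % 4  # ignore a trailing partial record
    for i in range(0, n_complete, 4):
        if rng.random() < fraction:
            kept.extend(lines[i:i + 4])
    return kept


# PF_Clusters values copied from the run summary in the question.
t1 = parse_count("12,24,90,463")  # T_1 (treated), the larger library
n2 = parse_count("3,66,61,309")   # N_2 (normal), the smaller library

fraction = sampling_fraction(t1, n2)
print(f"keep about {fraction:.4f} of the T_1 reads")
```

In practice the dedicated tools are the better choice for files of this size. With seqtk sample, for instance, the seed is set with -s and should be the same for both mates of a pair; check each tool's help text for the exact sampling options before running it.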