Hi,
I am collecting RNA-Seq data from NCBI SRA for a specific organism in order to build a de novo transcriptome assembly.
Each data set archived in SRA has a "Library Selection" field, which is usually set to "cDNA", "EST", or "RANDOM". According to this page, the definitions are:
EST - Single pass sequencing of cDNA templates
cDNA - complementary DNA
RANDOM - Random selection by shearing or other method
I think I understand the basic idea behind each strategy, but I'm not sure how sequencing a cDNA library differs from sequencing an EST library in an NGS context, or how either differs from the "RANDOM" approach. Can someone explain?
Bottom line: I'm trying to understand which of these data types are valid input for de novo transcriptome assembly.
Any advice would be appreciated.
Thank you!