So, I'm having a little bit of trouble with some subjects.
The first is understanding the importance of aligning the RNA-seq data to its reference genome. I know it can tell me similarities and differences between them. But what does it actually mean?
Second, what can the assembly of transcripts tell me?
Third, what's the importance of orthology analysis?
Fourth, gene ontology and metabolic pathway enrichment: why should we analyze them?
And the last one: what does hierarchical clustering do?
I'm just starting with Bioinformatics, and I'm trying to understand it better.
I don't think this forum is meant to educate people on 5 general questions, all asked in the same post. It would take more than 10 minutes for most people to properly answer what you are asking, and that may be too much. Instead, you can search this site using the keywords from your questions, or even google them.
I'll try to get things going by answering the first two questions, and maybe someone else will pick up the rest.
We can think of the genes in DNA as a collection of recipes. Sequencing genomic DNA tells us about the total inventory of genes within a given organism. Just as most people don't make every single dish in a recipe book, not all genes are expressed at any given time. To answer your second question, assembling RNA transcripts tells us which genes were ON (transcribed, and therefore sequenced when we do RNA-seq) and which ones were OFF (not transcribed at a detectable level, and therefore not found among the RNA-seq contigs). However, the RNA-seq assembly by itself doesn't tell us the expression level of each gene. We can find that out by mapping individual reads to the reference genome, which answers your first question. Generally speaking, the more reads that map to a given gene's region, the more highly that gene is expressed.
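To make the "more reads mapped means more expression" idea concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the gene names, coordinates, and reads are hypothetical, and the functions `count_reads_per_gene` and `counts_per_million` are my own names, not part of any library. In a real analysis an aligner produces the read positions and a dedicated counting tool does this step, so treat this only as a picture of the logic.

```python
# Hypothetical gene annotation: name -> (chromosome, start, end), 1-based inclusive.
GENES = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
    "geneC": ("chr2", 50, 300),
}

# Hypothetical aligned reads: (chromosome, leftmost mapped position, read length).
READS = [
    ("chr1", 120, 50),
    ("chr1", 130, 50),
    ("chr1", 450, 50),
    ("chr1", 900, 50),
    ("chr2", 1000, 50),  # maps outside every annotated gene
]


def count_reads_per_gene(genes, reads):
    """Count how many reads overlap each gene by at least one base."""
    counts = {name: 0 for name in genes}
    for chrom, start, length in reads:
        end = start + length - 1
        for name, (g_chrom, g_start, g_end) in genes.items():
            # Two intervals overlap when each one starts before the other ends.
            if chrom == g_chrom and start <= g_end and end >= g_start:
                counts[name] += 1
    return counts


def counts_per_million(counts, total_mapped):
    """Scale raw counts by sequencing depth so samples can be compared."""
    return {name: n * 1_000_000 / total_mapped for name, n in counts.items()}


counts = count_reads_per_gene(GENES, READS)
cpm = counts_per_million(counts, total_mapped=len(READS))

for name in GENES:
    print(name, counts[name], cpm[name])
```

In this toy data, geneA collects the most reads, geneB fewer, and geneC none, which is the ON/OFF and "how much" picture from the recipe analogy. Note that this sketch only corrects for sequencing depth. Longer genes naturally collect more reads, so real workflows usually also account for gene length before comparing different genes with each other.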