I am planning a transcriptome alignment of iCLIP sequencing data. How can I link UCSC hg19 transcript IDs with transcript features, e.g. coding, non-coding, lncRNA, antisense, pseudogene ...?
I downloaded the iGenomes UCSC hg19 reference genome and used the genes.gtf file included in the download to prepare reference sequences for RSEM:
rsem-prepare-reference --gtf genes.gtf --bowtie2 genome.fa ../RSEM_bowtie2/genome
The above command generates a set of files, one of which is genome.transcripts.fa, a FASTA file containing 51398 sequences, one for each transcript (NM_130786, NR_015380, NM_001198818 ...) as defined in the genes.gtf file.
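The only thing that can be read directly off these IDs is the RefSeq accession prefix (NM_ for mRNA, NR_ for non-coding RNA, with XM_/XR_ as the predicted counterparts). Below is a minimal Python sketch of that coarse split, assuming the FASTA headers start with the transcript ID as in the examples above. The function names and class labels are made up for illustration, and this does not separate lncRNA, antisense or pseudogene transcripts, which is the part I am missing.

```python
# Coarse classes based on the NCBI RefSeq accession prefix convention.
# The labels are illustrative; they do NOT resolve lncRNA, antisense
# or pseudogene transcripts.
REFSEQ_PREFIX_CLASS = {
    "NM_": "coding",                 # curated mRNA
    "NR_": "non-coding",             # curated non-coding RNA
    "XM_": "coding (predicted)",     # model mRNA
    "XR_": "non-coding (predicted)"  # model non-coding RNA
}


def read_transcript_ids(fasta_path):
    """Return the first whitespace-delimited token of every FASTA header."""
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip empty header lines
                    ids.append(fields[0])
    return ids


def classify_by_prefix(transcript_id):
    """Map a transcript ID to a coarse class using its 3-character prefix."""
    return REFSEQ_PREFIX_CLASS.get(transcript_id[:3], "unknown")
```

Calling read_transcript_ids("genome.transcripts.fa") and passing each ID to classify_by_prefix would give a coding / non-coding label per transcript, but nothing finer than that.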
Once I have performed the transcriptome alignment using rsem-calculate-expression, how can I then link each transcript ID with transcript features such as the ones mentioned above?
Any ideas would be helpful.