4.0 years ago
imaparna27
Hi,
I am trying to align my RNA-seq datasets to the human reference genome and transcriptome. However, I am not sure whether the --exon parameter in the HISAT2 index-building command is needed and, if so, what its purpose is.
Also, how can I build an index for the human transcriptome (using cDNA for the index build) and align my reads to it with HISAT2?
Any help would be appreciated. Thanks.
Hi, I have not used HISAT2, but I aligned my data using STAR. It is an RNA-seq aligner, but it requires 16 GB of RAM or more (the STAR manual suggests about 32 GB for a human genome). The first step is generating the genome index. Assuming it is human data, download the reference genome together with its annotation; I highly recommend the GENCODE database: https://www.gencodegenes.org
* --genomeDir is where the indexed genome is placed.
* --sjdbOverhang specifies the length of the genomic sequence around annotated junctions used to construct the splice junction database. It corresponds to the read length of your RNA-seq data, ideally read length minus 1 (a value of 100 will usually, but not always, work).
* This process took around 30 minutes for me; it depends on the machine specifications.
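The index-generation step described above could be sketched roughly as follows. This is only a sketch under assumptions: the file names, thread count, and read length are placeholders, not values from this thread, so substitute your own. The STAR call is guarded so that nothing runs if STAR or the input files are missing.

```shell
# Placeholder names (assumptions); replace them with your own files.
GENOME_DIR="star_index"                           # where the indexed genome is placed
GENOME_FASTA="GRCh38.primary_assembly.genome.fa"  # reference FASTA, e.g. from GENCODE
ANNOTATION_GTF="gencode.annotation.gtf"           # matching annotation GTF
READ_LENGTH=101                                   # length of your reads (assumed here)
THREADS=8

# Ideal overhang is read length minus 1.
SJDB_OVERHANG=$((READ_LENGTH - 1))

if command -v STAR >/dev/null 2>&1 && [ -f "$GENOME_FASTA" ] && [ -f "$ANNOTATION_GTF" ]; then
    mkdir -p "$GENOME_DIR"
    STAR --runThreadN "$THREADS" \
         --runMode genomeGenerate \
         --genomeDir "$GENOME_DIR" \
         --genomeFastaFiles "$GENOME_FASTA" \
         --sjdbGTFfile "$ANNOTATION_GTF" \
         --sjdbOverhang "$SJDB_OVERHANG"
else
    echo "STAR or the input files were not found; nothing was run." >&2
fi
```

Both the FASTA and the GTF need to be uncompressed for this step.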
The second step is mapping, here for paired-end (PE) reads.
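A paired-end mapping run might look something like the sketch below. The FASTQ names, output prefix, and thread count are hypothetical placeholders, and the output options (sorted BAM) are one reasonable choice rather than the only one. The call is skipped if STAR, the index, or the reads are missing.

```shell
# Placeholder names (assumptions); replace them with your own files.
GENOME_DIR="star_index"          # index directory built in the first step
READS_1="sample_R1.fastq.gz"     # mate 1
READS_2="sample_R2.fastq.gz"     # mate 2
OUT_PREFIX="sample_"
THREADS=8

# With the options below, STAR names the sorted BAM like this.
EXPECTED_BAM="${OUT_PREFIX}Aligned.sortedByCoord.out.bam"

if command -v STAR >/dev/null 2>&1 && [ -d "$GENOME_DIR" ] && [ -f "$READS_1" ] && [ -f "$READS_2" ]; then
    # --readFilesCommand zcat is only needed for gzipped FASTQ
    # (on some systems "gunzip -c" is the safer choice).
    STAR --runThreadN "$THREADS" \
         --genomeDir "$GENOME_DIR" \
         --readFilesIn "$READS_1" "$READS_2" \
         --readFilesCommand zcat \
         --outSAMtype BAM SortedByCoordinate \
         --outFileNamePrefix "$OUT_PREFIX"
    echo "Alignment written to $EXPECTED_BAM"
else
    echo "STAR, the index, or the reads were not found; nothing was run." >&2
fi
```

For single-end data, pass only one file to --readFilesIn.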
* For advanced options, see the STAR manual: https://physiology.med.cornell.edu/faculty/skrabanek/lab/angsd/lecture_notes/STARmanual.pdf
Now you are done with STAR and can proceed to GATK for alignment post-processing and variant calling. I highly recommend following the GATK best practices: https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels-
I hope this helps!
@a.alnawfal.1992 Thanks, your suggestion has been very helpful. I am now getting hands-on with STAR.
However, I was able to solve it with HISAT2 as well. I just downloaded the cDNA FASTA files from Ensembl and built the index using hisat2-build (the same way as for genome index building). Transcriptome assembly is an important step before quantification when differential expression is the goal.
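For anyone landing here later, the cDNA index build and alignment could be sketched as below. This is not necessarily the exact set of commands I ran: the read file names, index base name, and thread count are placeholders, and --no-spliced-alignment is my assumption that spliced alignment is not wanted when the reference is already spliced cDNA.

```shell
# Placeholder names (assumptions); replace them with your own files.
CDNA_FASTA="Homo_sapiens.GRCh38.cdna.all.fa"   # Ensembl cDNA FASTA, decompressed
INDEX_BASE="grch38_cdna"                       # base name for the index files
READS_1="sample_R1.fastq.gz"
READS_2="sample_R2.fastq.gz"
OUT_SAM="sample_cdna.sam"
THREADS=8

if command -v hisat2-build >/dev/null 2>&1 && command -v hisat2 >/dev/null 2>&1 \
   && [ -f "$CDNA_FASTA" ] && [ -f "$READS_1" ] && [ -f "$READS_2" ]; then
    # Build the index from the cDNA FASTA, same call as for a genome FASTA.
    hisat2-build -p "$THREADS" "$CDNA_FASTA" "$INDEX_BASE"

    # Align paired-end reads to the transcriptome index.
    hisat2 -p "$THREADS" --no-spliced-alignment \
           -x "$INDEX_BASE" \
           -1 "$READS_1" -2 "$READS_2" \
           -S "$OUT_SAM"
else
    echo "HISAT2 or the input files were not found; nothing was run." >&2
fi
```

No --exon or --ss files are passed here, since a cDNA reference has no introns to annotate.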