Question

RNA-Seq read alignment genomic and transcriptomic

4.1 years ago

imaparna27 ▴ 20

Hi, I am trying to align my RNA-seq datasets to the human reference genome and transcriptome. However, I am not sure whether this is needed and, if so, what the purpose of the --exon parameter is in the HISAT2 index-building command. Also, how can I build an index for the human transcriptome (using cDNA sequences for the index) and align my reads to it with HISAT2?

Any help would be appreciated. Thanks.

RNA-Seq alignment hisat2 • 1.2k views

4.0 years ago by imaparna27 ▴ 20

Hi, I have not used HISAT2, but I aligned my data with STAR. It is an RNA-seq aligner, but it requires 16 GB of RAM or more (the STAR manual recommends about 32 GB for the human genome). The first step is generating the genome index. Assuming it is human data, download the reference genome together with its annotation; I highly recommend the GENCODE database: https://www.gencodegenes.org

STAR --runThreadN 8 --runMode genomeGenerate --genomeDir 'pathToYourGenome' --genomeFastaFiles 'PathToYourFastaFile' --sjdbGTFfile 'PathToYourGTFfile' --sjdbOverhang 100

* genomeDir: the directory where the indexed genome will be placed.
* sjdbOverhang: the length of the genomic sequence around annotated junctions used to construct the splice junction database. It corresponds to the read length of your RNA-seq data (the STAR manual gives read length minus 1 as the ideal value); a value of 100 works in most cases, though not always.
* This process took around 30 minutes for me; it depends on the machine specifications.
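Since sjdbOverhang is tied to the read length, it can help to measure that length from the data rather than guess. The following is a minimal sketch, assuming an uncompressed FASTQ file with four lines per record; the function names `max_read_length` and `sjdb_overhang` are hypothetical helpers, not part of STAR.

```shell
# Hypothetical helper: print the longest read length among the first N records
# of an uncompressed FASTQ file (default N = 10000).
max_read_length() {
    fastq=$1
    n_reads=${2:-10000}
    # In a 4-line FASTQ record, the sequence is the 2nd line of each record.
    awk -v n="$n_reads" '
        NR % 4 == 2 { if (length($0) > max) max = length($0) }
        NR >= 4 * n { exit }
        END         { print max + 0 }
    ' "$fastq"
}

# Hypothetical helper: suggest --sjdbOverhang as (longest read length - 1).
sjdb_overhang() {
    len=$(max_read_length "$1" "${2:-10000}")
    echo $((len - 1))
}

# Example use (file name is a placeholder):
#   STAR --runMode genomeGenerate ... --sjdbOverhang "$(sjdb_overhang sample_R1.fastq)"
```

For gzipped reads, decompress to a temporary file first (or adapt the helper to read from a pipe); the sketch does not handle compression.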

The second step is mapping. For paired-end (PE) reads:

STAR --runThreadN 8 --runMode alignReads --genomeDir 'pathToYourIndexedGenome' --readFilesIn 'pathToYourReadR1' 'pathToYourReadR2' --outSAMtype BAM SortedByCoordinate

* For advanced options, see the STAR manual: https://physiology.med.cornell.edu/faculty/skrabanek/lab/angsd/lecture_notes/STARmanual.pdf
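The paired-end mapping step can be wrapped so that the two mate files are always passed as separate arguments. This is a sketch under assumptions: the reads are gzipped (hence `--readFilesCommand zcat`), and `run`, `star_align_pe`, `DRY_RUN`, and all file names are hypothetical. By default it only prints the command; set `DRY_RUN=0` to actually execute STAR.

```shell
# Print a command; execute it only when DRY_RUN=0 (the default just prints).
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" != "1" ]; then "$@"; fi
}

# Hypothetical wrapper. Arguments: index_dir R1.fastq.gz R2.fastq.gz output_prefix
star_align_pe() {
    run STAR --runThreadN 8 \
        --runMode alignReads \
        --genomeDir "$1" \
        --readFilesIn "$2" "$3" \
        --readFilesCommand zcat \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix "$4"
}

# Example use (paths are placeholders):
#   DRY_RUN=0 star_align_pe star_index sample_R1.fastq.gz sample_R2.fastq.gz results/sample_
```

Drop `--readFilesCommand zcat` if the FASTQ files are not compressed.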

Now you are done with STAR and can proceed with GATK for alignment post-processing and variant calling. I highly recommend following the GATK best practices: https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels-

I hope this helps!

4.1 years ago by a.alnawfal.1992 ▴ 360

@a.alnawfal.1992 Thanks, your suggestion has been very helpful. I am now getting hands-on with STAR.

However, I was able to solve it with HISAT2 as well. I downloaded the cDNA FASTA files from Ensembl and built the index using hisat2-build (the same way as for a genome index). Transcriptome assembly is an important step before quantification when differential expression is the goal.
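For anyone landing here later, the two HISAT2 routes can be sketched roughly as follows. This is a sketch under assumptions, not a tested pipeline: all file names, the function names, and the `DRY_RUN` switch are hypothetical. On the original --exon question: per the HISAT2 manual, --exon is passed to hisat2-build together with --ss so that annotated exons and splice sites are built into a genome index; the manual also warns that building with these options needs much more memory than a plain index. For a cDNA index there are no introns, so the sketch aligns with --no-spliced-alignment.

```shell
# Print a command; execute it only when DRY_RUN=0 (the default just prints).
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" != "1" ]; then "$@"; fi
}

# Same as run, but redirect the command's stdout to the file given first.
run_to() {
    out=$1; shift
    printf '+ %s > %s\n' "$*" "$out"
    if [ "${DRY_RUN:-1}" != "1" ]; then "$@" > "$out"; fi
}

# Transcriptome route. Arguments: cdna.fa index_base R1 R2 out.sam
hisat2_transcriptome() {
    run hisat2-build -p 8 "$1" "$2" || return 1
    # cDNA sequences contain no introns, so spliced alignment is turned off.
    run hisat2 -p 8 --no-spliced-alignment -x "$2" -1 "$3" -2 "$4" -S "$5"
}

# Genome route with annotation. Arguments: genome.fa annotation.gtf index_base
hisat2_genome_index() {
    # Extract splice sites and exons from the GTF in HISAT2's own format.
    run_to "$3.ss" hisat2_extract_splice_sites.py "$2" || return 1
    run_to "$3.exon" hisat2_extract_exons.py "$2" || return 1
    run hisat2-build -p 8 --ss "$3.ss" --exon "$3.exon" "$1" "$3"
}

# Example use (paths are placeholders):
#   DRY_RUN=0 hisat2_transcriptome cdna.fa tx_index R1.fastq R2.fastq tx.sam
```

Running the functions without `DRY_RUN=0` prints the planned commands, which makes it easy to review them before committing to a long index build.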

4.0 years ago by imaparna27 ▴ 20