Question

Alignment with multiple reference genomes using HISAT2

0

4.6 years ago

sp29 ▴ 50

I have 3 reference genomes and I wish to align my FASTQ reads against all 3 of them. I have used hisat2-build to build an individual index for each of them, but I could not find a command to build a single index from multiple genomes.

I have run the following command for alignment:

hisat2 -p 4 -x individual_index --dta --rna-strandness RF -1 paired_1.fastq -2 paired_2.fastq -S aligned.sam

I want to run alignment with all 3 indexes in one go with HISAT2. Also, I cannot use STAR because my system has only 8 GB of RAM.

RNA-Seq Hisat2 alignment assembly next-gen • 2.2k views

ADD COMMENT • link 4.6 years ago by sp29 ▴ 50

score 1 · Answer 1 · 2020-05-08

1

4.6 years ago

GenoMax 147k

"I want to run alignment with all 3 indexes in one go with HISAT2"

I don't think you can do that. You could cat the three genome FASTA files together and index the result as a single large reference, but then you may run into the 8 GB RAM limit on your hardware. This may be one of those cases where finding better hardware is the answer.

Note: You could use bbsplit.sh from the BBMap suite to align against multiple genomes at the same time, but depending on the genomes, 8 GB may not be enough.

ADD COMMENT • link 4.6 years ago by GenoMax 147k
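
The concatenation approach suggested in the answer above could look roughly like the sketch below. The function names, the FASTA file names, and the index name combined_index are hypothetical placeholders, not taken from the thread; the hisat2 options are copied from the command in the question. It assumes hisat2-build and hisat2 are on the PATH.

```shell
#!/usr/bin/env bash

# combine_fasta OUT IN1 IN2 ...
# Concatenate FASTA files into one multi-sequence reference.
# awk re-prints every line, so a missing final newline in one input
# cannot glue its last sequence line to the next file's header.
combine_fasta() {
    local out="$1"
    shift
    : > "$out"
    local f
    for f in "$@"; do
        awk '{ print }' "$f" >> "$out" || return 1
    done
}

# build_and_align FASTA INDEX
# Build one HISAT2 index from the combined FASTA, then align the
# paired reads from the question against it (read file names assumed).
build_and_align() {
    local fasta="$1"
    local index="$2"
    hisat2-build "$fasta" "$index" &&
    hisat2 -p 4 -x "$index" --dta --rna-strandness RF \
        -1 paired_1.fastq -2 paired_2.fastq -S aligned.sam
}
```

With three FASTA files in the working directory, this would be called as `combine_fasta combined.fa virus_A.fa virus_B.fa virus_C.fa` followed by `build_and_align combined.fa combined_index`.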

0

Well, I think it will work: all 3 genomes are viral, so they should take up little space. I will give it a try and update!

ADD REPLY • link 4.6 years ago by sp29 ▴ 50

0

Viral genomes should be no problem. Look into bbsplit.sh, since it has some nice options for how to handle reads that multi-map, both within and across these genomes. Those options are going to be much better than what hisat2 offers.

ADD REPLY • link 4.6 years ago by GenoMax 147k
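
For the bbsplit.sh suggestion, a call might be sketched as follows. The parameter names (in1, in2, ref, basename, ambiguous, ambiguous2) and the % and # placeholders are given as I understand the BBSplit usage text, so treat them as assumptions and check them against the help output of your installed BBMap version. The -Xmx6g memory cap, the output pattern, and the function names are hypothetical choices for an 8 GB machine.

```shell
#!/usr/bin/env bash

# Join arguments with commas; ref= takes a comma-separated list of FASTA files.
join_by_comma() {
    local IFS=,
    echo "$*"
}

# split_reads REF1 REF2 ...
# Bin the paired reads from the question by best-matching reference.
# ambiguous  : handling of reads with several equally good sites overall
# ambiguous2 : handling of reads that map equally well to different references
split_reads() {
    local refs
    refs=$(join_by_comma "$@")
    bbsplit.sh -Xmx6g \
        in1=paired_1.fastq in2=paired_2.fastq \
        ref="$refs" \
        basename=out_%_#.fastq \
        ambiguous=best ambiguous2=toss
}
```

A call such as `split_reads virus_A.fa virus_B.fa virus_C.fa` would then write one pair of output files per reference, with reads that match several genomes equally well discarded.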

0

I am actually new to NGS analysis, and that "cat" technique worked well for me. When extracting read counts from the .bam files, how should I proceed with htseq-count? 1) Should I extract read counts using 3 different GFF/GTF files (one per genome used in the alignment) and then merge them? 2) Or should I just append all 3 GTF/GFF files into one large file and then proceed?

ADD REPLY • link 4.5 years ago by sp29 ▴ 50

1

Assuming your FASTA headers are unique, you may be able to use 3 passes of counting with the three GTF files. Try it out first.

ADD REPLY • link 4.5 years ago by GenoMax 147k
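
The three-pass counting idea could be sketched as below. The function names and output naming are hypothetical. The htseq-count options are assumptions as well: -s reverse is chosen to match the --rna-strandness RF setting used for alignment, -r pos assumes a coordinate-sorted BAM, and the default feature type and ID attribute may need changing (via -t and -i) for viral GTF/GFF files. The first function checks the uniqueness condition mentioned in the reply.

```shell
#!/usr/bin/env bash

# duplicate_fasta_names FASTA1 FASTA2 ...
# Print sequence names (header text up to the first whitespace) that occur
# more than once across the given files. Empty output means all are unique.
duplicate_fasta_names() {
    grep -h '^>' "$@" | sed 's/^>//; s/[[:space:]].*//' | sort | uniq -d
}

# count_per_genome BAM GTF1 GTF2 ...
# Run one htseq-count pass per annotation file against the same BAM,
# writing e.g. virus_A.counts.txt for virus_A.gtf.
count_per_genome() {
    local bam="$1"
    shift
    local gtf
    for gtf in "$@"; do
        htseq-count -f bam -r pos -s reverse "$bam" "$gtf" \
            > "${gtf%.*}.counts.txt" || return 1
    done
}
```

Running `duplicate_fasta_names virus_A.fa virus_B.fa virus_C.fa` first and getting no output suggests the sequence names do not collide, after which `count_per_genome aligned.sorted.bam virus_A.gtf virus_B.gtf virus_C.gtf` would produce one count table per genome.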