Hello Everyone,
I made BAM files with hisat2 and am now trying to use them for salmon's alignment-based quantification.
hisat2-build genome.fa hisat
hisat2 -x hisat -1 R1_paired.fastq.gz -2 R2_paired.fastq.gz -S out.sam
samtools view -bS out.sam > out.bam
java -Xms1g -Xmx3g -jar picard.jar MarkDuplicates \
I=out.bam \
O=marked_duplicates.bam \
M=metrics.txt
salmon quant -t transcripts.fasta -l A -a marked_duplicates.bam -o out.sf
But it gives an error:
Please provide a reference FASTA file that includes all targets present in the BAM header
If you have access to the genome FASTA and GTF used for alignment
consider generating a transcriptome fasta using a command like:
gffread -w output.fa -g genome.fa genome.gtf
Thank you for any help!
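The error appears because the BAM above was aligned to the genome, so its header lists chromosome names, while transcripts.fasta contains transcript IDs. One way to confirm this is to compare the two sets of names. The snippet below is only a sketch: `missing_targets` and `header.txt` are hypothetical names, and it assumes the header was dumped to a text file first with `samtools view -H`.

```shell
#!/usr/bin/env bash
# Print every @SQ target name from a BAM header dump that has no
# matching sequence name in a FASTA file.
#   $1: header text, e.g. from: samtools view -H marked_duplicates.bam > header.txt
#   $2: FASTA file, e.g. transcripts.fasta
missing_targets() {
  awk -F'\t' '
    # First file (FASTA): remember the first word after ">" on each header line.
    FNR == NR {
      if (substr($0, 1, 1) == ">") {
        split(substr($0, 2), parts, /[ \t]/)
        seen[parts[1]] = 1
      }
      next
    }
    # Second file (BAM header): report SN: names that the FASTA lacks.
    $1 == "@SQ" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) {
          name = substr($i, 4)
          if (!(name in seen)) print name
        }
      }
    }
  ' "$2" "$1"
}

# Example (not run here):
#   samtools view -H marked_duplicates.bam > header.txt
#   missing_targets header.txt transcripts.fasta
```

For a genome-aligned BAM this should list the chromosome names, which is exactly what salmon is complaining about.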
Thank you! Can --tmo/--transcriptome-mapping-only from hisat2 do the same thing as --quantMode TranscriptomeSAM from STAR?
No, this will simply prevent HISAT2 from aligning reads outside of annotated transcripts. These will still be spliced alignments in genomic coordinates. As far as I know, HISAT2 has no option to project genomic alignments to transcriptomic coordinates, so the only way to accomplish this with HISAT2 would be to align the reads directly against the transcriptome rather than the genome.
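Aligning directly against the transcriptome could look roughly like the sketch below. It reuses the file names from the question; `hisat_tx`, `tx.sam`, `tx.bam` and `salmon_out` are placeholder names, and `-k 50` is an arbitrary value chosen here so that reads matching several isoforms are reported more than once. The `run` helper only prints the commands unless `RUN=1` is set. The BAM is deliberately left unsorted and without duplicate marking, since as far as I know salmon expects the alignments of a read to be adjacent in the file.

```shell
#!/usr/bin/env bash
# Dry run by default: each command is printed. Set RUN=1 to execute.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

transcriptome_pipeline() {
  # Index the transcript sequences instead of the genome.
  run hisat2-build transcripts.fasta hisat_tx
  # Transcripts contain no introns, so spliced alignment is switched off.
  run hisat2 -x hisat_tx --no-spliced-alignment -k 50 \
    -1 R1_paired.fastq.gz -2 R2_paired.fastq.gz -S tx.sam
  # Convert to BAM without sorting.
  run samtools view -b -o tx.bam tx.sam
  # Alignment-based quantification against the same transcript FASTA.
  run salmon quant -t transcripts.fasta -l A -a tx.bam -o salmon_out
}

transcriptome_pipeline
```

Because the BAM header now lists transcript names, the same transcripts.fasta can be passed to `salmon quant -t`.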
Thank you! I am now aligning the reads against the transcriptome. The .bam files are created, but I saw this error in the log file.
Are these .bam files okay to use further?
Why do you even align the data? I already suggested in a previous thread that you run salmon directly on the FASTQ files. There is no need for a separate alignment step if you quantify with salmon.
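For reference, running salmon on the FASTQ files directly could look like the sketch below. It uses the file names from the original post; `salmon_index` and `salmon_out` are placeholder names. As written it only prints the two commands; set `RUN=1` to actually execute them.

```shell
#!/usr/bin/env bash
# Dry run by default: each command is printed. Set RUN=1 to execute.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

salmon_direct() {
  # Build a salmon index from the transcript sequences (done once).
  run salmon index -t transcripts.fasta -i salmon_index
  # Quantify straight from the paired-end reads; -l A lets salmon
  # infer the library type.
  run salmon quant -i salmon_index -l A \
    -1 R1_paired.fastq.gz -2 R2_paired.fastq.gz -o salmon_out
}

salmon_direct
```

No hisat2, samtools or Picard step is involved in this route.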