Hi, I have previously used STAR for RNA-seq and wanted to try salmon. The mapping rate is rather low, about 40% for every sample. The total RNA was extracted from whole blood and rRNA-depleted. Could the 40% mapping rate be due to the rRNA depletion? I have tried lowering k and using a different reference file, but neither changed the mapping rate much. The FastQC reports for the data look fine.
Some of my code is below:
TRANSCRIPTOME=$RESOURCEDIR/gencode.v46.transcripts.fa.gz
GENOME=$RESOURCEDIR/GRCh38.p14.genome.fa.gz
echo "Creating decoy file"
grep "^>" <(gunzip -c $GENOME) | cut -d " " -f 1 > decoys_gencode.txt
sed -i.bak -e 's/>//g' decoys_gencode.txt
echo "Concatenating transcriptome and genome"
cat $TRANSCRIPTOME $GENOME > gentrome_gencode.fa.gz
# Index the reference fasta
echo "Indexing"
salmon index -t gentrome_gencode.fa.gz -d decoys_gencode.txt -k 31 -p 12 -i salmon_index_gencode --gencode
# Alignment
for fn in *_trimmed_R1_001.fastq.gz;
do
    # Strip the read-1 suffix to get the sample name
    SAMPLE=$(basename "${fn}" _trimmed_R1_001.fastq.gz)
    INDEX=$RESOURCEDIR/salmon_index_gencode/
    echo "Processing sample ${SAMPLE}"
    salmon quant --index "$INDEX" --libType A \
        -1 "${SAMPLE}_trimmed_R1_001.fastq.gz" \
        -2 "${SAMPLE}_trimmed_R2_001.fastq.gz" \
        --threads 12 --validateMappings --output "quants/${SAMPLE}_quant"
done
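In case the per-sample numbers help with diagnosis, here is a rough sketch for pulling them out of the quant directories. It assumes each output directory contains `aux_info/meta_info.json` with `percent_mapped` and `num_decoy_fragments` fields, which is what I see in recent salmon versions but should be checked against your own output. The function name `summarise_mapping` is just something I made up for illustration.

```shell
# Print "sample<TAB>percent_mapped<TAB>num_decoy_fragments" for every
# quant directory under the given root (e.g. "quants").
# Assumption: salmon wrote <root>/<sample>/aux_info/meta_info.json and
# the JSON has one "key": value pair per line.
summarise_mapping() {
    quant_root=$1
    for meta in "$quant_root"/*/aux_info/meta_info.json; do
        # Skip the unexpanded glob when no quant directories exist
        [ -f "$meta" ] || continue
        sample=$(basename "$(dirname "$(dirname "$meta")")")
        pct=$(sed -n 's/.*"percent_mapped": *\([0-9.eE+-]*\).*/\1/p' "$meta")
        decoy=$(sed -n 's/.*"num_decoy_fragments": *\([0-9]*\).*/\1/p' "$meta")
        printf '%s\t%s\t%s\n' "$sample" "$pct" "$decoy"
    done
}
```

Calling `summarise_mapping quants` after the loop above should give one line per sample. A large decoy count relative to the total would suggest that many reads come from unannotated or intronic sequence rather than failing to map at all.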
If anyone could share some insight, and say whether I can just proceed to DESeq2, that would be greatly appreciated.