Best Transcriptome file for Salmon after STAR alignment
4 days ago

Hi everyone!

I'm analyzing paired-end bulk RNA-seq data. This is my workflow so far:

  • fastp for QC and trimming
  • STAR for alignment to the genome (with --quantMode TranscriptomeSAM)
  • samtools to sort by coordinates and index the transcriptome.bam file generated by STAR
  • umi_tools to deduplicate reads by UMI
  • samtools collate to group mates together (no longer position-sorted) for Salmon
  • Salmon to quantify
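
For readers who want to see the steps above spelled out, here is a minimal sketch that only builds the command lines and prints them; nothing is executed. The file names, the `build_pipeline` helper, the thread count and the `-l A` library type are all placeholders I made up for illustration, and the `umi_tools dedup` call assumes the UMI is already in the read name. Please check every flag against the version of each tool you have installed.

```python
import shlex


def build_pipeline(sample, r1, r2, star_index, transcripts_fa,
                   threads=8, libtype="A"):
    """Return the workflow as a list of argv lists (nothing is run).

    All output names are hypothetical; libtype="A" asks Salmon to
    infer the library type and is only a placeholder.
    """
    t1 = f"{sample}_R1.trim.fq.gz"
    t2 = f"{sample}_R2.trim.fq.gz"
    # STAR writes <prefix>Aligned.toTranscriptome.out.bam when
    # --quantMode TranscriptomeSAM is set.
    tx_bam = f"{sample}.Aligned.toTranscriptome.out.bam"
    sorted_bam = f"{sample}.tx.sorted.bam"
    dedup_bam = f"{sample}.tx.dedup.bam"
    collated_bam = f"{sample}.tx.dedup.collated.bam"
    return [
        ["fastp", "-i", r1, "-I", r2, "-o", t1, "-O", t2],
        ["STAR", "--runThreadN", str(threads), "--genomeDir", star_index,
         "--readFilesIn", t1, t2, "--readFilesCommand", "zcat",
         "--quantMode", "TranscriptomeSAM",
         "--outSAMtype", "BAM", "Unsorted",
         "--outFileNamePrefix", f"{sample}."],
        ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, tx_bam],
        ["samtools", "index", sorted_bam],
        # Assumes UMIs were already moved into the read names.
        ["umi_tools", "dedup", "--paired", "-I", sorted_bam, "-S", dedup_bam],
        ["samtools", "collate", "-o", collated_bam, dedup_bam],
        # Alignment-based mode: -t is the transcriptome FASTA in question.
        ["salmon", "quant", "-t", transcripts_fa, "-l", libtype,
         "-a", collated_bam, "-o", f"{sample}_salmon"],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("sample1", "s1_R1.fq.gz", "s1_R2.fq.gz",
                              "star_index", "transcripts.fa"):
        print(shlex.join(cmd))
```

The last command is where the transcriptome FASTA comes in: in alignment-based mode Salmon takes it through `-t` next to the BAM given with `-a`.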

My question is about the transcriptome FASTA file I should give to Salmon, given that I mapped to the genome with STAR. Should I use the cDNA FASTA from Ensembl? Or should I run gffread on the same genome FASTA I used for the STAR alignment and give the resulting transcriptome FASTA to Salmon?
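
One way to compare the two candidates is to check which FASTA actually matches the transcriptome BAM. As far as I understand, Salmon's alignment-based mode expects the reference names in the BAM header to be present in the FASTA passed with `-t`, so the sketch below compares the `@SQ` lines of a SAM header (e.g. the text printed by `samtools view -H`) with the FASTA record names and lengths. The helper names and the toy data are hypothetical. A FASTA built with something like `gffread -w transcripts.fa -g genome.fa annotation.gtf` from the same genome and GTF used for the STAR index would be expected to pass; a downloaded cDNA file may not if its IDs carry a version suffix or it covers a different set of transcripts.

```python
def sq_lengths(sam_header_text):
    """Map reference name -> length from the @SQ lines of a SAM header."""
    refs = {}
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        refs[fields["SN"]] = int(fields["LN"])
    return refs


def fasta_lengths(fasta_text):
    """Map record name (first word after '>') -> sequence length."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def compare(sam_header_text, fasta_text):
    """Return (names missing from the FASTA, names with a different length)."""
    refs = sq_lengths(sam_header_text)
    seqs = fasta_lengths(fasta_text)
    missing = sorted(n for n in refs if n not in seqs)
    wrong_len = sorted(n for n in refs if n in seqs and refs[n] != seqs[n])
    return missing, wrong_len


if __name__ == "__main__":
    # Toy data: the second FASTA record has a version suffix the BAM lacks.
    header = "@HD\tVN:1.4\n@SQ\tSN:ENST0001\tLN:8\n@SQ\tSN:ENST0002\tLN:4\n"
    fasta = ">ENST0001 gene=A\nACGTACGT\n>ENST0002.1 cdna\nACGT\n"
    print(compare(header, fasta))
```

If both lists come back empty for a given FASTA, its names and lengths agree with the BAM header; this does not check the sequences themselves.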

Thanks in advance!

fasta salmon star rna-seq alignment • 185 views

It seems the route you want to take might be a bit troublesome. From the Salmon documentation:

Genomic vs. Transcriptomic alignments

Salmon expects that the alignment files provided are with respect to the transcripts given in the corresponding FASTA file. That is, Salmon expects that the reads have been aligned directly to the transcriptome (like RSEM, eXpress, etc.) rather than to the genome (as does, e.g. Cufflinks). If you have reads that have already been aligned to the genome, there are currently 3 options for converting them for use with Salmon. First, you could convert the SAM/BAM file to a FAST{A/Q} file and then use the lightweight-alignment-based mode of Salmon described below. Second, given the converted FAST{A/Q} file, you could re-align these converted reads directly to the transcripts with your favorite aligner and run Salmon in alignment-based mode as described above. Third, you could use a tool like sam-xlate to try and convert the genome-coordinate BAM files directly into transcript coordinates. This avoids the necessity of having to re-map the reads. However, we have very limited experience with this tool so far.

You mapped against the genome, so you can either pick one of those three options or remap everything directly to a transcriptome FASTA.
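
To make the first option from the quoted documentation concrete, here is a rough sketch that prints the commands for turning a genome-coordinate BAM back into paired FASTQ files and quantifying them with Salmon's mapping-based mode. It only builds and prints the command lines. The helper name, all file and directory names and the `-l A` library type are assumptions for illustration, and it presumes the BAM still contains every read you want quantified, including unmapped ones.

```python
import shlex


def genome_bam_to_salmon(genome_bam, transcripts_fa, sample,
                         index_dir="salmon_index", libtype="A"):
    """Build (not run) commands for BAM -> FASTQ -> Salmon mapping mode.

    Every output name here is hypothetical.
    """
    collated = f"{sample}.collated.bam"
    r1 = f"{sample}_R1.fq.gz"
    r2 = f"{sample}_R2.fq.gz"
    return [
        # Bring mates next to each other before extracting pairs.
        ["samtools", "collate", "-o", collated, genome_bam],
        # Write mate 1 / mate 2; discard singletons and unflagged reads.
        ["samtools", "fastq", "-1", r1, "-2", r2,
         "-0", "/dev/null", "-s", "/dev/null", "-n", collated],
        # Index the transcriptome once, then quantify from the reads.
        ["salmon", "index", "-t", transcripts_fa, "-i", index_dir],
        ["salmon", "quant", "-i", index_dir, "-l", libtype,
         "-1", r1, "-2", r2, "-o", f"{sample}_salmon"],
    ]


if __name__ == "__main__":
    for cmd in genome_bam_to_salmon("sample1.genome.bam",
                                    "transcripts.fa", "sample1"):
        print(shlex.join(cmd))
```

With this route the choice of transcriptome FASTA is decoupled from the STAR index, because Salmon maps the reads to whatever transcripts you index.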
