Bulk RNA-seq Salmon index building: which transcriptome to use?
12 months ago
Orange ▴ 30

Hi all, I am new to the platform.

I was wondering what the common/best practice is regarding building a Salmon index for bulk RNAseq analysis of human cells.

The Salmon/Alevin tutorial uses the complete transcriptome from GENCODE (gencode.vM23.transcripts.fa.gz, which is a mouse release; the lack of "pc_" in the file name shows that it is the full transcript set rather than the protein-coding subset), which presumably contains rRNA sequences.

However, other tutorials use a transcriptome from Ensembl that seems to contain only cDNA sequences (Homo_sapiens.GRCh38.cdna.all.fa.gz).

My research question is rather simple: I am looking for DEGs between treatments, so I do not expect to focus on differences in ncRNAs. Also, the "overrepresented sequences" and "per base GC content" results from FastQC are consistent with a small amount of rRNA contamination (I am currently running Salmon to see whether the rRNA levels are similar across all samples).

I wonder which of the following is the common/best practice, or whether there is a better approach:

  1. Build a full decoy-aware index using the full transcriptome (as in the Salmon/Alevin tutorial)
  2. Build a full decoy-aware index using only protein-coding transcript sequences (GENCODE) or cDNA (Ensembl)
  3. Build a full decoy-aware index using the full transcriptome, but remove the rRNA sequences from the transcriptome and add them to the decoy sequences
  4. Same as #1, but preceded by rRNA read removal with BBDuk or SortMeRNA

Thanks in advance for your help!

Salmon RNA-seq • 982 views

Generally speaking, you'll want to provide the full transcriptome plus the genome as decoy, as per your point 1.
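
For reference, here is a rough Python sketch of how the inputs for option 1 could be prepared. It follows the usual decoy recipe: collect the genome sequence names into a decoy list, concatenate the transcriptome followed by the genome into a "gentrome", and pass both to `salmon index`. The function names, file names and thread count are placeholders I made up for illustration, and the `--gencode` flag assumes GENCODE-style headers, so adjust to your own setup.

```python
import gzip
import shutil
from pathlib import Path


def decoy_names(fasta_lines):
    """Return the first word of every '>' header line (the sequence name)."""
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def build_gentrome(transcripts_gz, genome_gz, out_dir):
    """Write decoys.txt and gentrome.fa.gz (transcripts first, genome second)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    decoys = out_dir / "decoys.txt"
    gentrome = out_dir / "gentrome.fa.gz"
    # Decoy list: one genome sequence name per line.
    with gzip.open(genome_gz, "rt") as handle:
        decoys.write_text("\n".join(decoy_names(handle)) + "\n")
    # Concatenated gzip members are still a valid gzip file,
    # so the two archives can be joined byte for byte.
    with open(gentrome, "wb") as out:
        for src in (transcripts_gz, genome_gz):
            with open(src, "rb") as fh:
                shutil.copyfileobj(fh, out)
    return decoys, gentrome


def salmon_index_command(gentrome, decoys, index_dir, threads=8):
    """Assemble the salmon index call as an argument list (not executed here)."""
    return [
        "salmon", "index",
        "-t", str(gentrome),   # transcripts followed by decoy sequences
        "-d", str(decoys),     # names of the decoy sequences
        "-i", str(index_dir),  # output index directory
        "-p", str(threads),
        "--gencode",           # assumes GENCODE headers; trims names at the first '|'
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once Salmon is on your PATH. The order matters: the decoy sequences have to come after the transcripts in the combined file.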


Thanks for your reply! I proceeded with #1. After quantification, I looked at the TPM assigned to rRNA, rRNA pseudogene and MT-rRNA transcripts, and they were all minimal (<0.5% of the total TPM, which sums to 1e6 per sample). Although there were slight differences among treatments, I don't think this will be an issue.
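
In case it helps anyone doing the same check, this is roughly how the rRNA share of TPM can be computed per sample. It is only a sketch: it assumes GENCODE-style FASTA headers where the biotype is the eighth "|"-separated field, assumes the biotype labels are `rRNA`, `rRNA_pseudogene` and `Mt_rRNA`, and reads the `Name` and `TPM` columns of Salmon's `quant.sf`. The function names are my own.

```python
import csv

# Assumed GENCODE biotype labels for ribosomal RNA categories.
RRNA_BIOTYPES = {"rRNA", "rRNA_pseudogene", "Mt_rRNA"}


def biotype_by_transcript(header_lines):
    """Map transcript ID -> biotype from GENCODE-style '>' header lines."""
    mapping = {}
    for line in header_lines:
        if not line.startswith(">"):
            continue
        fields = line[1:].strip().split("|")
        # Assumed layout: ID|gene ID|...|length|biotype|
        if len(fields) >= 8:
            mapping[fields[0]] = fields[7]
    return mapping


def rrna_tpm_fraction(quant_sf_lines, biotypes):
    """Fraction of total TPM assigned to transcripts with an rRNA biotype."""
    reader = csv.DictReader(quant_sf_lines, delimiter="\t")
    total = 0.0
    rrna = 0.0
    for row in reader:
        tpm = float(row["TPM"])
        total += tpm
        # Names may be full GENCODE headers or already trimmed at the first '|'.
        transcript_id = row["Name"].split("|")[0]
        if biotypes.get(transcript_id) in RRNA_BIOTYPES:
            rrna += tpm
    return rrna / total if total else 0.0
```

Both functions accept any iterable of lines, so an open file handle works, for example `rrna_tpm_fraction(open("quant.sf"), biotypes)`. Running it on each sample makes it easy to compare the rRNA fraction across treatments.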
