RNA-Seq: Getting Started with Kallisto
5.4 years ago
arussell3483 ▴ 30

Hello,

I am relatively new to bioinformatics and RNA-sequencing and am working on developing a workflow for my sequencing project. I am planning to use Kallisto and Sleuth as part of the analysis, but I am not sure how to get started. Will the quality control and trimming take place before I run the fasta file through kallisto? If so, does this step also take place in the Linux terminal?

Thanks!

RNA-Seq • 9.3k views
5.4 years ago

Your reads will be in FASTQ format, not FASTA, although your transcriptome or genome reference will typically be in FASTA format.

A typical workflow, I would say, looks something like this:

  • Start by getting your FASTQ files and placing them in separate directories per sample (this is easily done with terminal/bash commands, and is a good opportunity to get familiar with them).
  • Run a QC tool (such as FastQC + MultiQC) on the raw data to determine whether you need to apply trimming.
  • Apply trimming, re-run QC tools
  • Align your reads to a reference using an aligner, which gives you a BAM file (some quantification tools, like RSEM, can handle this step implicitly). Kallisto (which I haven't really used) can perform quantification without alignment, as described here (https://pachterlab.github.io/kallisto/starting.html).
  • Quantify (for example, running Kallisto, or RSEM)
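
The Kallisto route through the steps above can be sketched in Python as follows. This is a minimal sketch rather than a definitive pipeline: it assumes paired-end reads, and the `fastq/` layout, the `_1.fastq.gz`/`_2.fastq.gz` naming, and the file names `transcripts.fa` and `transcripts.idx` are all hypothetical. The commands are only printed; each list could be passed to `subprocess.run` to actually execute it.

```python
from pathlib import Path
import shlex


def find_paired_samples(fastq_root):
    """Map each sample directory name to its (read1, read2) FASTQ pair.

    Assumes one sub-directory per sample containing exactly one file ending
    in _1.fastq.gz and one ending in _2.fastq.gz (hypothetical naming).
    """
    samples = {}
    for sample_dir in sorted(Path(fastq_root).iterdir()):
        if not sample_dir.is_dir():
            continue
        read1 = sorted(sample_dir.glob("*_1.fastq.gz"))
        read2 = sorted(sample_dir.glob("*_2.fastq.gz"))
        if len(read1) == 1 and len(read2) == 1:
            samples[sample_dir.name] = (read1[0], read2[0])
    return samples


def kallisto_index_cmd(transcriptome_fasta, index_path):
    """Build the index from the transcriptome FASTA (done once per reference)."""
    return ["kallisto", "index", "-i", str(index_path), str(transcriptome_fasta)]


def kallisto_quant_cmd(index_path, out_dir, read1, read2, bootstraps=100):
    """Quantify one paired-end sample; -b requests bootstrap samples, which sleuth uses."""
    return [
        "kallisto", "quant",
        "-i", str(index_path),
        "-o", str(out_dir),
        "-b", str(bootstraps),
        str(read1), str(read2),
    ]


if __name__ == "__main__":
    fastq_root = Path("fastq")  # hypothetical location of the per-sample directories
    print(shlex.join(kallisto_index_cmd("transcripts.fa", "transcripts.idx")))
    if fastq_root.is_dir():
        for name, (r1, r2) in find_paired_samples(fastq_root).items():
            out_dir = Path("kallisto_out") / name
            print(shlex.join(kallisto_quant_cmd("transcripts.idx", out_dir, r1, r2)))
```

Single-end data would need `--single` together with the fragment length options (`-l`, `-s`) instead of a second FASTQ file.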

Most likely, yes: all of that would happen in the terminal. I would suggest looking up some tutorials or existing pipelines that use Kallisto. This one looks like a good start: https://felixeyegithubio.readthedocs.io/en/latest/rnaseq/labs/kallisto/

Thank you for your reply, this gives me some good jumping-off points!

If it helps, upvote the answer!

5.4 years ago

Counts are typically generated from FASTQ files rather than FASTA files. Yes, QC and trimming would be done beforehand. Use FastQC to look at all of your FASTQ files; it will tell you whether you need to trim any adaptors and whether there are any other QC issues. It has both a simple GUI and a command-line version, which helps if you only have access to a headless Linux server.
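
A minimal sketch of the command-line QC step, assuming `fastqc` and `multiqc` are installed and on your `PATH`. The function names and the `qc` output directory are hypothetical; by default the helper only returns the commands, so you can inspect them before running anything.

```python
from pathlib import Path
import subprocess


def fastqc_cmd(fastq_files, out_dir, threads=2):
    """FastQC on one or more FASTQ files; -o is the output directory, -t the thread count."""
    return ["fastqc", "-t", str(threads), "-o", str(out_dir), *[str(f) for f in fastq_files]]


def multiqc_cmd(report_dir, out_dir):
    """MultiQC scans report_dir for tool reports and writes one summary into out_dir."""
    return ["multiqc", str(report_dir), "-o", str(out_dir)]


def run_qc(fastq_files, out_dir, dry_run=True):
    """Return the QC commands; execute them in order only when dry_run is False."""
    commands = [fastqc_cmd(fastq_files, out_dir), multiqc_cmd(out_dir, out_dir)]
    if not dry_run:
        # Create the output directory up front so FastQC can write into it.
        Path(out_dir).mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical file names, printed only (dry run).
    for cmd in run_qc(["sampleA_1.fastq.gz", "sampleA_2.fastq.gz"], "qc"):
        print(" ".join(cmd))
```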

Trimming can be done with any number of tools, but Trim Galore is pretty popular and easy to use (and made to work with FastQC). It is run from the command line.
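
As a rough sketch of what the Trim Galore call could look like for one paired-end sample (the file names and the `trimmed` output directory are hypothetical, and the defaults may not suit every library type):

```python
import shlex


def trim_galore_cmd(read1, read2, out_dir, rerun_fastqc=True):
    """Build a paired-end Trim Galore command.

    --paired keeps the two read files in sync, -o sets the output directory,
    and --fastqc asks Trim Galore to run FastQC on the trimmed files.
    """
    cmd = ["trim_galore", "--paired", "-o", str(out_dir)]
    if rerun_fastqc:
        cmd.append("--fastqc")
    cmd += [str(read1), str(read2)]
    return cmd


if __name__ == "__main__":
    # Printed only; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(trim_galore_cmd("sampleA_1.fastq.gz", "sampleA_2.fastq.gz", "trimmed")))
```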

Okay, I will do some more research into FastQC, thank you for clearing that up! I'm not familiar with Trim Galore yet, but I will definitely check it out.

5.4 years ago
Lior Pachter ▴ 700

See https://github.com/snakemake-workflows/rna-seq-kallisto-sleuth for a useful Snakemake workflow.

Thank you! With this workflow, will I be able to identify differential gene expression? I understand that kallisto quantifies transcript-level abundance. Can I use the methods described in your 2018 paper (Gene-level differential analysis at transcript-level resolution) in addition to this workflow? I will be working with an annotated reference transcriptome, but there is no sequenced genome.

Scripts for performing the gene-level differential analysis from the paper you cited are here (https://github.com/pachterlab/aggregationDE). You should be fine with the annotated reference transcriptome, as no genome is needed for this to work.
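
For illustration only, the general idea of aggregating transcript-level p-values into a single gene-level p-value can be sketched with Fisher's method, which is the unweighted special case of the Lancaster method. This is not the aggregationDE code: the gene names and p-values below are made up, the function name is hypothetical, and the sketch assumes the p-values being combined are independent.

```python
import math


def fisher_combined_pvalue(pvalues, floor=1e-300):
    """Combine independent p-values with Fisher's method.

    Under the null, T = -2 * sum(ln p_i) follows a chi-squared distribution
    with 2k degrees of freedom. For even degrees of freedom the upper tail
    has the closed form exp(-T/2) * sum_{i=0}^{k-1} (T/2)^i / i!.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    # half_t is T/2; p-values are floored to avoid log(0).
    half_t = -sum(math.log(max(p, floor)) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, len(pvalues)):
        term *= half_t / i  # (T/2)^i / i!, built up incrementally
        total += term
    return min(1.0, math.exp(-half_t) * total)


# Hypothetical transcript-level p-values, grouped by gene.
transcript_pvalues = {
    "geneA": [0.04, 0.20, 0.03],
    "geneB": [0.70],
}
gene_pvalues = {
    gene: fisher_combined_pvalue(ps) for gene, ps in transcript_pvalues.items()
}

if __name__ == "__main__":
    for gene, p in gene_pvalues.items():
        print(gene, p)
```

A gene with a single transcript keeps that transcript's p-value, while several moderately small transcript p-values combine into a smaller gene-level one.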
