Question

RNA-Seq: Getting Started with Kallisto

5.5 years ago

arussell3483 ▴ 30

Hello,

I am relatively new to bioinformatics and RNA-sequencing and am working on developing a workflow for my sequencing project. I am planning to use Kallisto and Sleuth as part of the analysis, but I am not sure how to get started. Will the quality control and trimming take place before I run the fasta file through kallisto? If so, does this step also take place in the Linux terminal?

Thanks!

RNA-Seq • 9.6k views

Updated 5.5 years ago by Lior Pachter ▴ 700 • written 5.5 years ago by arussell3483 ▴ 30

score 3 · Answer 1 · 2019-07-11

Your reads would be in FASTQ format, not FASTA, though you will possibly have a transcriptome/genome reference in FASTA format.
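If you are not sure which kind of file you have, the first character is usually enough to tell: FASTA records begin with a `>` header line, while FASTQ records begin with an `@` header line and also carry per-base quality scores. Below is a minimal Python sketch of that check; the function name `sniff_sequence_format` is made up for illustration and is not part of kallisto or any other tool.

```python
import gzip


def sniff_sequence_format(path):
    """Guess 'fasta' or 'fastq' from the first non-empty line of a file.

    Hypothetical helper for illustration only. It looks at the first
    character of the first non-empty line and handles plain or gzipped text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            return "unknown"
    return "unknown"  # empty file
```

Calling it on your read files should return `"fastq"`, and on the reference transcriptome `"fasta"`.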

A typical workflow, I would say, is something like:

1. Start by getting some FASTQ files and keep them in separate directories per sample (this is easily done with terminal/bash commands, so it is a good opportunity to get familiar with them).
2. Run a QC tool (like FastQC + MultiQC) on the raw data to determine whether you need to apply trimming.
3. If needed, apply trimming and re-run the QC tools.
4. Align your reads to a reference (giving you a BAM file) using an aligner (some quantification tools, like RSEM, handle this step implicitly). Kallisto (which I haven't really used) can perform quantification without alignment, as described here: https://pachterlab.github.io/kallisto/starting.html
5. Quantify (for example, by running Kallisto or RSEM).
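As a rough illustration of the QC and kallisto steps, here is a Python sketch that only builds the command lines and prints them, so nothing is executed until you choose to run them yourself. The directory layout, file names, bootstrap count and thread count are all assumptions made up for the example; the sketch also assumes paired-end reads and that the output directories already exist.

```python
import shlex


def fastqc_cmd(fastqs, outdir):
    """FastQC on one or more FASTQ files; outdir must already exist."""
    return ["fastqc", "-o", str(outdir), *[str(f) for f in fastqs]]


def multiqc_cmd(search_dir, outdir):
    """MultiQC summary of every report found under search_dir."""
    return ["multiqc", str(search_dir), "-o", str(outdir)]


def kallisto_index_cmd(transcriptome_fasta, index_path):
    """Build the kallisto index once from the reference transcriptome (FASTA)."""
    return ["kallisto", "index", "-i", str(index_path), str(transcriptome_fasta)]


def kallisto_quant_cmd(index_path, outdir, fastqs, bootstraps=100, threads=4):
    """Quantify one paired-end sample; bootstraps are what sleuth uses later."""
    fastqs = [str(f) for f in fastqs]
    if len(fastqs) != 2:
        raise ValueError("this sketch assumes paired-end reads: pass R1 and R2")
    return [
        "kallisto", "quant",
        "-i", str(index_path),
        "-o", str(outdir),
        "-b", str(bootstraps),
        "-t", str(threads),
        *fastqs,
    ]


if __name__ == "__main__":
    # Hypothetical layout: one directory per sample, as suggested above.
    samples = {
        "sampleA": ["samples/sampleA/sampleA_R1.fastq.gz",
                    "samples/sampleA/sampleA_R2.fastq.gz"],
    }
    commands = [kallisto_index_cmd("reference/transcripts.fa", "reference/transcripts.idx")]
    for name, reads in samples.items():
        commands.append(fastqc_cmd(reads, "qc/raw"))
        commands.append(kallisto_quant_cmd("reference/transcripts.idx",
                                           f"kallisto/{name}", reads))
    commands.append(multiqc_cmd("qc/raw", "qc/multiqc"))
    for cmd in commands:
        print(shlex.join(cmd))  # paste into the terminal, or pass to subprocess.run
```

For single-end reads, `kallisto quant` additionally needs `--single` together with a fragment length (`-l`) and standard deviation (`-s`), which this sketch leaves out.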

Most likely yes, that would all happen on the terminal. I would suggest looking up some tutorials or existing pipelines using Kallisto. This one looks like a good start: https://felixeyegithubio.readthedocs.io/en/latest/rnaseq/labs/kallisto/

score 2 · Answer 2 · 2019-07-11

5.5 years ago

jared.andrews07 ★ 18k

People typically generate counts from FASTQ files rather than FASTA files. Yes, QC and trimming would be done beforehand. Use FastQC to look at all of your FASTQ files. It will tell you if you need to trim any adaptors or if there are any other QC issues. It has both a simple GUI and a command-line version, in case you only have access to a headless Linux server.

Trimming can be done with any number of tools, but Trim Galore is pretty popular and easy to use (and made to work with FastQC). It is run from the command line.
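For a sense of what a Trim Galore call looks like, here is a small Python sketch that assembles the command for one paired-end sample. The file names, output directory and quality cutoff are assumptions for illustration; `--fastqc` asks Trim Galore to run FastQC on the trimmed output, which covers the "re-run QC" step.

```python
import shlex


def trim_galore_cmd(read1, read2, outdir, quality=20, rerun_fastqc=True):
    """Build a Trim Galore command for one paired-end sample.

    All paths here are placeholders; quality is the Phred cutoff used
    for trimming low-quality ends.
    """
    cmd = ["trim_galore", "--paired", "--quality", str(quality), "-o", str(outdir)]
    if rerun_fastqc:
        cmd.append("--fastqc")  # run FastQC again on the trimmed reads
    cmd += [str(read1), str(read2)]
    return cmd


if __name__ == "__main__":
    # Hypothetical sample files, printed rather than executed.
    print(shlex.join(trim_galore_cmd("samples/sampleA/sampleA_R1.fastq.gz",
                                     "samples/sampleA/sampleA_R2.fastq.gz",
                                     "trimmed/sampleA")))
```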

Okay, I will do some more research into FastQC, thank you for clearing that up! I'm not familiar with Trim Galore yet, but I will definitely check it out.

Reply • 5.5 years ago by arussell3483 ▴ 30

score 2 · Answer 3 · 2019-07-12

5.5 years ago

Lior Pachter ▴ 700

See https://github.com/snakemake-workflows/rna-seq-kallisto-sleuth for a useful Snakemake workflow.

Thank you! With this workflow, will I be able to identify differential gene expression? I understand that kallisto quantifies transcript-level abundance. Can I use the methods described in your 2018 paper (Gene-level differential analysis at transcript-level resolution) in addition to this workflow? I will be working with an annotated reference transcriptome, but there is no sequenced genome.

Reply • 5.4 years ago by arussell3483 ▴ 30

Scripts for performing the gene-level differential analysis from the paper you cited are here: https://github.com/pachterlab/aggregationDE

You should be fine with the annotated reference transcriptome, as no genome is needed for this to work.
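To give a feel for the "analyze transcripts first, aggregate to genes second" idea, here is a toy Python sketch that combines transcript-level p-values into one gene-level p-value. Note that it uses Fisher's method, which is the unweighted special case of the Lancaster method; it is a swapped-in simplification for illustration and is not the code from the aggregationDE repository. The transcript-to-gene mapping and the p-values are invented.

```python
import math
from collections import defaultdict


def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom, whose survival function has the closed form
    exp(-X/2) * sum_{i<k} (X/2)^i / i!. Computed in log space for stability.
    """
    pvalues = list(pvalues)
    if not pvalues:
        raise ValueError("need at least one p-value")
    half = -sum(math.log(max(p, 1e-300)) for p in pvalues)  # X / 2
    if half <= 0.0:
        return 1.0  # every p-value was 1
    log_terms = [i * math.log(half) - math.lgamma(i + 1) for i in range(len(pvalues))]
    peak = max(log_terms)
    log_sum = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return min(1.0, math.exp(log_sum - half))


def aggregate_to_genes(transcript_pvalues, transcript_to_gene):
    """Group transcript p-values by gene and combine each group."""
    by_gene = defaultdict(list)
    for transcript, p in transcript_pvalues.items():
        by_gene[transcript_to_gene[transcript]].append(p)
    return {gene: fisher_combine(ps) for gene, ps in by_gene.items()}


if __name__ == "__main__":
    # Invented example: two transcripts of geneA, one of geneB.
    pvals = {"tx1": 0.01, "tx2": 0.20, "tx3": 0.50}
    tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
    for gene, p in aggregate_to_genes(pvals, tx2gene).items():
        print(gene, round(p, 4))
```

Because the mapping from transcripts to genes comes from the transcriptome annotation, nothing in this step requires a genome sequence.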

Reply • 5.4 years ago by Lior Pachter ▴ 700