Question

How to merge Kallisto TSV files for use with Tximport

5.5 years ago

Colari19 ▴ 90

I have pseudo-aligned some RNA-Seq reads with Kallisto. Each sample was split across two flow cells.

I pseudo-aligned the FASTQ files for each flow cell separately, so for each sample I have two abundance TSV files that look like this:

target_id           length  eff_length  est_counts  tpm
ENST00000434970.2   9       7           0           0
ENST00000448914.1   13      11          0           0
ENST00000415118.1   8       6           1           0
ENST00000631435.1   12      10          0           0
ENST00000632684.1   12      10          0           0

I would like to merge each sample's two TSV files into one and aggregate the transcript level counts to gene-level counts with tximport in R.

I'm unsure about the best way to approach this. The length column is the same in both TSV files, and the est_counts just need to be added together, but I'm not sure about eff_length and tpm. Does tximport require this information?
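For illustration, here is a minimal Python sketch of the merge described above. It is not something Kallisto or tximport provides, and the function names `read_abundance` and `merge_abundance` are hypothetical. Summing est_counts follows the description above; the other two choices are assumptions: eff_length is taken as the count-weighted mean of the two runs (plain mean when both counts are zero), and tpm is recomputed from the merged counts and effective lengths using the usual definition, 1e6 * (count / eff_length) / sum of all (count / eff_length). To my knowledge tximport reads the est_counts, eff_length and tpm columns for Kallisto input, which is why all three are rebuilt here.

```python
import csv
import io


def read_abundance(handle):
    """Read a Kallisto-style abundance TSV into {target_id: row}."""
    rows = {}
    for r in csv.DictReader(handle, delimiter="\t"):
        rows[r["target_id"]] = {
            "length": float(r["length"]),
            "eff_length": float(r["eff_length"]),
            "est_counts": float(r["est_counts"]),
        }
    return rows


def merge_abundance(a, b):
    """Merge two runs of the same sample (assumes identical target_id sets)."""
    merged = {}
    for tid, ra in a.items():
        rb = b[tid]
        counts = ra["est_counts"] + rb["est_counts"]
        if counts > 0:
            # Assumption: weight each run's effective length by its counts.
            eff = (ra["eff_length"] * ra["est_counts"]
                   + rb["eff_length"] * rb["est_counts"]) / counts
        else:
            # No reads in either run: fall back to the plain mean.
            eff = (ra["eff_length"] + rb["eff_length"]) / 2
        merged[tid] = {"length": ra["length"], "eff_length": eff,
                       "est_counts": counts}

    # Recompute TPM from the merged counts and effective lengths.
    total_rate = sum(m["est_counts"] / m["eff_length"] for m in merged.values())
    for m in merged.values():
        rate = m["est_counts"] / m["eff_length"]
        m["tpm"] = 1e6 * rate / total_rate if total_rate > 0 else 0.0
    return merged


# Tiny made-up example: two flow cells for one sample.
header = "target_id\tlength\teff_length\test_counts\ttpm\n"
fc1 = io.StringIO(header + "T1\t100\t50\t10\t0\nT2\t200\t100\t0\t0\n")
fc2 = io.StringIO(header + "T1\t100\t70\t30\t0\nT2\t200\t120\t0\t0\n")
merged = merge_abundance(read_abundance(fc1), read_abundance(fc2))
for tid, row in merged.items():
    print(tid, row)
```

The merged rows could then be written back out as one TSV per sample with the same five columns. Whether a count-weighted effective length is appropriate depends on how different the two runs are, so treat that part as a starting point.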

I would appreciate some advice.

Thank you!

RNA-Seq R Kallisto Tximport

Do you intend to use DESeq2 afterwards? If so, one option to explore is DESeq2's collapseReplicates function. It is not a direct answer to your question, since it applies after tximport, but it is worth knowing about. Alternatively, why did you not merge these earlier? I normally use Salmon, which lets me add technical replicates at the mapping step, resulting in a single file, and I would be surprised if Kallisto doesn't have a way to do this as well.
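To show the idea behind collapsing technical replicates, here is a small Python sketch that sums counts across runs belonging to the same sample. It is not DESeq2's code, just the concept; the function name `collapse_runs` and the run and gene labels are made up for the example.

```python
from collections import defaultdict


def collapse_runs(counts, sample_of):
    """Sum per-gene counts over runs that map to the same sample.

    counts:    {run_name: {gene_id: count}}
    sample_of: {run_name: sample_name}
    """
    collapsed = defaultdict(lambda: defaultdict(float))
    for run, per_gene in counts.items():
        sample = sample_of[run]
        for gene, c in per_gene.items():
            collapsed[sample][gene] += c
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {s: dict(g) for s, g in collapsed.items()}


# Made-up counts: sample s1 was run on two flow cells, s2 on one.
counts = {
    "s1_fc1": {"geneA": 3, "geneB": 0},
    "s1_fc2": {"geneA": 4, "geneB": 1},
    "s2_fc1": {"geneA": 5, "geneB": 2},
}
sample_of = {"s1_fc1": "s1", "s1_fc2": "s1", "s2_fc1": "s2"}
collapsed = collapse_runs(counts, sample_of)
print(collapsed)
```

This only handles counts. Length and abundance information are not combined here, so it is a picture of the summing step rather than a full replacement for merging before import.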

5.5 years ago by lshepard ▴ 480