Importing Kallisto counts with tximport
4.6 years ago

Hi all. I'm using Kallisto with DESeq2 to do a DGE analysis on my samples. I used the following commands:

kallisto index -i cro_kal_index $StringTieDir/merged_transcriptome.fasta

# for each sample
kallisto quant -i cro_kal_index -o mapping_to_genome/07_kallisto/idio_1 $DATA/R1_cut_paired.fastq.gz $DATA/R2_cut_paired.fastq.gz
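The "for each sample" step can be sketched as a small loop. This is only an illustration: the function name `run_kallisto_all`, the `KALLISTO` override variable, and the one-subdirectory-per-sample layout under the data directory are assumptions made for the example, not my actual setup.

```shell
# Hypothetical helper: run `kallisto quant` once per sample name.
# Assumes each sample's reads live in <data_root>/<sample>/ (an assumed layout).
# Set KALLISTO=echo to print the commands instead of running them.
run_kallisto_all() {
    local index="$1" data_root="$2" out_root="$3"
    shift 3
    local sample
    for sample in "$@"; do
        "${KALLISTO:-kallisto}" quant -i "$index" -o "$out_root/$sample" \
            "$data_root/$sample/R1_cut_paired.fastq.gz" \
            "$data_root/$sample/R2_cut_paired.fastq.gz"
    done
}

# Example call (sample names other than idio_1 are placeholders):
#   run_kallisto_all cro_kal_index "$DATA" mapping_to_genome/07_kallisto idio_1 idio_2
```

Each sample then gets its own output directory, which keeps the per-sample abundance.h5 and abundance.tsv files apart.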

The DESeq2 manual advises using tximport to import the data files for downstream analysis. However, I get this error:

Error in H5Fopen(file, "H5F_ACC_RDONLY", native = native) : 
HDF5. File accessibilty. Unable to open file

My first approach was to use the tsv files instead of the h5 files. However, the same error still occurs.

The files were generated in a Linux server and transferred over by scp to Windows. I checked the md5 sums and everything seems fine. The thing that puzzles me the most is an error related to the rhdf5 library occurring with tsv files.
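A matching md5 sum only shows that the copy is identical to the original, not that the original is a valid HDF5 file. A further check that could be run on the server is sketched below: it tests whether a file starts with the 8-byte HDF5 signature. The function name `is_hdf5` is made up for this example, and the check assumes the signature sits at offset 0 (no user block), which may not hold for every HDF5 file.

```shell
# Hypothetical helper: report whether a file begins with the HDF5 signature.
# The signature bytes are 0x89 'H' 'D' 'F' '\r' '\n' 0x1a '\n'.
# Returns 0 for HDF5, 1 for anything else, 2 if the file cannot be read.
is_hdf5() {
    local file="$1"
    if [ ! -r "$file" ]; then
        echo "unreadable: $file"
        return 2
    fi
    local sig
    # Dump the first 8 bytes as hex and strip spaces and newlines.
    sig=$(head -c 8 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$sig" = "894844460d0a1a0a" ]; then
        echo "HDF5: $file"
    else
        echo "not HDF5: $file"
        return 1
    fi
}

# Example call:
#   is_hdf5 mapping_to_genome/07_kallisto/idio_1/abundance.h5
```

If the file passes this check on the server and the md5 sums match, the file itself is less likely to be the problem than the path or the way it is opened on the Windows side.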

Below is my script with the session info:

library(tximport)
library(rhdf5)

files <- "abundance.h5"
names(files) <- paste0("sample", 1)
txi.kallisto <- tximport(files, type = "kallisto", txOut = TRUE)

#Error in H5Fopen(file, "H5F_ACC_RDONLY", native = native) : 
#HDF5. File accessibilty. Unable to open file
#md5 sum in server : a0810ec7b108ac4948d3ad0bbd697d63 
#md5 sum in my pc: a0810ec7b108ac4948d3ad0bbd697d63

files <- "abundance.tsv"
names(files) <- paste0("sample", 1)
txi.kallisto <- tximport(files, type = "kallisto", txOut = TRUE)


#Note: importing `abundance.h5` is typically faster than `abundance.tsv`
#reading in files with read_tsv
#Error in H5Fopen(file, "H5F_ACC_RDONLY", native = native) : 
#HDF5. File accessibilty. Unable to open file.

#md5 sum in server : 442bf73d9eab9bfa28d9e1ab13d59b08 
#md5 sum in pc :     442bf73d9eab9bfa28d9e1ab13d59b08

# R version 3.6.1 (2019-07-05)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 18362)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
# [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
# [5] LC_TIME=English_United Kingdom.1252    
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] rhdf5_2.30.1    tximport_1.14.2
# 
# loaded via a namespace (and not attached):
#   [1] Rcpp_1.0.4.6        crayon_1.3.4        R6_2.4.1            lifecycle_0.2.0    
# [5] magrittr_1.5        pillar_1.4.3        rlang_0.4.5         rstudioapi_0.11    
# [9] vctrs_0.2.4         ellipsis_0.3.0      Rhdf5lib_1.8.0      tools_3.6.1        
# [13] readr_1.3.1         glue_1.4.0          hms_0.5.3           compiler_3.6.1     
# [17] pkgconfig_2.0.3     BiocManager_1.30.10 tibble_3.0.1       
# 
#

Yes, I have seen other users report a similar error, but I could not find a solution. If I have missed a post with the exact same error, let me know.

Best

transcriptomics kallisto rhdf5 • 2.4k views

I would post it over at https://support.bioconductor.org/ since it is a Bioc package.
