Error with BiocParallel. No barcodes files found
7 months ago
bio_info ▴ 20

I am currently trying to implement a pipeline using SingleCellExperiment and DropletUtils in R. I have been getting an error for a while now and I have tried everything to fix it, but nothing works. This is the code I am currently running.

seq_data_martin <- "/home/.../GSE134809_RAW"

samples <- list.files(path = seq_data_martin, pattern = "*matrix.mtx.gz")
samples <- gsub("_matrix.mtx.gz$", "", samples)

sce <- DropletUtils::read10xCounts(
  samples = paste0("seq_data_martin/", samples, "_"),
  sample.names = samples,
  type = "prefix",
  BPPARAM = BiocParallel::MulticoreParam()
)

The error I am getting upon executing 'sce' is:

Error: BiocParallel errors
  31 remote errors, element index: 1, 2, 3, 4, 5, 6, ...
  0 unevaluated and other errors
  first remote error:
Error in .check_for_compressed(barcode.loc, compressed): cannot find 'seq_data_martin/GSM3972009_69_barcodes.tsv' or its gzip-compressed form

I have tried unzipping the barcodes and changing the permissions of the directories, but the error persists. The barcode files are correctly located in the path:

/GSE134809_RAW$ ll | head
total 592568
drwxrwxr-x  2 localadmin localadmin     4096 May 23 14:19 ./
drwxr-xr-x 39 localadmin localadmin     4096 May 23 14:16 ../
-rw-rw-r--  1 localadmin localadmin  1155350 May 23 14:19 GSM3972009_69_barcodes.tsv.gz
-rw-rw-r--  1 localadmin localadmin   264786 Jul 24  2019 GSM3972009_69_genes.tsv.gz
-rw-rw-r--  1 localadmin localadmin 12020368 Jul 24  2019 GSM3972009_69_matrix.mtx.gz
-rw-rw-r--  1 localadmin localadmin  1920076 Jul 24  2019 GSM3972010_68_barcodes.tsv.gz
-rw-rw-r--  1 localadmin localadmin   264786 Jul 24  2019 GSM3972010_68_genes.tsv.gz
-rw-rw-r--  1 localadmin localadmin 21788688 Jul 24  2019 GSM3972010_68_matrix.mtx.gz
-rw-rw-r--  1 localadmin localadmin  2280347 Jul 24  2019 GSM3972011_122_barcodes.tsv.gz

Can someone please help me fix this error?

Barcodes scRNA-seq SingleCellExperiment • 734 views

cross-posted: https://stackoverflow.com/questions/78522566/

Do not post on multiple forums - that is just bad etiquette. What's worse, you did not link between the two so no one knows that you're asking two sets of online volunteers to spend their time on your problem without telling them that you're also asking the other group.


I understand, thanks for explaining this to me. I'll take care of this in future.

7 months ago
ATpoint 86k

By default the function simply assumes barcodes.tsv(.gz), genes.tsv(.gz) and matrix.mtx(.gz), without any additional prefixes or suffixes. You can use type = "prefix" to tell it about the prefix, e.g. GSM3972009_69_, but then you need to run it once per prefix (no problem) and maybe cbind the SingleCellExperiment objects together afterwards. Or you do it all manually. After all, it is nothing but read.delim(...) for the tsv files and Matrix::readMM() for the mtx files, after which you build your SingleCellExperiment yourself.
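The manual route can be sketched roughly as follows. This is only an illustration, not what read10xCounts does internally: the helper name read_prefixed_10x is made up, the file suffixes are taken from the directory listing in the question, and the first column of the genes file is assumed to hold the gene IDs. Note that the paths are built from the directory variable with file.path(); the path in the error message above is the relative 'seq_data_martin/...', so it may be worth checking that the actual directory, not the variable's name as a string, ends up in the path.

```r
library(Matrix)
library(SingleCellExperiment)

# Hypothetical helper: read one prefixed 10x-style file triplet by hand.
# `dir` is the folder holding the files, `prefix` is e.g. "GSM3972009_69".
read_prefixed_10x <- function(dir, prefix) {
  suffixes <- c(barcodes = "_barcodes.tsv.gz",
                genes    = "_genes.tsv.gz",
                matrix   = "_matrix.mtx.gz")
  # Build the paths from the directory variable, not from its name.
  paths <- file.path(dir, paste0(prefix, suffixes))
  names(paths) <- names(suffixes)

  # Fail early with the full paths that were actually looked up.
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("cannot find: ", paste(not_found, collapse = ", "))
  }

  # read.delim() reads gzip-compressed text files transparently.
  barcodes <- read.delim(paths[["barcodes"]], header = FALSE,
                         stringsAsFactors = FALSE)
  genes <- read.delim(paths[["genes"]], header = FALSE,
                      stringsAsFactors = FALSE)

  # readMM() accepts a connection; gzfile() takes care of decompression.
  counts <- as(Matrix::readMM(gzfile(paths[["matrix"]])), "CsparseMatrix")
  rownames(counts) <- genes[[1]]      # assumption: column 1 = gene ID
  colnames(counts) <- barcodes[[1]]

  SingleCellExperiment(
    assays  = list(counts = counts),
    rowData = DataFrame(ID = genes[[1]]),
    # Keep the original sample name next to every barcode.
    colData = DataFrame(Sample  = rep(prefix, ncol(counts)),
                        Barcode = barcodes[[1]])
  )
}
```

It would be called once per sample, e.g. read_prefixed_10x(seq_data_martin, "GSM3972009_69"), and the resulting objects could then be combined with cbind as suggested above, provided the genes are the same in every sample.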


Thanks for the explanation! If I remove the prefix (GSM3972009_69_) then I will lose the sample annotation and cannot track the samples. I am more familiar with Seurat than with SingleCellExperiment. Does the SingleCellExperiment object store the metadata somewhere with the original sample names (for example, GSM3972009_69)?


You just tell the function to remove the prefix for you, not on the actual file. Then you can add this information to the SCE, like sce$dataset <- something. SingleCellExperiment stores per-cell data as colData, which you access with colData(sce). Why don't you use Seurat if you are more familiar with it?
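As a small illustration of the colData part, here is a sketch on a toy object. The count matrix is made up and stands in for real data; only the sample name is taken from the question.

```r
library(SingleCellExperiment)

# Toy 3-gene x 4-cell count matrix standing in for one real sample.
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
sce <- SingleCellExperiment(assays = list(counts = counts))

# `$<-` on an SCE writes a per-cell column into colData.
sce$dataset <- rep("GSM3972009_69", ncol(sce))

colData(sce)        # per-cell metadata, now including 'dataset'
table(sce$dataset)  # number of cells per sample
```

This is roughly the counterpart of the per-cell meta.data in a Seurat object, so the sample name travels with each cell after the objects are combined.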


Thank you for your help! I will try this out. I am trying to implement a collaborator's pipeline on my data and they only use SCE, scran and scater, which is why I am not using Seurat.
