1 day ago
king
I need to retrieve and download a large number of fastq files from the GEO or SRA database. Is there a way to determine whether it is 10x data before running cellranger?
Just asking, with all due respect: does it make sense to batch download and process a large number of datasets without first doing a careful and thoughtful curation of the metadata, to ensure that you're actually working with data that can answer your scientific question?
Can you provide a few example accessions? It may be possible to do this by looking at the metadata. Fastq data for 10x in SRA can be hit or miss in general.
You could just use kallisto / bustools or alevin to process a few hundred thousand reads and see what the results look like (obviously, if it's not the correct technology, then your resulting count matrix will have very few barcodes and very few counts). Those programs are much faster than cellranger and could be a good way to check whether a set of FASTQ files is 10x before diving into running cellranger.
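To get the "few hundred thousand reads" for such a test run, something like the rough sketch below might help. It only copies the first N records of a FASTQ file and tabulates read lengths; the function names and file names are made up for illustration and are not part of any of the tools mentioned. The read-length check is a heuristic and an assumption on my part: 10x 3' v2 libraries have a 26 bp R1 (16 bp barcode + 10 bp UMI) and v3 a 28 bp R1 (16 bp + 12 bp), but SRA submissions are often trimmed, sequenced longer, or missing the barcode read entirely, so treat it as a hint rather than proof.

```python
import gzip
from collections import Counter
from itertools import islice


def open_fastq(path, mode="rt"):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def head_fastq(src, dst, n_reads=200_000):
    """Copy the first n_reads records (4 lines each) from src to dst.

    Returns the number of complete records written.
    """
    n_lines = 0
    with open_fastq(src) as fin, open_fastq(dst, "wt") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            n_lines += 1
    return n_lines // 4


def read_length_counts(path, n_reads=10_000):
    """Count sequence lengths over the first n_reads records of a FASTQ file."""
    lengths = Counter()
    with open_fastq(path) as fin:
        for i, line in enumerate(islice(fin, 4 * n_reads)):
            if i % 4 == 1:  # the second line of each record is the sequence
                lengths[len(line.rstrip("\r\n"))] += 1
    return lengths
```

Usage would be along the lines of `head_fastq("sample_1.fastq.gz", "sub_1.fastq.gz")` (hypothetical file names), repeated for the mate file so both stay in sync, and then `read_length_counts("sub_1.fastq.gz").most_common(3)` to see whether R1 looks like a 26 or 28 bp barcode read. The subsampled pair can then go into kallisto / bustools or alevin as described above.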
Note that there are many versions of the 10x protocol.