2.9 years ago
aimanbarki
Hello everyone,
Is there a tutorial for the following:
- How to download a fastq file from NCBI?
- How to check the file quality (what should the files look like)?
- How to use cellranger count on a fastq file?
- How to understand the output of the count?
I want to work with the healthy data set from the following website:
I downloaded the fastq file using the following command:
fastq-dump --split-files --gzip SRR10134390
I downloaded the reference from GENCODE and made a reference for cellranger count using the following command:
mkref --genome=GRCh38.p13 --fasta=GRCh38.primary_assembly.genome.fa --genes=gencode.v39.primary_assembly.annotation.gtf
I ran cellranger count using the following command:
cellranger count --id=Healthy_aortic_valve2 --fastqs=/healthy1 --transcriptome=GRCh38.p13 --chemistry SC3Pv2
These commands ran and created several folders, but the result does not seem right, because I cannot find the matrix files or the BAM files.
Can someone tell me how I can find out the problem?
Thanks
Posting an error message or such is probably a good start. Beyond that, you'd probably get a lot out of the OSCA book in terms of understanding and performing scRNA-seq analysis.
Are you looking in the correct path for the output files? If cellranger count ran successfully, it should write the output to Healthy_aortic_valve2/outs according to the documentation.
@Jv The run created the output folder, but it does not include the outs directory. Where should I start to find out what the issue is?
Thanks in advance
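One way to start, sketched below, is to check whether the run directory has an outs folder and, if not, look at what the pipeline logged. This is only a sketch: check_run is a hypothetical helper name, and it assumes the usual Cell Ranger run layout in which a top-level _log file records progress and failed stages leave _errors files further down the tree.

```shell
#!/usr/bin/env bash
# Sketch: report whether a cellranger count run directory looks finished.
# check_run is a hypothetical helper, not part of Cell Ranger.
check_run() {
    local run_dir="$1"
    if [ -d "$run_dir/outs" ]; then
        echo "finished: $run_dir/outs exists"
        ls "$run_dir/outs"
        return 0
    fi
    echo "unfinished: no outs directory in $run_dir"
    # Assumption: the top-level _log file holds the pipeline progress,
    # so its last lines usually name the stage that failed.
    if [ -f "$run_dir/_log" ]; then
        tail -n 20 "$run_dir/_log"
    fi
    # Assumption: failed stages write _errors files inside the run directory.
    find "$run_dir" -name "_errors" -type f 2>/dev/null
    return 1
}
```

For the run above this would be called as `check_run Healthy_aortic_valve2`. Any _errors file it lists can be printed with cat, and its content is the error message worth posting here.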
Input files for cellranger need to be in a specific format with the index sequences in separate files. You can find more information about the types and names of files here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/fastq-input
Simply splitting the SRA data may not give you the correct input files. Unfortunately these submitters appear not to have submitted the original cellranger BAM file, which would have allowed you to recreate the fastq files easily.
The index sequence doesn't have to be present anymore. It's just a legacy thing that cellranger's mkfastq makes it. (What does matter is that the fastqs be named exactly according to the Illumina standard.)
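As a rough illustration of the naming point, the sketch below links files dumped with --split-files to names in the Illumina pattern that cellranger expects, [Sample]_S1_L00[Lane]_[Read Type]_001.fastq.gz. The helper names illumina_name and link_as_illumina are hypothetical. The mapping of _1 to R1 and _2 to R2 is an assumption that has to be verified for each SRA run, since the order depends on how the submitters uploaded the data.

```shell
#!/usr/bin/env bash
# Sketch: give SRA split files Illumina-style names via symlinks.
# illumina_name and link_as_illumina are hypothetical helpers.

# Build an Illumina-style file name for sample number 1, lane 1.
illumina_name() {
    local sample="$1" read_type="$2"
    printf '%s_S1_L001_%s_001.fastq.gz\n' "$sample" "$read_type"
}

# Assumption: <sample>_1.fastq.gz is the barcode/UMI read (R1) and
# <sample>_2.fastq.gz is the cDNA read (R2). Check this before use.
link_as_illumina() {
    local dir="$1" sample="$2"
    # Relative link targets resolve inside "$dir", next to the originals.
    ln -s "${sample}_1.fastq.gz" "$dir/$(illumina_name "$sample" R1)"
    ln -s "${sample}_2.fastq.gz" "$dir/$(illumina_name "$sample" R2)"
}
```

Symlinks are used here so the original downloads stay untouched; renaming with mv would work the same way if links are a problem on your file system.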
Good to know. We generally demux using cellranger so have the files.
@GenoMax and @swbarnes2 I changed the names of the files, and I am attaching a picture of how they look. I think the "+" line is not supposed to look like that, or is it fine?
That should be fine. If you had used -F (original format option) when dumping the reads out they may look like normal Illumina fastq headers (depending on how the submitters sent the data in).
cellranger is supposed to only use 26 or 28 bp of read 1 based on chemistry.
Do you have an extra _ in the file names before S1? You should remove that.
That's how the folders are named when cellranger has yet to finish running properly.
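Regarding the 26 or 28 bp of read 1 mentioned above, a quick way to see which dumped file is likely the barcode/UMI read is to tabulate read lengths in each file. The sketch below does that for the first reads of a gzipped fastq; read_lengths is a hypothetical helper name and the default of 1000 reads is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch: count how often each read length occurs among the first N reads.
# read_lengths is a hypothetical helper, not part of any toolkit.
read_lengths() {
    local fastq="$1" n="${2:-1000}"
    # Each fastq record is 4 lines; the sequence is the 2nd line of a record.
    gzip -cd "$fastq" \
        | head -n $((n * 4)) \
        | awk 'NR % 4 == 2 { print length($0) }' \
        | sort -n \
        | uniq -c
}
```

Each output line is a count followed by a read length. A file whose reads are mostly 26 bp (or 28 bp for the newer chemistry) is a good candidate for R1, while the longer reads would be the cDNA read.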