Velocyto: Not found cell and umi barcode in entry of the bam file
3.4 years ago
bs58 ▴ 10

I'm trying to get the fraction of spliced and unspliced reads for each gene so that I can then calculate the RNA velocity with velocyto.

When I run this command:

velocyto run -u Gene -o ./Data_RNAv ./data1.bam ./GenomeIndex/gencodev38annotation.gtf

I get the following Error message:

 The bam file does not contain cell and umi barcodes appropriatelly formatted. 

This is my workflow so far:

  1. Downloaded the two fastq files using the sratoolkit

  2. Downloaded hg38.fa and the reference .gtf file

  3. Created the genome index using STAR

Like this:

STAR --runMode genomeGenerate --genomeDir ./GenomeIndex --genomeFastaFiles ./GenomeIndex/hg38.fa --sjdbGTFfile ./GenomeIndex/gencodev38annotation.gtf

  4. Aligned the reads to the genome using STAR

Like this:

STAR --runThreadN 24 --genomeDir ./GenomeIndex --sjdbGTFfile ./GenomeIndex/gencodev38annotation.gtf --sjdbOverhang 100 --outSAMtype BAM Unsorted --readFilesIn ./data_Day4/SRR9127057_S1_L001_R1_001.fastq ./H9_D4/SRR9127057_S1_L001_R2_001.fastq

  5. Ran velocyto to write out a standard loom file. This is where I get the error saying that the cell barcode and UMI are not found in the BAM file.

What did I do wrong?
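
As far as I know, velocyto reads the cell barcode and UMI from per-read BAM tags (`CB` and `UB` for 10x-style data), so one way to see what the error is reporting is to look at the tags on a few records. The helper below is a minimal sketch: `count_barcode_tags` is a name made up for this example, and it only inspects SAM text on standard input.

```shell
# Count how many SAM records on stdin carry both a CB (cell barcode)
# and a UB (UMI) tag. Hypothetical helper, not part of velocyto or samtools.
count_barcode_tags() {
    awk -F'\t' '
        /^@/ { next }    # skip SAM header lines if present
        {
            total++
            has_cb = 0; has_ub = 0
            # Optional tags start at column 12 of a SAM record.
            for (i = 12; i <= NF; i++) {
                if ($i ~ /^CB:Z:/) has_cb = 1
                if ($i ~ /^UB:Z:/) has_ub = 1
            }
            if (has_cb && has_ub) tagged++
        }
        END { printf "%d of %d records have CB and UB tags\n", tagged, total }
    '
}

# Usage (assumes samtools is installed):
#   samtools view ./data1.bam | head -n 1000 | count_barcode_tags
```

If the count of tagged records is zero, the aligner never wrote the barcodes into the BAM, which matches the message above.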

velocyto scRNA-seq UMI RNA • 2.3k views

The SRA data probably does not carry the UMI/barcode sequence in the read header. You can check this by looking at the FASTQ headers.


This is the beginning of the FASTQ file that I have:

>head ./data_Day4/SRR9127057_S1_L001_R1_001.fastq
@SRR9127057.1 A00291:31:H5W5MDMXX:2:1101:1579:1000 length=25
TGTTACCCNGCTCGTCGTTATGCCG
+SRR9117954.1 A00291:31:H5W5MDMXX:2:1101:1579:1000 length=25
,#FFFFFFF:FF:FFFFFFFFFFFF
@SRR9127057.2 A00291:31:H5W5MDMXX:2:1101:2338:1000 length=25
NGCTGTCCAAGGAAGCTAGTCCACT
+SRR9117954.2 A00291:31:H5W5MDMXX:2:1101:2338:1000 length=25
F#FFFFFFFFFFFFFFFF:FFFFFF
@SRR9127057.3 A00291:31:H5W5MDMXX:2:1101:3007:1000 length=25
TNAACTTTGCGTGGTCTCCTCAAGC

How can I know if it has the UMI/Barcode?
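
One rough way to tell, sketched here under assumptions: in droplet protocols such as 10x, the barcode/UMI read is usually short and of fixed length, while the cDNA read is longer, so comparing the read-length distribution of the two files shows which one holds the barcodes. `fastq_length_tally` is a hypothetical helper name, and it expects an uncompressed FASTQ with four lines per record.

```shell
# Tally read lengths over the first N records of an uncompressed FASTQ file.
# Prints one "length count" pair per line, sorted by length.
# Hypothetical helper; N defaults to 10000 records.
fastq_length_tally() {
    local fastq="$1"
    local n_reads="${2:-10000}"
    # Each FASTQ record spans 4 lines; the sequence is line 2 of each record.
    head -n "$((n_reads * 4))" "$fastq" \
        | awk 'NR % 4 == 2 { count[length($0)]++ }
               END { for (len in count) print len, count[len] }' \
        | sort -n
}

# Usage, with the paths from the STAR command in the question:
#   fastq_length_tally ./data_Day4/SRR9127057_S1_L001_R1_001.fastq
#   fastq_length_tally ./H9_D4/SRR9127057_S1_L001_R2_001.fastq
```

A file whose reads all share one short length is the likely barcode/UMI read. The exact split between cell barcode and UMI still has to come from the library chemistry, not from this check.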

3.4 years ago

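From the description, this was a 10x Genomics single-cell dataset. The cell barcode and UMI information is in read 2, but STAR in its default mode doesn't understand that, so the BAM it writes has none of the barcode tags that velocyto looks for. Use either STARsolo or Cell Ranger instead.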
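
For the STARsolo route, a sketch of what the alignment call might look like is below. It only prints the command so it can be reviewed before running. Everything passed in is an assumption to adjust: `starsolo_cmd` is a hypothetical helper, the whitelist path is a placeholder, and the barcode and UMI lengths must match the actual chemistry (the common 16 + 10 layout would not fit the 25-nt reads shown earlier in the thread, so confirm the geometry first).

```shell
# Print a STARsolo command that writes CB/UB tags into a coordinate-sorted BAM.
# Hypothetical helper. Arguments:
#   $1 cDNA FASTQ   $2 barcode/UMI FASTQ   $3 barcode whitelist
#   $4 cell barcode length   $5 UMI length
starsolo_cmd() {
    local cdna_fastq="$1" barcode_fastq="$2" whitelist="$3"
    local cb_len="$4" umi_len="$5"
    local umi_start=$((cb_len + 1))   # UMI is assumed to follow the barcode directly
    # STARsolo expects the cDNA read first and the barcode read second.
    # CB and UB tags are only written to coordinate-sorted BAM output.
    printf '%s ' STAR --runThreadN 24 --genomeDir ./GenomeIndex \
        --readFilesIn "$cdna_fastq" "$barcode_fastq" \
        --soloType CB_UMI_Simple \
        --soloCBwhitelist "$whitelist" \
        --soloCBstart 1 --soloCBlen "$cb_len" \
        --soloUMIstart "$umi_start" --soloUMIlen "$umi_len" \
        --soloFeatures Gene Velocyto \
        --outSAMtype BAM SortedByCoordinate \
        --outSAMattributes NH HI nM AS CR UR CB UB
    printf '\n'
}

# Usage (all values are placeholders to adapt):
#   starsolo_cmd cdna.fastq barcode.fastq whitelist.txt 16 10
```

In recent STAR versions, `--soloFeatures Gene Velocyto` also writes spliced, unspliced and ambiguous count matrices directly, which may make the separate velocyto run unnecessary.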
