Entering edit mode
5.4 years ago
mgmohsen
•
0
I'm trying to perform an alignment using STAR
and extract gene count information. Here is the command I entered:
STAR --runThreadN 16 --readFilesCommand zcat --quantMode GeneCounts --genomeDir /path/to/genome/dir --readFilesIn /path/to/file_R1.fastq.gz /path/to/file_R2.fastq.gz
However, whenever I analyze ReadsPerGenes.out.tab
all columns are 0 all the way down.
$ head -10 ReadsPerGenes.out.tab
N_unmapped 0 0 0
N_multimapping 0 0 0
N_noFeature 0 0 0
N_ambiguous 0 0 0
ENSG00000223972.5 0 0 0
ENSG00000227232.5 0 0 0
ENSG00000278267.1 0 0 0
ENSG00000243485.5 0 0 0
ENSG00000284332.1 0 0 0
ENSG00000237613.2 0 0 0
All of the following commands provide no output:
$ awk '$2 > 0' ReadsPerGenes.out.tab
$ awk '$3 > 0' ReadsPerGenes.out.tab
$ awk '$4 > 0' ReadsPerGenes.out.tab
In case there could be something wrong with the genome indexing, the genome was indexed using the following command:
STAR --runThreadN 16 --runMode genomeGenerate --genomeDir . --genomeFastaFiles ./GRCh38.primary_assembly.genome.fa --sjdbGTFfile ./gencode.v31.primary_assembly.annotation.gtf
Any ideas what could be going wrong here?
Hm, but I've given it in a gtf in the genome indexing step. Does it also need to be specified during alignment?
Basic question - chromosome identifiers in your GTF file and your reference match, correct?
Yes they match. And also, I just tried aligning some other fastq sets and those seem to work normally. It seems like there might be something wrong with this set of fastqs in particular.
Good to know. Grab a few reads and align them using BLAST. That would be the easiest place to start by checking if that data is from the right genome.