2.2 years ago
Ngrin
Hello,
I have some COVID samples. I ran the command below to get reads per gene as well as BAM files.
STAR --runThreadN 20 --readFilesCommand zcat --genomeDir /star/genomeIndex --readFilesIn R1 R2 --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts
After running the above command with the genome files provided here, almost all samples have a very high number of unmapped reads.
N_unmapped 33570998 33570998 33570998
N_multimapping 1589 1589 1589
N_noFeature 8987 9775 9680
N_ambiguous 225 61 95
Is there anything I should change? Is such a high number normal?
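To put the counts above in proportion, the category rows can be compared with the reads assigned to genes. The following is only a minimal sketch, assuming the usual four-column layout of STAR's `ReadsPerGene.out.tab` (name, unstranded count, and the two stranded counts); the function name `summarize_reads_per_gene` is hypothetical and not part of STAR.

```python
def summarize_reads_per_gene(lines, column=1):
    """Summarize one count column of a STAR ReadsPerGene.out.tab file.

    Assumption: each row is "name <tab> col1 <tab> col2 <tab> col3", where
    column 1 is the unstranded count and columns 2 and 3 are the stranded counts.
    Rows starting with "N_" are category totals; all other rows are genes.
    """
    summary = {}
    assigned = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        name, count = fields[0], int(fields[column])
        if name.startswith("N_"):
            summary[name] = count      # N_unmapped, N_multimapping, ...
        else:
            assigned += count          # reads counted for a gene
    summary["assigned_to_genes"] = assigned
    total = sum(summary.values())
    fractions = {k: (v / total if total else 0.0) for k, v in summary.items()}
    return summary, fractions


# Usage (the file name is an assumption; STAR prepends any --outFileNamePrefix):
# with open("ReadsPerGene.out.tab") as handle:
#     counts, fractions = summarize_reads_per_gene(handle)
#     for name, frac in fractions.items():
#         print(f"{name}\t{counts[name]}\t{frac:.2%}")
```

This only restates what is already in the table as fractions; it does not explain why the reads fail to map.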
No, this is not "normal". What is this dataset, are you sure you are using the correct genome?
I have added a link in my original post to the genome files I am using. These are human samples (paired-end reads). I wonder why I have so many unmapped reads. Do I need to perform any further steps?
This is what I use to create genome index files.
Hello there, @Negara. Have you checked whether the fasta and gtf's chromosome names match?
Yes, in both files the chromosome names are in the same format: chr1, chr2, etc.
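For anyone who wants to verify the name match programmatically rather than by eye, here is a rough sketch. It assumes uncompressed FASTA and GTF files, takes the sequence name to be the first whitespace-delimited token after `>` in the FASTA and the first tab-separated column in the GTF; the helper names and the file paths in the usage comment are hypothetical.

```python
def fasta_seq_names(lines):
    """Collect sequence names from FASTA header lines (text after '>' up to whitespace)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                names.add(header[0])
    return names


def gtf_seq_names(lines):
    """Collect sequence names from the first column of a GTF, skipping comments."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def compare_seq_names(fasta_lines, gtf_lines):
    """Report which sequence names are shared and which appear in only one file."""
    fa = fasta_seq_names(fasta_lines)
    gtf = gtf_seq_names(gtf_lines)
    return {
        "shared": fa & gtf,
        "only_in_fasta": fa - gtf,
        "only_in_gtf": gtf - fa,
    }


# Usage (paths are placeholders, not the files from this thread):
# with open("genome.fa") as fa, open("annotation.gtf") as gtf:
#     report = compare_seq_names(fa, gtf)
#     print(sorted(report["only_in_gtf"]))
```

An empty "shared" set would indicate a naming mismatch such as `chr1` versus `1`; extra names on one side only (for example unplaced scaffolds) are common and not necessarily a problem.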