Hello,
I am mapping RNA-seq data with STAR and want to extract the unmapped reads so I can map them against something else later in the pipeline. I used the following to generate the genome index:
STAR --runThreadN 20 --runMode genomeGenerate --genomeDir /media/genome/ --genomeFastaFiles /media/genomic.fna --sjdbGTFfile /media/genomic.gff --sjdbGTFtagExonParentTranscript Parent --sjdbGTFfeatureExon Gene --sjdbOverhang 149 --genomeSAindexNbases 13
And the following to run the actual mapping:
for i in `ls -1 *_clean_R1.fastq | sed 's/_clean_R1.fastq//'`
do
STAR --runThreadN 64 --quantMode GeneCounts --outFileNamePrefix aligned/$i --outSAMtype BAM SortedByCoordinate --genomeDir /media/genome/ --readFilesIn ${i}_clean_R1.fastq ${i}_clean_R2.fastq
done
When I tried to pull the unmapped reads from the BAM files with samtools, the outputs were all empty. Running samtools flagstat on the resulting BAM files reports 0 unmapped reads.
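A minimal sketch of that check, for anyone who wants to reproduce it. The sample name and the star_bam_path helper are hypothetical placeholders, not the exact commands from my run; the only thing taken from STAR itself is that it appends Aligned.sortedByCoord.out.bam to the --outFileNamePrefix value when coordinate-sorted BAM output is requested.

```shell
# Hypothetical helper: build the BAM path STAR writes for a given prefix.
# STAR appends "Aligned.sortedByCoord.out.bam" to --outFileNamePrefix
# when run with --outSAMtype BAM SortedByCoordinate.
star_bam_path() {
    printf '%s%s\n' "$1" "Aligned.sortedByCoord.out.bam"
}

sample="sampleA"   # placeholder sample name (assumption)
bam=$(star_bam_path "aligned/${sample}")

# Only run the checks if samtools is installed and the BAM exists.
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
    # Summary statistics, including mapped and total read counts.
    samtools flagstat "$bam"
    # Count reads with SAM flag 0x4 (read unmapped) set.
    samtools view -c -f 4 "$bam"
    # Write those reads to their own BAM file.
    samtools view -b -f 4 -o "${sample}_unmapped.bam" "$bam"
fi
```

If the count from the second samtools command is 0, the extracted BAM will contain a header and no reads, which matches what I am seeing.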
Either my sampling and libraries were extraordinary or I'm doing something wrong, and I'm pretty sure it's the latter. Is there anything in the commands I used that would cause this? I know STAR has an option to output the unmapped reads to separate files, but I had always thought the BAM files would still contain the unmapped reads even if that option isn't used.
Thanks,
Erik