Hello,
I wanted to obtain only the "mapped" reads as output from STAR. I forgot to delete "--outSAMunmapped Within", so all of my mapped output files also contain all the unmapped reads. The data size is huge and re-mapping properly will take a very long time...
How can I fix the BAM files, which were supposed to contain only mapped reads but also include the unmapped reads?
#!/bin/bash
mkdir /mnt/data/Toxo_scan/GBR_Male/ToxoMap
# Sample IDs are read from stdin, one per line
while read -r line
do
    mkdir "ToxoMap/$line"
    echo "$line -> Running STAR - Toxo now"
    STAR --runThreadN 12 --alignIntronMax 1 --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate \
         --genomeDir "/mnt/data/Toxo_scan/toxo_genome" \
         --readFilesIn "/mnt/data/Toxo_scan/GBR_Male/sickle/${line}_1_clean.fastq" "/mnt/data/Toxo_scan/GBR_Male/sickle/${line}_2_clean.fastq" \
         --outFileNamePrefix "/mnt/data/Toxo_scan/GBR_Male/ToxoMap/${line}_" \
         --outReadsUnmapped Fastx
done
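For reference, the existing BAMs can probably be cleaned without re-mapping: STAR sets the SAM "read unmapped" flag (0x4) on the reads written by --outSAMunmapped Within, so they can be filtered out with samtools. Below is a minimal sketch, assuming samtools is on the PATH, that the output BAMs follow STAR's default Aligned.sortedByCoord.out.bam naming with the prefix used above, and that sample IDs are listed one per line in a hypothetical sample_list.txt:

#!/bin/bash
# Strip unmapped reads (SAM flag 0x4) from the existing STAR BAMs instead of re-mapping.
while read -r line
do
    in="/mnt/data/Toxo_scan/GBR_Male/ToxoMap/${line}_Aligned.sortedByCoord.out.bam"
    out="/mnt/data/Toxo_scan/GBR_Male/ToxoMap/${line}_mapped.bam"
    # -b: BAM output, -F 4: exclude reads with the "unmapped" flag set, -@ 4: extra threads
    samtools view -b -F 4 -@ 4 "$in" > "$out"
    samtools index "$out"
done < sample_list.txt   # hypothetical list of sample IDs, one per line

The filtered BAMs keep the original coordinate sorting, so they can be indexed directly as shown.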
Is the presence of unmapped reads actually causing issues? If you are going to do read counting etc., those reads will be ignored.
They were causing issues because I was supposed to process only the mapped reads in the workflow. In my case, having unmapped reads among the mapped reads is like having false positives mixed in with the true positives...
Out of curiosity, which downstream workflow component was causing a problem? Most programs handle unmapped reads without difficulty.
The issue was not related to downstream workflow components; it was about my supervisor's satisfaction :) When I showed up with millions of undesired "unmapped" reads in what was supposed to be the "mapped" output, I was in the position of not having done the task well. That's all :)
Aha. No logical solution would suffice in that case.