Dear all,
I am aligning paired-end reads from two closely related chicken breeds against the chicken reference genome. In each breed, I have 5 individuals (samples), and each individual has two files (R1 and R2).
I am running the alignment with STAR from the directory containing all fastq files of the 5 samples, using a for loop with this command:
STAR --runMode alignReads --genomeDir IndexRef/ --outSAMtype BAM SortedByCoordinate --readFilesIn ${file} --outFileNamePrefix mapped/L10/${file} --runThreadN 12
For each sample, it produced two files, ending in R1.Aligned.sortedByCoord.out.bam and R2.Aligned.sortedByCoord.out.bam. Now I know that these two files are unsorted BAM, and each has its own statistics on % mapping, etc. I am confused about which of these two is considered the final alignment file for the sample. Do the two files get combined afterwards when running samtools on them? I assume there should be a single BAM file per sample to be treated as the alignment, analyzed, and visualized in a genome browser?
Another question: my two breeds are closely related, so sensitivity is important for picking up SNPs that differ between them. Do you think the above command performs a highly sensitive alignment, or do I need to add more options?
Thanks
I actually tried to run the 5 samples (each paired-end) simultaneously using a bash script. In the script my code was:
This actually produced just one BAM file! I was expecting 5 BAM files for the 5 samples. Any comment on the code? Should I leave a space after the command?
Thanks
As noted by @swbarnes2, you need to use a unique name in
--outFileNamePrefix mapped/L10/**UNIQUE_NAME_HERE**
in each command to ensure that the five sets of result files end up with unique names.
Can you explain exactly which part of the code you think tells the software to make 5 different BAMs, instead of writing over the same one over and over again?
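To make the unique-name point concrete, here is a minimal sketch of a per-sample loop. It rests on assumptions that are not in the original post: the files are taken to be named <sample>_R1.fastq and <sample>_R2.fastq, and star_cmd is a hypothetical helper that only prints the command. Note that STAR expects both mates of a paired-end sample together in a single --readFilesIn (R1 first, then R2), which gives one BAM per sample rather than one per file.

```shell
#!/usr/bin/env bash
# Sketch only: one STAR run per sample, both mates together, unique prefix.
# ASSUMPTION (not from the original post): files are named
# <sample>_R1.fastq and <sample>_R2.fastq in the current directory.

# star_cmd is a hypothetical helper: it prints the STAR command line
# for the sample whose R1 file is given as "$1".
star_cmd() {
    local r1="$1"
    local r2="${r1%_R1.fastq}_R2.fastq"       # matching mate file
    local sample
    sample="$(basename "$r1" _R1.fastq)"      # unique per-sample name
    echo STAR --runMode alignReads --genomeDir IndexRef/ \
        --readFilesIn "$r1" "$r2" \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix "mapped/L10/${sample}." \
        --runThreadN 12
}

for r1 in *_R1.fastq; do
    [ -e "$r1" ] || continue                  # nothing matched the pattern
    star_cmd "$r1"                            # dry run: only prints the command
done
```

Because the loop only prints the commands, you can check that each line names both mates and a different prefix before running anything. Once they look right, piping the script's output to bash would run them one after another.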