Hi all,
I've been generating BAM files using the STAR command below:
for ((i=102; i<=305; i++)); do ./STAR --readFilesIn fastq/E08/SRR2930$i.fastq --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate --outSAMmultNmax 1 --genomeDir STARindex --twopassMode Basic --runThreadN 12 --outFileNamePrefix 2930$i; done
My FASTQ files are about 180 MB each, but the generated BAM files are only about 75 KB. I checked the BAM files using samtools stats --split RG 2930107Aligned.sortedByCoord.out.bam | grep '^SN'
and saw the following output for all BAM files:
SN raw total sequences: 0
SN filtered sequences: 0
SN sequences: 0
SN is sorted: 1
SN 1st fragments: 0
SN last fragments: 0
SN reads mapped: 0
SN reads mapped and paired: 0 # paired-end technology bit set + both mates mapped
SN reads unmapped: 0
SN reads properly paired: 0 # proper-pair bit set
SN reads paired: 0 # paired-end technology bit set
SN reads duplicated: 0 # PCR or optical duplicate bit set
SN reads MQ0: 0 # mapped and MQ=0
SN reads QC failed: 0
SN non-primary alignments: 0
SN total length: 0 # ignores clipping
SN bases mapped: 0 # ignores clipping
SN bases mapped (cigar): 0 # more accurate
SN bases trimmed: 0
SN bases duplicated: 0
SN mismatches: 0 # from NM fields
SN error rate: 0.000000e+00 # mismatches / bases mapped (cigar)
SN average length: 0
SN maximum length: 30
SN average quality: 0.0
SN insert size average: 0.0
SN insert size standard deviation: 0.0
SN inward oriented pairs: 0
SN outward oriented pairs: 0
SN pairs with other orientation: 0
SN pairs on different chromosomes: 0
Also, when I run samtools view 2930107Aligned.sortedByCoord.out.bam, nothing shows up in the terminal.
So I was wondering what might be causing the problem: the FASTQ files themselves, or the command that generates the BAM files? I know the question is quite vague, sorry for that, but I'm not sure where to look to solve the problem.
Edit: Thanks a lot for your help. I ran a single file, and STAR finished without reporting any problems in the terminal:
./STAR --readFilesIn fastq/E08/SRR2930160.fastq --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate --outSAMmultNmax 1 --genomeDir STARindex --twopassMode Basic --runThreadN 12 --outFileNamePrefix 2930160
Apr 04 15:55:10 ..... started STAR run
Apr 04 15:55:10 ..... loading genome
Apr 04 15:55:26 ..... started 1st pass mapping
Apr 04 15:55:29 ..... finished 1st pass mapping
Apr 04 15:55:30 ..... inserting junctions into the genome indices
Apr 04 15:57:23 ..... started mapping
Apr 04 15:57:26 ..... finished mapping
Apr 04 15:57:27 ..... started sorting BAM
Apr 04 15:57:27 ..... finished successfully
And here is the log.
Edit 2: I asked the STAR developers about this issue on GitHub, and apparently my data is from a SOLiD sequencer, which STAR does not support; here is a link to the issue. Thanks a lot for all the comments though, cheers.
Thanks, Gökberk
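For anyone who lands here with the same symptom, a quick way to check for this is to look at the first read of a FASTQ file. The sketch below assumes that SOLiD color-space reads appear as an optional primer base followed by the digits 0-3 (with "." for missing calls) rather than as plain nucleotides. The function name looks_like_colorspace is made up for illustration, and the check only inspects the first record.

```shell
# Hypothetical helper: succeed (exit 0) if the first read of a FASTQ file
# looks like SOLiD color space, e.g. "T0120312..." instead of "ACGT...".
looks_like_colorspace() {
    # Line 2 of a FASTQ file holds the sequence of the first read.
    # Pattern: optional primer base, then only digits 0-3 or "." characters.
    sed -n '2p' "$1" | grep -Eq '^[ACGTN]?[0-3.]+$'
}
```

It could be used as `looks_like_colorspace fastq/E08/SRR2930160.fastq && echo "color space"` before starting a long alignment loop.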
This does not make sense. Are the files collectively 180 MB, or is each file 180 MB? Look through the log files to see what is going on; it should be reasonably obvious.
Each FASTQ file is about 180 MB and yes, as you said, it doesn't make sense. Could it be that my genome index is problematic, so that all these BAM files are corrupted?
Can you post the sections of the log that look like some sort of error?
Did you make the STAR indexes yourself? Was there any error generated then, or did you not specifically look?
I generated the genome index using STAR, and did not receive any errors or warnings while generating it. Here is a part of the log:
So we will assume that your STAR index is properly made. Can you run a single file against this index and let us see what happens? Make sure you capture a log so we can look through it if alignment fails. If the log file is large, you can post it on pastebin.com and paste the link here.
When in doubt, run one sample manually. STAR is not at fault, so if the output isn't correct, it's because you've done something wrong.
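When running one sample manually, the terminal messages alone can say "finished successfully" even when nothing was mapped, so it helps to read the summary log as well. Here is a minimal sketch that pulls the input read count out of STAR's Log.final.out, assuming the line is laid out as "Number of input reads |" followed by the value. The function name star_input_reads is hypothetical.

```shell
# Hypothetical helper: print the "Number of input reads" value from a
# STAR Log.final.out file (assumes "label | value" lines split on "|").
star_input_reads() {
    awk -F'|' '/Number of input reads/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the value
        print $2
    }' "$1"
}
```

With the prefix used in the single-file run above, the summary should be at 2930160Log.final.out, so `star_input_reads 2930160Log.final.out` would show whether STAR read any input at all. A value of 0 points at the input files rather than at the index.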