Question

Empty BAM file upon mapping with STAR

5.5 years ago

makwana.kd ▴ 60

I am using STAR to align RNA-seq reads to a reference genome index that I also built with STAR. I ran QC on every fastq file before starting the mapping jobs. The problem is that some fastq files produce empty BAM files, whereas others map without any issue. I am attaching the Log.final.out files. The first one is from a run that had no problem:

more Log.final.out2

                             Started job on |       May 14 16:55:48
                         Started mapping on |       May 14 16:59:56
                                Finished on |       May 14 17:06:24
   Mapping speed, Million of reads per hour |       97.45

                      Number of input reads |       10503242
                  Average input read length |       66
                                UNIQUE READS:
               Uniquely mapped reads number |       6887932
                    Uniquely mapped reads % |       65.58%
                      Average mapped length |       65.33
                   Number of splices: Total |       471329
        Number of splices: Annotated (sjdb) |       460539
                   Number of splices: GT/AG |       462632
                   Number of splices: GC/AG |       3610
                   Number of splices: AT/AC |       302
           Number of splices: Non-canonical |       4785
                  Mismatch rate per base, % |       0.49%
                     Deletion rate per base |       0.01%
                    Deletion average length |       1.93
                    Insertion rate per base |       0.02%
                   Insertion average length |       1.83
                         MULTI-MAPPING READS:
    Number of reads mapped to multiple loci |       2756203
         % of reads mapped to multiple loci |       26.24%
    Number of reads mapped to too many loci |       38499
         % of reads mapped to too many loci |       0.37%
                              UNMAPPED READS:
   % of reads unmapped: too many mismatches |       0.00%
             % of reads unmapped: too short |       7.23%
                 % of reads unmapped: other |       0.58%
                              CHIMERIC READS:
                   Number of chimeric reads |       0
                        % of chimeric reads |       0.00%

The second Log.final.out file, from a run that produced an empty BAM, looks like this:

more Log.final.out

                             Started job on |       May 14 17:25:16
                         Started mapping on |       May 14 17:33:08
                                Finished on |       May 14 17:33:10
   Mapping speed, Million of reads per hour |       0.00

                      Number of input reads |       0
                  Average input read length |       0
                                UNIQUE READS:
               Uniquely mapped reads number |       0
                    Uniquely mapped reads % |       0.00%
                      Average mapped length |       0.00
                   Number of splices: Total |       0
        Number of splices: Annotated (sjdb) |       0
                   Number of splices: GT/AG |       0
                   Number of splices: GC/AG |       0
                   Number of splices: AT/AC |       0
           Number of splices: Non-canonical |       0
                  Mismatch rate per base, % |       -nan%
                     Deletion rate per base |       0.00%
                    Deletion average length |       0.00
                    Insertion rate per base |       0.00%
                   Insertion average length |       0.00
                         MULTI-MAPPING READS:
    Number of reads mapped to multiple loci |       0
         % of reads mapped to multiple loci |       0.00%
    Number of reads mapped to too many loci |       0
         % of reads mapped to too many loci |       0.00%
                              UNMAPPED READS:
   % of reads unmapped: too many mismatches |       0.00%
             % of reads unmapped: too short |       0.00%
                 % of reads unmapped: other |       0.00%
                              CHIMERIC READS:
                   Number of chimeric reads |       0
                        % of chimeric reads |       0.00%

I have not modified the command between the two jobs (apart from the fact that the files are in different directories). Here is the generic command:

    STAR --genomeDir /users/PFS0231/cls0226/Createindex2/ --runThreadN 1 --readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/ file.fastq --outSAMtype BAM Unsorted

I am perplexed as to why I am having this problem. Any help is really appreciated.

Thanks

RNA-Seq • 2.8k views

score 1 · Answer 1 · 2019-05-14

5.5 years ago

manuel.belmadani ★ 1.4k

--readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/ file.fastq

It looks like there is a space between your directory and the file name, so the shell passes them to STAR as two separate arguments instead of one path.

Try:

--readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/file.fastq
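A pre-flight check can catch this kind of typo before a job is submitted. The following is a minimal sketch, not part of STAR: `check_reads` and `count_args` are hypothetical helper names introduced here for illustration, and the check only assumes that every read path should be a regular, non-empty file.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of STAR), for illustration only.

# Succeed only if every argument is a regular, non-empty file.
# A directory, a missing path, or an empty file makes it fail.
check_reads() {
    local path
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "not a regular file: $path" >&2
            return 1
        fi
        if [ ! -s "$path" ]; then
            echo "empty file: $path" >&2
            return 1
        fi
    done
}

# Print how many arguments the shell actually passed.
# A stray space in an unquoted path yields 2 instead of 1.
count_args() {
    echo "$#"
}
```

With the original command line, `check_reads /users/PFS0231/cls0226/output/ALzt14-1/232/ file.fastq` would fail on the first argument because it is a directory. Storing the path in a quoted variable, for example `reads="/users/PFS0231/cls0226/output/ALzt14-1/232/file.fastq"`, and running `check_reads "$reads"` before passing `"$reads"` to `--readFilesIn` is one way to guard each job.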

Also, for future posts, note how much easier this is to spot in a code block than in the plain-text version you posted.

5.5 years ago by manuel.belmadani ★ 1.4k

Thank you so much, Manuel. It was such a silly mistake, and now I feel stupid. Your time and help are much appreciated.

5.5 years ago by makwana.kd ▴ 60

No problem! :) I've done similar if not exactly the same thing more than once haha.

Mark the answer as accepted if the issue is resolved.
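One way to confirm the reruns worked is to scan the Log.final.out files for runs that still report zero input reads. This is a rough sketch that relies only on the `Number of input reads |` line shown in the logs above. `input_reads` and `flag_empty_runs` are hypothetical helper names, and a log missing that line is treated as zero reads by assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of STAR), for illustration only.

# Print the value on the "Number of input reads" line of a Log.final.out file.
input_reads() {
    awk -F'|' '/Number of input reads/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Report every log whose input read count is zero.
# Assumption: a log without that line is also reported.
flag_empty_runs() {
    local log n
    for log in "$@"; do
        n=$(input_reads "$log")
        if [ "${n:-0}" -eq 0 ]; then
            echo "no input reads: $log"
        fi
    done
}
```

For example, `flag_empty_runs */Log.final.out` run from the directory that holds the per-sample output folders would list only the runs that need another look.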

5.5 years ago by manuel.belmadani ★ 1.4k