Hi all I been testing a paired RNA.seq sample. I start with a paired of FASTQ files and do alignment with STAR and run STAR-Fusion. Everything looks good until I try to reverse the BAM files generated from STAR originally to fastq and rerun STAR-Fusion; this resulted in about half the fusions that as originally called. So making a mistake somewhere when converting from BAM to fastq. So far I tried several tools including samtools fastq, picard RevertSam to a ubam and then SamToFastq, I also tried bamtofastq but nothing works. Here is what I did with the original STAR alingment.
STAR --genomeDir /index/hg38.p5/ \
--readFilesIn T1_1.fq.gz T1_2.fq.gz \
--readFilesCommand zcat --outFileNamePrefix T1 --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts --sjdbGTFfile /index/hg38.gtf
this is how the Bam looks like.
samtools flagstat T1Aligned.sortedByCoord.out.bam
121206267 + 0 in total (QC-passed reads + QC-failed reads)
11756519 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
121206267 + 0 mapped (100.00% : N/A)
109449748 + 0 paired in sequencing
54766664 + 0 read1
54683084 + 0 read2
109354332 + 0 properly paired (99.91% : N/A)
109354332 + 0 with itself and mate mapped
95416 + 0 singletons (0.09% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
Here is what I tried with samtools.
samtools sort -n --threads 4 ./T1Aligned.sortedByCoord.out.bam|samtools fastq -1 T1_R1_.fq -2 T1_R2_.fq -0 test.fq -
picard
/picard-tools-2.2.4/picard.jar RevertSam I=T1Aligned.sortedByCoord.out.bam O=T1.ubam QUIET=true VALIDATION_STRINGENCY=SILENT
/picard-tools-2.2.4/picard.jar SamToFastq INPUT=T1.ubam FASTQ=T1_R1_.fq SECOND_END_FASTQ=T2_R2_.fq VALIDATION_STRINGENCY=SILENT
I also tried bamtofastq but nothing seem to be able to convert the bam back to where I was able to call the fusions correctly. Any suggestions? Thanks!
I rarely use STAR-Fusion, but I'd assume the reconstructed fastqs are missing unmapped reads from the normal alignment, which contain reads that are used to build chimeric and circular alignments. You might need to use
--outSAMunmapped Within
.this looks promising. I'm going try this and report back. Thanks.
Cross-posted on SeqAnswers.
What you can try (and what is actually recommended) is to exclude everything that is not a primary alignment, using the flag
-F 2304
insamtools fastq
.It's been an eternity, but isn't it -f 64 for R1 and -f 128 for R2 ; per the sam specifications?