How to recreate FASTQ from BAM
0
0
Entering edit mode
6.0 years ago
simplitia ▴ 130

Hi all I been testing a paired RNA.seq sample. I start with a paired of FASTQ files and do alignment with STAR and run STAR-Fusion. Everything looks good until I try to reverse the BAM files generated from STAR originally to fastq and rerun STAR-Fusion; this resulted in about half the fusions that as originally called. So making a mistake somewhere when converting from BAM to fastq. So far I tried several tools including samtools fastq, picard RevertSam to a ubam and then SamToFastq, I also tried bamtofastq but nothing works. Here is what I did with the original STAR alingment.

STAR --genomeDir /index/hg38.p5/ \
--readFilesIn T1_1.fq.gz T1_2.fq.gz \
--readFilesCommand zcat --outFileNamePrefix T1 --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts --sjdbGTFfile /index/hg38.gtf

this is how the Bam looks like.

samtools flagstat T1Aligned.sortedByCoord.out.bam

121206267 + 0 in total (QC-passed reads + QC-failed reads)
11756519 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
121206267 + 0 mapped (100.00% : N/A)
109449748 + 0 paired in sequencing
54766664 + 0 read1
54683084 + 0 read2
109354332 + 0 properly paired (99.91% : N/A)
109354332 + 0 with itself and mate mapped
95416 + 0 singletons (0.09% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Here is what I tried with samtools.

samtools sort -n --threads 4 ./T1Aligned.sortedByCoord.out.bam|samtools fastq -1 T1_R1_.fq -2 T1_R2_.fq -0 test.fq -

picard

/picard-tools-2.2.4/picard.jar RevertSam I=T1Aligned.sortedByCoord.out.bam O=T1.ubam QUIET=true VALIDATION_STRINGENCY=SILENT

/picard-tools-2.2.4/picard.jar SamToFastq INPUT=T1.ubam FASTQ=T1_R1_.fq SECOND_END_FASTQ=T2_R2_.fq VALIDATION_STRINGENCY=SILENT

I also tried bamtofastq but nothing seem to be able to convert the bam back to where I was able to call the fusions correctly. Any suggestions? Thanks!

RNA-Seq BAM sequencing • 4.3k views
ADD COMMENT
2
Entering edit mode

I rarely use STAR-Fusion, but I'd assume the reconstructed fastqs are missing unmapped reads from the normal alignment, which contain reads that are used to build chimeric and circular alignments. You might need to use --outSAMunmapped Within.

ADD REPLY
0
Entering edit mode

this looks promising. I'm going try this and report back. Thanks.

ADD REPLY
0
Entering edit mode

Cross-posted on SeqAnswers.

What you can try (and what is actually recommended) is to exclude everything that is not a primary alignment, using the flag -F 2304 in samtools fastq.

ADD REPLY
0
Entering edit mode

It's been an eternity, but isn't it -f 64 for R1 and -f 128 for R2 ; per the sam specifications?

ADD REPLY

Login before adding your answer.

Traffic: 1547 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6