Question

Which annotation to use, and is it advisable to convert BAM to FASTQ?

7.5 years ago

novicebioinforesearcher ▴ 70

I need to reanalyze a dataset (RNA-seq, paired-end, stranded, 100 bp) that was previously analyzed with a very old pipeline using the mm9 reference for all downstream analysis. The files have been stored as BAM (after alignment).

When using alignment tools like STAR or HISAT2, which annotation should one use: the very latest one (e.g. mm10.p7) or mm9? What factors may influence the downstream analysis?
Since I do not have access to the original instrument data or the FASTQ files, I would have to convert BAM to FASTQ using bedtools. Is this a best practice? Would I lose any information when I realign the FASTQs? What do I need to take into account if I choose to do this? If there is soft or hard clipping, will this affect the conversion to FASTQ?

RNA-Seq alignment • 1.7k views

ADD COMMENT • link updated 7.4 years ago by WouterDeCoster 47k • written 7.5 years ago by novicebioinforesearcher ▴ 70

score 2 · Accepted Answer · 2017-07-11

7.5 years ago

dyollluap ▴ 310

If you have the resources, I'd suggest aligning to both mm9 and mm10 with the same pipeline. That future-proofs the analysis while keeping it backward compatible with the earlier results.
You should find the relevant sequencing run details (instrument data, etc.) in the @RG lines of the BAM header, which can be used to split the BAMs back into the original FASTQ files.
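
To see what the @RG lines offer before splitting anything, the header can be dumped with `samtools view -H in.bam` and parsed. The sketch below is only an illustration: the function name `parse_read_groups` and the example header are made up, and it assumes the old pipeline actually wrote @RG lines, which is optional in the SAM format.

```python
def parse_read_groups(header_text):
    """Return one dict per @RG header line, mapping two-letter tags to values."""
    groups = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        fields = {}
        # Fields after the record type are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            tag, _, value = field.partition(":")  # split on the first colon only
            fields[tag] = value
        groups.append(fields)
    return groups


# Hypothetical header text, for illustration only (not from the poster's data).
example_header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@RG\tID:lane1\tSM:sampleA\tPL:ILLUMINA\tPU:FLOWCELL1.1\n"
    "@RG\tID:lane2\tSM:sampleA\tPL:ILLUMINA\tPU:FLOWCELL1.2\n"
)

for rg in parse_read_groups(example_header):
    print(rg["ID"], rg.get("SM"), rg.get("PU"))
```

If several read groups show up, each one probably corresponds to a separate lane or run, so it may be worth writing one FASTQ pair per read group rather than one per BAM.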


score 2 · Accepted Answer · 2017-07-12

7.4 years ago

WouterDeCoster 47k

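You can also use samtools fastq to convert BAM to FASTQ. Soft clipping will not affect the FASTQ, because soft-clipped bases are still stored in the BAM record, but hard clipping would, because hard-clipped bases are removed from the record.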
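
A rough way to check both points is sketched below. `clip_counts` tallies soft-clipped (`S`) and hard-clipped (`H`) bases in a CIGAR string, and `fastq_pipeline` only builds the argument lists for the collate-then-fastq pipeline shown in the samtools manual; it does not run anything. The function names and file names are placeholders, not part of any existing tool.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_counts(cigar):
    """Return (soft_clipped, hard_clipped) base counts for a CIGAR string."""
    soft = hard = 0
    for length, op in CIGAR_OP.findall(cigar):
        if op == "S":
            soft += int(length)  # bases still present in the record's sequence
        elif op == "H":
            hard += int(length)  # bases absent from the record's sequence
    return soft, hard


def fastq_pipeline(bam, r1, r2):
    """Build argument lists for: samtools collate | samtools fastq (not executed)."""
    # collate groups mates together so paired reads are written in matching order.
    collate = ["samtools", "collate", "-u", "-O", bam]
    # -0 and -s discard unflagged and singleton reads; -n leaves read names unchanged.
    fastq = ["samtools", "fastq", "-1", r1, "-2", r2,
             "-0", "/dev/null", "-s", "/dev/null", "-n"]
    return collate, fastq


print(clip_counts("5S90M5H"))
collate_cmd, fastq_cmd = fastq_pipeline("in.bam", "R1.fq.gz", "R2.fq.gz")
print(" ".join(collate_cmd), "|", " ".join(fastq_cmd))
```

Running `clip_counts` over the CIGAR column of the primary alignments would show whether any reads have actually lost bases to hard clipping. By default samtools fastq skips secondary and supplementary records, which is where hard clipping usually appears, so in many BAMs nothing is lost.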
