Question

RSeQC infer_experiment.py reports different experiment types depending on the BED file

4.2 years ago

Folder40g ▴ 190

I'm trying to infer the experiment type (library strandedness) of my paired-end RNA-seq data. I used STAR to index the genome and map the paired reads (as unstranded), then ran RSeQC's infer_experiment.py on the alignments. Depending on the BED file I use, I get different results. The genome is the GENCODE primary assembly:

GRCm39.primary_assembly.genome.fa

Can anyone guess why these differences are happening?

Method 1 (BED converted from the GENCODE GTF)

Convert GTF to BED

bedparse gtf2bed gencode.vM27.annotation.gtf > gencode.vM27.annotation.gtf.bed

Code

infer_experiment.py -r /local/ref/gencode.vM27.annotation.gtf.bed -i /local/out/star/ADU2124Aligned.sortedByCoord.out.bam

Result:

This is PairEnd Data
Fraction of reads failed to determine: 0.1017
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0080
Fraction of reads explained by "1+-,1-+,2++,2--": 0.8903

Method 2 (BED file provided by RSeQC)

Code

infer_experiment.py -r /local/ref/mm10_Gencode_VM18.bed -i /local/out/star/ADU2124Aligned.sortedByCoord.out.bam

Result:

This is PairEnd Data
Fraction of reads failed to determine: 0.0577
Fraction of reads explained by "1++,1--,2+-,2-+": 0.3863
Fraction of reads explained by "1+-,1-+,2++,2--": 0.5560
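For readers comparing the two outputs above, here is a minimal sketch of how the reported fractions could be turned into a strandedness call. The function names and the 0.8 cutoff are assumptions made for illustration; they are not part of RSeQC. The mapping of "1++,1--,2+-,2-+" to forward-stranded and "1+-,1-+,2++,2--" to reverse-stranded follows the usual reading of infer_experiment.py output.

```python
import re

# Pattern for the "Fraction of reads ..." lines printed by infer_experiment.py.
_LINE = re.compile(
    r'\s*Fraction of reads (failed to determine|explained by "([^"]+)"):\s*([0-9.]+)'
)

def parse_infer_experiment(text):
    """Return a dict of fractions keyed by read configuration (or "failed")."""
    fractions = {}
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            key = m.group(2) or "failed"
            fractions[key] = float(m.group(3))
    return fractions

def call_strandedness(fractions, threshold=0.8):
    """Classify the library; `threshold` is an arbitrary, illustrative cutoff."""
    forward = fractions.get("1++,1--,2+-,2-+", 0.0)
    reverse = fractions.get("1+-,1-+,2++,2--", 0.0)
    if forward >= threshold:
        return "forward-stranded"
    if reverse >= threshold:
        return "reverse-stranded"
    return "unstranded or ambiguous"
```

With this cutoff, the Method 1 numbers (0.0080 vs 0.8903) would be called reverse-stranded, while the Method 2 numbers (0.3863 vs 0.5560) would be flagged as ambiguous, which is a hint that something is off with that BED file rather than with the library.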

Thanks!

rseqc • 1.4k views

Updated 4.2 years ago by GenoMax 152k • written 4.2 years ago by Folder40g ▴ 190

score 1 · Accepted Answer · 2021-05-26

4.2 years ago

Folder40g ▴ 190

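Well, it seems I was using the wrong BED file from RSeQC and didn't notice until now. That BED is built on mm10 (GRCm38), but my reads were aligned to GRCm39, so the annotation coordinates don't match the alignments: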

BED: mm10_Gencode_VM18.bed

Reference: GRCm39.primary_assembly.genome.fa
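A rough sanity check for this kind of mix-up can be sketched as follows. It compares a BED file against the chromosome lengths in a FASTA index (the `.fai` file that `samtools faidx` writes next to the genome). The function names are hypothetical, and the check is weak: mm10 and GRCm39 share chromosome names and have similar lengths, so a clean result does not prove that the BED matches the reference. Checking the assembly named in the file or its download page remains the reliable test.

```python
def read_fai_lengths(fai_lines):
    """Map sequence name -> length from .fai lines (name<TAB>length<TAB>...)."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths

def check_bed_against_genome(bed_lines, lengths):
    """Count BED records on unknown sequences or ending past the sequence end."""
    unknown = set()
    out_of_bounds = 0
    total = 0
    for line in bed_lines:
        # Skip blank lines and optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        chrom, end = fields[0], int(fields[2])
        total += 1
        if chrom not in lengths:
            unknown.add(chrom)
        elif end > lengths[chrom]:
            out_of_bounds += 1
    return {"total": total, "unknown": unknown, "out_of_bounds": out_of_bounds}
```

A possible usage, with the file names from this thread standing in for local paths:

```python
with open("GRCm39.primary_assembly.genome.fa.fai") as fai, open("mm10_Gencode_VM18.bed") as bed:
    report = check_bed_against_genome(bed, read_fai_lengths(fai))
print(report)
```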

Don't worry. We have all been there at one time or other :-)

I assume your problem has been solved? If so I can move your comment to an answer which you can accept to provide closure to this thread.

ADD REPLY • link 4.2 years ago by GenoMax 152k