Question

What are the possible causes of unbalanced strand mapping in RNA-seq?

0

7.0 years ago

salvatore.digiorgio ▴ 10

I'm trying to develop a pipeline to search for novel RNA editing events.

[Image not available: table of candidate positions]

The last 4 columns are Forward_ref_cov, Forward_alt_cov, Reverse_ref_cov, Reverse_alt_cov

Could this be an artifact? How can I remove these kinds of positions?
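One common way to screen such positions is a Fisher exact test on the four per-strand counts. The following is only a minimal sketch of that idea, not something taken from the thread: the function names and the `alpha` cutoff are hypothetical, and the table layout assumes the column order given above (Forward_ref_cov, Forward_alt_cov, Reverse_ref_cov, Reverse_alt_cov).

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def strand_biased(fwd_ref, fwd_alt, rev_ref, rev_alt, alpha=0.01):
    """Flag a position whose alt allele is unevenly split between strands.

    Rows are strands, columns are alleles; alpha is an illustrative cutoff.
    """
    return fisher_two_sided(fwd_ref, fwd_alt, rev_ref, rev_alt) < alpha


print(strand_biased(20, 0, 0, 20))    # alt reads only on the reverse strand
print(strand_biased(10, 10, 10, 10))  # alleles evenly split across strands
```

Note that this test compares how the two alleles are distributed across strands. A position covered on one strand only (for example 20, 10, 0, 0) gives a p-value of 1 and is not flagged, so it would need a separate check on total per-strand coverage.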

RNA-Seq strand bias • 2.0k views

ADD COMMENT • link 7.0 years ago by salvatore.digiorgio ▴ 10

1

Can you do this instead of the link above: How to add images to a Biostars post

ADD REPLY • link 7.0 years ago by GenoMax 153k

0

Do you mean why the reads align only to the reverse strand and not the forward? RNA-seq, or rather transcription, is orientation-specific, so if a strand-aware library prep was used, this is exactly what you would expect to see. If you need more details, please give more information on what you want to do.
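To check how reads are distributed across strands over a region, the alignment strand can be read from the SAM FLAG field (column 2 of `samtools view` output), where bit 0x10 marks a read aligned to the reverse strand. A minimal sketch, assuming the flags have already been extracted as integers; the function name is hypothetical:

```python
def strand_counts(flags):
    """Count primary mapped reads per strand from SAM FLAG integers."""
    UNMAPPED, REVERSE, SECONDARY, SUPPLEMENTARY = 0x4, 0x10, 0x100, 0x800
    fwd = rev = 0
    for flag in flags:
        # Skip reads that would distort the per-strand tally.
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        if flag & REVERSE:
            rev += 1
        else:
            fwd += 1
    return fwd, rev


# Example flags: two forward reads, two reverse reads, three skipped records.
print(strand_counts([0, 16, 99, 147, 4, 256, 2064]))
```

For paired-end data the two mates of a pair align to opposite strands, so this raw tally describes read orientation only; it does not by itself say which strand the transcript came from.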

ADD REPLY • link 7.0 years ago by ATpoint 89k

0

Yes, I mean that some of the positions I found show reads aligned to only one strand. I'm using TCGA RNA-seq data, and I read that unstranded library preparation was used; indeed, most of my positions show reads mapping to either the forward or the reverse strand.

ADD REPLY • link 7.0 years ago by salvatore.digiorgio ▴ 10

0

I moved your reply to a comment in order to keep this thread organized. Can you provide a link to the exact source of the data?

ADD REPLY • link 7.0 years ago by ATpoint 89k

0

The link to my data is https://portal.gdc.cancer.gov/. Information about the data I used is at https://cancergenome.nih.gov/abouttcga/aboutdata/platformdesign (RNASeqV2).

ADD REPLY • link 7.0 years ago by salvatore.digiorgio ▴ 10

0

And where does it say that this is an unstranded library? I could not find that right away by following the link.

ADD REPLY • link 7.0 years ago by ATpoint 89k

0

I'm sorry, the TCGA website is messy and it is difficult to find the information. This is what I found: http://embor.embopress.org/content/embor/early/2014/02/17/embr.201337950/DC26/embed/inline-supplementary-material-26.pdf?download=true

There is also an already answered question on this site at this link: A: Is TCGA PRAD RNA-seq data strand specific?

ADD REPLY • link 7.0 years ago by salvatore.digiorgio ▴ 10

0

Did you do this mapping yourself? If so, you should provide details about how you did that.

ADD REPLY • link 7.0 years ago by GenoMax 153k

0

The BAM files I used were prepared according to this pipeline: https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/ On the BAM files I ran samtools and bcftools.

samtools mpileup  -Q 15 --max-idepth 8000 -vf GRCh38.d1.vd1.fasta file.bam | bcftools annotate -e "DP<7" | bcftools call -m -V indels | bgzip -c > file.vcf.gz
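Per-strand counts can also be pulled from the resulting VCF. The sketch below assumes the INFO column carries a DP4 tag in the order ref-forward, ref-reverse, alt-forward, alt-reverse, which is how bcftools call normally writes it; this differs from the column order in the table above. The function names and the `min_alt` cutoff are hypothetical.

```python
def parse_dp4(info):
    """Return (ref_fwd, ref_rev, alt_fwd, alt_rev) from a VCF INFO string.

    Returns None when no well-formed DP4 tag is present.
    """
    for field in info.split(";"):
        if field.startswith("DP4="):
            values = tuple(int(v) for v in field[4:].split(","))
            return values if len(values) == 4 else None
    return None


def alt_on_one_strand(info, min_alt=4):
    """True if at least min_alt alt reads exist and all sit on one strand."""
    dp4 = parse_dp4(info)
    if dp4 is None:
        return False
    _, _, alt_fwd, alt_rev = dp4
    return alt_fwd + alt_rev >= min_alt and min(alt_fwd, alt_rev) == 0


# Illustrative INFO strings, not taken from real data.
print(alt_on_one_strand("DP=30;DP4=10,12,0,8;MQ=60"))
print(alt_on_one_strand("DP=30;DP4=10,12,4,4;MQ=60"))
```

Whether a site flagged this way is an artifact or simply lies in a region transcribed and covered on one strand is a judgment the counts alone cannot settle.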

ADD REPLY • link 7.0 years ago by salvatore.digiorgio ▴ 10