Simulating mRNA-seq data using VG - theoretical question
3 months ago
AshleeThomson ▴ 110

Hi everyone,

I'm attempting to simulate mRNA-seq data to map back to my spliced genome graph for a baseline comparison (graph vs linear reference). To summarise, I'm using a real mRNA-seq data set to match the error profile and RSEM to calculate expression, then using vg sim to simulate the reads. With all the intermediate steps and files, this is becoming a pain and I'm running into a lot of issues. I had an idea, though, and wanted to put it out there for some feedback.

Currently, my graph represents the full genome (introns, exons, etc.) to which I have added splice junctions using vg rna.

IN THEORY, if I were to run the vg rna step again but remove any non-gene regions (-d, --remove-non-gene), which results in an exon-only graph, and then run vg sim on this version of the graph, would this produce mRNA-like data?

I may be completely off the mark but there's no harm in asking.


Yes, I believe this should be equivalent to using the full graph. Note, though, that some features of real RNA-seq data still aren't modeled by vg sim, such as intron retention in nascent mRNA, stochastic transcription of non-genic sequences, and expression of many ncRNAs. These limitations apply equally to the full and exon-only graphs.
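
To make the intuition concrete, here is a toy Python sketch (not vg itself, and not how vg sim is implemented) of why sampling from exon-only sequence gives mRNA-like reads. It assumes reads are drawn uniformly along a single spliced transcript; the genome string, exon coordinates, and the function names `splice` and `simulate_reads` are all made up for illustration. Exons are uppercase and introns lowercase so that any intronic base in a read would be easy to spot.

```python
import random


def splice(genome, exons):
    """Join exon intervals (0-based, half-open) into one transcript string."""
    return "".join(genome[start:end] for start, end in exons)


def simulate_reads(transcript, n_reads, read_len, seed=0):
    """Sample error-free reads at uniform random positions on the transcript."""
    rng = random.Random(seed)
    last_start = len(transcript) - read_len
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)  # inclusive on both ends
        reads.append((start, transcript[start:start + read_len]))
    return reads


# Hypothetical toy genome: three 10 bp exons separated by two 10 bp introns.
genome = (
    "ACGTACGTAC" + "gtaaggttag" +
    "TTGACCATGG" + "gtccctccag" +
    "CATCGATCGA"
)
exons = [(0, 10), (20, 30), (40, 50)]

transcript = splice(genome, exons)
read_len = 8
reads = simulate_reads(transcript, n_reads=20, read_len=read_len, seed=42)

# Exon-exon junctions in transcript coordinates (cumulative exon lengths).
junctions = [10, 20]
spanning = [
    seq for start, seq in reads
    if any(start < j < start + read_len for j in junctions)
]

print(f"{len(reads)} reads, {len(spanning)} span a splice junction")
```

Every read here consists only of exonic bases, and reads that cross a junction join two exons directly, which is the behaviour expected of mRNA-seq reads. The sketch ignores sequencing errors, expression levels, and alternative isoforms, all of which the real workflow described in the question would need to handle.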
