Hello
I have a question regarding RNA-Seq. I am fairly new to this, so I may phrase some things incorrectly. I apologize beforehand.
I have two files that contain paired-end reads. I also have a reference transcriptome (human). I used RapMap _(Srivastava et al.)_ to map/align the reads back onto the reference transcriptome. The result is one large SAM file.
I want to generate the apparent fragments that were sequenced. A fragment is defined as the part of the mRNA (cDNA) that got sequenced, so one transcript (mRNA/cDNA) can give rise to many fragments.
Is there a way to generate the apparent fragments? I think I have enough information because of the paired-end reads, but I have no clue how to access the relevant information.
Thanks in advance.
This is about the best illustration there is to depict fragments/inserts. So you want to recreate the fragments that got sequenced by pulling out the sequence from the reference (the part labeled "insert size" in that graphic)? This is a bit of an odd request. Is there a specific purpose for it?
I want to perform some data analysis of RNA-Seq data. I thought it should be easy to extract the predicted fragments based on the paired-end reads, but I find it rather difficult.
There is no need to extract the sequence of the fragments. Feature counting programs (featureCounts or htseq-count) take into account the entire fragment when they generate the counts for the genes.
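If the fragment coordinates are still wanted, they can in principle be read off the SAM records themselves. The following is only a minimal sketch, assuming that the aligner fills in the standard SAM fields `FLAG`, `POS`, `RNEXT` and `TLEN` for both mates (worth checking on a few lines of the actual RapMap output first). The names `Fragment`, `iter_fragments` and `fragment_sequence` are made up for this illustration. For each properly paired alignment it takes the leftmost mate (the one with a positive `TLEN`) and reports the fragment as the interval from `POS` to `POS + TLEN - 1` on the transcript.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Fragment(NamedTuple):
    read_name: str
    transcript: str
    start: int  # 1-based, inclusive (same convention as SAM POS)
    end: int    # 1-based, inclusive


def iter_fragments(sam_lines: Iterable[str]) -> Iterator[Fragment]:
    """Yield one fragment per properly paired alignment in SAM text lines."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname, pos = fields[2], int(fields[3])
        rnext, tlen = fields[6], int(fields[8])
        paired = bool(flag & 0x1)
        proper = bool(flag & 0x2)
        same_ref = rnext == "=" or rnext == rname
        # Assumption: TLEN is signed, positive on the leftmost mate only,
        # so each pair is reported once.
        if paired and proper and same_ref and tlen > 0:
            yield Fragment(fields[0], rname, pos, pos + tlen - 1)


def fragment_sequence(fragment: Fragment, transcripts: Dict[str, str]) -> str:
    """Cut the fragment out of a dict mapping transcript name to sequence."""
    return transcripts[fragment.transcript][fragment.start - 1:fragment.end]
```

Usage would look like `for frag in iter_fragments(open("mapped.sam")): ...`, where `mapped.sam` is a placeholder file name. Note that a read pair that maps to several transcripts yields one candidate fragment per reported alignment, so the output is a set of possible fragments rather than a single true one per pair.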