Infer direction of paired end data
6.1 years ago

I am using publicly available data and trying to infer the strand orientation of a paired-end library. I aligned the filtered FASTQ files against the C. elegans reference genome with HISAT2 and converted the SAM output to a sorted BAM with samtools. I then tried two approaches. In the first, I ran infer_experiment.py from the RSeQC package and got this output:

This is PairEnd Data
Fraction of reads failed to determine: 0.0659
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0055
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9285

This suggests that the RNA-Seq data are strand-specific: read 1 maps to the strand opposite the gene model, while read 2 maps to the same strand as the gene model.
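To make that reading of the two fractions explicit, here is a minimal sketch in shell. The function name `classify_strandedness` and the 0.8 cutoff are my own assumptions for illustration; they are not part of RSeQC. It only compares the two "explained by" fractions and ignores the undetermined reads.

```shell
# classify_strandedness FWD_FRACTION REV_FRACTION   (hypothetical helper)
#   FWD_FRACTION: fraction explained by "1++,1--,2+-,2-+"
#   REV_FRACTION: fraction explained by "1+-,1-+,2++,2--"
# The 0.8 cutoff is an illustrative assumption, not an RSeQC default.
classify_strandedness() {
  awk -v fwd="$1" -v rev="$2" -v cutoff=0.8 'BEGIN {
    total = fwd + rev                      # reads whose orientation was determined
    if (total <= 0)                 print "undetermined"
    else if (fwd / total >= cutoff) print "forward-stranded"
    else if (rev / total >= cutoff) print "reverse-stranded"
    else                            print "unstranded"
  }'
}

# Fractions taken from the infer_experiment.py output above
classify_strandedness 0.0055 0.9285
```

With the numbers from this post, almost all determinable reads fall in the second category, which is why the library looks reverse-stranded.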

I used these resources to interpret the output:

http://rseqc.sourceforge.net/#infer-experiment-py

RSeQC Output from infer_experiment.py - what does it mean?

In the second approach, I indexed the BAM file with samtools and loaded it into IGV. I used the "Color alignments by first-of-pair strand" option to infer strandedness and got this output: IGV_output

I see both red and blue reads, and this link (http://software.broadinstitute.org/software/igv/book/export/html/6) states: "For a given transcript, non-directional libraries will show a mix of red and blue reads aligning to the locus."

The results of infer_experiment.py and IGV seem contradictory to me: infer_experiment.py indicates that the library is reverse-stranded, while IGV suggests that it is non-directional.

Can someone please clarify these things in more detail?

Many thanks in advance.

RNA-Seq • 2.3k views

See "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).

6.1 years ago

The best way to understand your data is to split your BAM file in two: one file that contains only the reads from file 1 (first in pair) and another that contains only the reads from file 2 (basically, you are breaking up the pairs).

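 # -b writes BAM; -f 64 keeps reads flagged as first in pair
 samtools view -b -f 64 all.bam > read1.bam
 # -F 64 excludes first-in-pair reads, leaving the second mates
 samtools view -b -F 64 all.bam > read2.bam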

Now load both files in IGV. Given your infer_experiment.py result, the reads in read2.bam should align in the same direction as the transcript, while the reads in read1.bam should align to the strand opposite to the transcript.
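If you prefer numbers to eyeballing colors, the same check can be sketched as a small awk tally over SAM records. This is only an illustration under stated assumptions: `count_mate_strands` is a hypothetical helper name, and it relies solely on the standard SAM FLAG bits (4 = unmapped, 16 = reverse strand, 64 = first in pair, 128 = second in pair).

```shell
# count_mate_strands (hypothetical helper): read SAM records on stdin and
# tally, for each mate, how many alignments are on the forward vs. reverse strand.
count_mate_strands() {
  awk -F '\t' '
    /^@/ { next }                         # skip SAM header lines
    {
      flag = $2
      if (int(flag / 4) % 2) next         # bit 4 set: unmapped, skip
      if (int(flag / 64) % 2)       mate = "read1"
      else if (int(flag / 128) % 2) mate = "read2"
      else next                           # neither mate bit set, skip
      strand = (int(flag / 16) % 2) ? "reverse" : "forward"
      count[mate " " strand]++
    }
    END {
      n = split("read1 forward,read1 reverse,read2 forward,read2 reverse", keys, ",")
      for (i = 1; i <= n; i++) print keys[i], count[keys[i]] + 0
    }'
}

# Example (the region is a placeholder; pick one covering a single gene of known strand):
#   samtools view all.bam "I:1000-5000" | count_mate_strands
```

For a region that covers only a plus-strand gene, a reverse-stranded library should give mostly "read1 reverse" and "read2 forward"; a roughly even split within each mate would point to an unstranded library.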
