Toy Tophat Example, finding a single junction from synthetic data
1
0
Entering edit mode
9.1 years ago

I am trying to get a better understanding of how tophat works. To do this I have created some toy examples. I am looking at a splice junction between two exons in chr9 of hg38. Exon 1 and exon 2 are at base positions 133261900 - 133262400 and 133274600 - 133276150 respectively. Below is my synthetic data

Read 1 (R1.fastq):

@exon 2, line 04...
ATTATTAGGAAAAGGATCATAGGTCGAAGTGCGTGGCATTTTGGTTTTCC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
@near exon 1...
GCCTGTTGCCCAGGTGAGGCTAAAATAAGGGAGCAAGTAAGCACACCCTC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

Read 2 (R2.fastq):

@exon 1, line 05-06...         # starts here
GGCTCCGCGTCTGGTCTCGGCCTCCGCCTTCCGCCCGGCGGCCCTGGGGC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
@near exon 1
CGCCAGAGCTGGACGCAGGCAATAACTGTCCTTAGGACCCTGATAACTGT
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

The gene is on the opposite strand with respect to the reference and I am trying to emulate paired end reads. The first entry of R1.fastq is part of exon 2 and first entry of R2.fastq is part of exon 1. The result of running

tophat2 -p 2 -o . --library-type fr-unstranded /path/to/hg38/Bowtie2Index/genome R1.fastq R2.fastq

yields 2 discordant pair alignments, but no junctions. Replacing the first line of R2.fastq with its reverse complement yields 2 concordant pair alignments, but no junctions.

Question : How would I modify my data set to get tophat to recognize a junction?

RNA-Seq tophat • 2.1k views
ADD COMMENT
0
Entering edit mode
9.1 years ago

Tophat (and similar aligners) identify junctions by finding individual reads that are spliced. That is, the splicing needs to occur somewhere in the middle of the read. So, just modify one of the reads to be longer and overlap both exon 1 and 2.

ADD COMMENT

Login before adding your answer.

Traffic: 1553 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6