Detecting a small set of given intron-exon junctions in an RNA-Seq dataset
8.6 years ago

Hi all, I've been given a set of RNA-Seq fastqs (4 samples), and a collaborator would like to know how many of a defined set of intron-exon junctions in his favorite gene are detected in those reads.

My idea was to clean the reads with cutadapt, align them with Bowtie, and write a custom program that loops over the CIGAR strings to detect the junctions.
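For what it's worth, the CIGAR-scanning idea could be sketched roughly as below. This is only an illustration and rests on a few assumptions: the alignments come from a splice-aware aligner that writes `N` operations for introns (plain Bowtie does not produce spliced alignments), the input is uncompressed SAM text, and the target junctions are given as 1-based inclusive intron coordinates. The function names `junctions_from_cigar` and `count_junctions` are made up for this example.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def junctions_from_cigar(pos, cigar):
    """Return (intron_start, intron_end) pairs, 1-based inclusive, one per N op.

    pos is the 1-based leftmost mapping position (SAM POS field).
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region is the intron: first and last intronic base.
            junctions.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return junctions


def count_junctions(sam_lines, targets):
    """Count reads spanning each target (chrom, intron_start, intron_end)."""
    counts = {t: 0 for t in targets}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read or no alignment
        for start, end in junctions_from_cigar(pos, cigar):
            key = (chrom, start, end)
            if key in counts:
                counts[key] += 1
    return counts
```

Something like `count_junctions(open("sample.sam"), {("chr1", 150, 349)})` would then give a per-junction read count (the file name and coordinates are placeholders). Reads are not deduplicated or filtered by mapping quality here.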

Or is there an existing tool for this? Bonus if the tool can tell me whether two junctions belong to the same transcript.

rna-seq exon intron junction • 3.2k views
8.6 years ago
igor 13k

If you use the STAR aligner, it will output SJ.out.tab, which lists high-confidence collapsed splice junctions in tab-delimited format.
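As a rough sketch of how a handful of known junctions could be looked up in that file: the code below assumes the SJ.out.tab layout described in the STAR manual, where column 1 is the chromosome, columns 2 and 3 are the first and last base of the intron (1-based), and column 7 is the number of uniquely mapping reads crossing the junction. The function name `read_sj_counts` and the target coordinates are hypothetical.

```python
import csv


def read_sj_counts(path, targets):
    """Look up unique-read support for target junctions in a STAR SJ.out.tab.

    targets: iterable of (chrom, intron_start, intron_end), 1-based inclusive.
    Returns a dict mapping each target to its unique-read count (0 if absent).
    """
    found = {t: 0 for t in targets}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 7:
                continue  # skip blank or malformed lines
            key = (row[0], int(row[1]), int(row[2]))
            if key in found:
                # Column 7: uniquely mapping reads crossing the junction.
                found[key] = int(row[6])
    return found
```

Calling `read_sj_counts("SJ.out.tab", {("chr1", 150, 349)})` once per sample would give a small table of junction support. Multi-mapping reads (column 8) are ignored in this sketch.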


That was simple and it worked fine, thanks.

8.6 years ago

You can use RegTools to get this information from a BAM file. First align your reads to the reference genome with a splice-aware aligner such as TopHat2, STAR, or HISAT2. Then use regtools to both extract and annotate the exon-exon junctions (i.e. exon-intron ... intron-exon).

Code is here: https://github.com/griffithlab/regtools

Documentation is here: https://regtools.readthedocs.io/en/latest/

You would start with regtools junctions extract, followed by regtools junctions annotate.

Usage looks like this:

regtools junctions extract [options] indexed_alignments.bam
regtools junctions annotate [options] junctions.bed ref.fa annotations.gtf

Among other things, the annotate result will tell you which junctions correspond to which transcripts, what novel junctions might be observed in your RNA-seq data, how these relate to known transcripts, etc. The annotations.gtf file could be a list of known transcript annotations. For example, we often use the GTF files that you can download directly from Ensembl.
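To check whether two junctions are annotated to the same transcript, the annotate output could be post-processed along these lines. This is a sketch under assumptions that should be verified against your own file: that the output is tab-delimited with a header line, that it has columns named `chrom`, `start`, `end` and `transcripts`, and that the transcripts column holds comma-separated IDs. The column names are parameters so they can be adjusted, and both function names are made up for the example.

```python
import csv
from collections import defaultdict


def junctions_by_transcript(handle, chrom_col="chrom", start_col="start",
                            end_col="end", transcripts_col="transcripts"):
    """Group junctions by annotated transcript ID.

    handle: open text file with a header line (column names are assumptions).
    Returns a dict mapping transcript ID to a set of (chrom, start, end).
    """
    by_tx = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        junction = (row[chrom_col], int(row[start_col]), int(row[end_col]))
        tx_field = row.get(transcripts_col) or ""
        for tx in tx_field.split(","):
            tx = tx.strip()
            if tx and tx != "NA":
                by_tx[tx].add(junction)
    return dict(by_tx)


def shared_transcripts(by_tx, junction_a, junction_b):
    """Return the sorted transcript IDs annotated with both junctions."""
    return sorted(tx for tx, juncs in by_tx.items()
                  if junction_a in juncs and junction_b in juncs)
```

Note that this only says the two junctions occur together in an annotated transcript model. It does not show that they were observed on the same read or the same RNA molecule.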
