Question

How To Distinguish Chimeric Transcript From Fusion Gene

3

12.0 years ago

upendrakumar.devisetty ▴ 400

Can someone tell me how to distinguish a chimeric transcript from a fusion gene? I am very confused.

I recently made a transcriptome assembly with the Velvet/Oases pipeline and BLASTed it against the reference genome. Some of the transcripts hit two different chromosomes, and others hit the same chromosome but at different locations. Should I keep these chimeric transcripts in my final assembly or throw them away?

Thanks Upendra

fusion gene • 6.1k views

ADD COMMENT • link updated 12.0 years ago by cdsouthan ★ 1.9k • written 12.0 years ago by upendrakumar.devisetty ▴ 400

0

Can you clarify what you mean by a "fusion gene"? Searching for chimeras is fairly straightforward. There could be many different reasons why your reads are hitting different locations.

ADD REPLY • link 12.0 years ago by Josh Herr 5.8k

1

By fusion gene I mean a transcript that originates from parts of two different genes joined by a translocation, deletion, etc. This is biological. What worries me is that if I throw away the chimeric transcripts, I would be throwing away biologically interesting genes. On the other hand, if I keep them, I worry that they will cause major artefacts in the analyses that follow a transcriptome assembly, such as detection of sequence or expression variation.

ADD REPLY • link 12.0 years ago by upendrakumar.devisetty ▴ 400

0

Splitting the problem in two and solving each part separately would probably give the best results:

1) First use Velvet/Oases and throw away the transcripts that map to two different chromosomes (assembly errors most likely produce many false-positive fusion genes).

2) To find fusion genes, use a specifically designed tool, for example FusionCatcher http://code.google.com/p/fusioncatcher/ (it has very good sensitivity and specificity for finding fusion genes).

3) Use the results from step 1 and step 2 together.
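Step 1 could be sketched roughly as below, assuming the BLAST results were saved in the default 12-column tabular format (`-outfmt 6`) and already filtered for hit quality. The function name `classify_transcripts` and the `min_gap` cutoff are made up for illustration. Note that this only looks at where hits land, so transcripts from multi-copy genes would be flagged as well as true chimeras.

```python
import csv
from collections import defaultdict


def classify_transcripts(blast_lines, min_gap=100_000):
    """Label each query transcript by where its BLAST hits land.

    blast_lines: iterable of tab-separated lines in BLAST -outfmt 6 order
    (qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore).
    min_gap: hypothetical distance (bp) beyond which two hits on the same
    chromosome count as "different locations".
    """
    hits = defaultdict(list)
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        qseqid, sseqid = row[0], row[1]
        sstart, send = int(row[8]), int(row[9])
        # Minus-strand hits are reported with sstart > send, so normalise.
        hits[qseqid].append((sseqid, min(sstart, send), max(sstart, send)))

    labels = {}
    for qseqid, hsps in hits.items():
        if len({chrom for chrom, _, _ in hsps}) > 1:
            labels[qseqid] = "multi_chrom"
            continue
        hsps.sort(key=lambda h: h[1])
        far_apart = False
        reach = hsps[0][2]  # rightmost subject position covered so far
        for _, start, end in hsps[1:]:
            if start - reach > min_gap:
                far_apart = True
                break
            reach = max(reach, end)
        labels[qseqid] = "same_chrom_distant" if far_apart else "single_locus"
    return labels
```

Transcripts labelled `multi_chrom` are the ones step 1 would drop; those labelled `same_chrom_distant` are the same-chromosome, different-location cases from the question and may deserve a closer look before removal.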

ADD REPLY • link 12.0 years ago by Enx ▴ 30

0

This is common in de novo assembly. For a large genome, most chimeric contigs are caused by misassembly (including misassembly in the reference genome) rather than anything biological.

ADD REPLY • link 12.0 years ago by lh3 33k

0

I agree with you regarding the reference genome. How about de novo assembly? How about transcriptome assembly? Keep or not keep?

ADD REPLY • link 12.0 years ago by upendrakumar.devisetty ▴ 400

score 2 · Answer 1 · 2012-11-27

2

12.0 years ago

cdsouthan ★ 1.9k

As Josh says, chimeras can have technical causes (e.g. library preparation) and/or biological causes, particularly ones associated with rare "events" (see http://www.ncbi.nlm.nih.gov/pubmed/11840564). For example, if one animal used for the cDNA library has a chromosomal abnormality, a fusion gene is exactly what you might get, and a library from a cancer cell line could give you loads of them. Nominally, if you want a reference set of contiged transcripts, it's your choice what to reject, but remember that the misfits may be rare but biologically real.

ADD COMMENT • link 12.0 years ago by cdsouthan ★ 1.9k

0

Thanks. Do you think I should keep them?

ADD REPLY • link 12.0 years ago by upendrakumar.devisetty ▴ 400

0

As ever, it's your call. If the experiment with your collaborators produced a good and unique cDNA library (what organism/tissue, BTW?) with deep, high-quality reads, then it's worth working hard at contiging the transcripts and submitting them to TSA. You are in a double bind, as pointed out, because chimeras of any cause will confound transcript contiging, and of course any draft genomic assembly will have its own assembly artefacts (e.g. split genes) that confound the mapping of some transcripts anyway. Just keep the misfit transcripts file for a rainy day... Alternatively, you could experiment with coverage thresholds, i.e. a putative chimeric transcript supported by many reads is more likely to be "interesting".
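The coverage-threshold idea could look something like the sketch below. It assumes you have already mapped the reads back to the assembly and counted, per putative chimera, the reads that span the suspected junction; how those counts are obtained is not shown. The names `partition_chimeras` and `to_fasta` and the default `min_reads=10` are illustrative assumptions, not recommended values.

```python
def partition_chimeras(sequences, junction_reads, min_reads=10):
    """Split putative chimeras into well-supported and weakly supported sets.

    sequences: dict of transcript ID -> sequence
    junction_reads: dict of transcript ID -> number of reads spanning the
        putative junction (hypothetical input; missing IDs count as 0)
    min_reads: arbitrary example threshold, to be tuned on your own data
    """
    supported, misfits = {}, {}
    for tid, seq in sequences.items():
        target = supported if junction_reads.get(tid, 0) >= min_reads else misfits
        target[tid] = seq
    return supported, misfits


def to_fasta(records):
    """Render an ID -> sequence dict as FASTA text."""
    return "".join(f">{tid}\n{seq}\n" for tid, seq in records.items())
```

Writing `to_fasta(misfits)` to its own file gives you the "rainy day" set, while `to_fasta(supported)` holds the candidates worth a closer look. Trying a few values of `min_reads` shows how sensitive the split is to the threshold.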

ADD REPLY • link 12.0 years ago by cdsouthan ★ 1.9k

0

Sounds good, cdsouthan. The reads for my de novo transcriptome assembly came from deep sequencing of a plant (Brassica rapa) across many tissues (9 in total). I can easily check the coverage of those putative chimeric transcripts and then decide whether to keep them. Thanks for your suggestion.

ADD REPLY • link 12.0 years ago by upendrakumar.devisetty ▴ 400