Question

Cuffmerge: Why Didn't It Merge These Transcripts Into One?


11.5 years ago

Nicolas Rosewick 11k

Hi,

I used STAR, cufflinks and then cuffmerge to merge every assembly into one merged assembly:

STAR --genomeDir $stargenomeDir --outFilterIntronMotifs RemoveNoncanonicalUnannotated --sjdbOverhang 49 --outFilterMismatchNmax 10 --readFilesIn $r1 $r2 --runThreadN $threads --readFilesCommand zcat

cufflinks -g gene.gtf aligned_reads.bam

cuffmerge -g gene.gtf -s genome.fa -p 20 assemblies.txt > cuffmerge.gtf

So I checked the merged.gtf file in IGV to compare it with gene.gtf (a not-so-good annotation file). Why aren't all these transcripts merged into one transcript? They are obviously the same!

Did I forget something?

Thanks in advance,

N.

[image not available]

cuffmerge transcript • 4.0k views

ADD COMMENT • link updated 7.0 years ago by Biostar 20 • written 11.5 years ago by Nicolas Rosewick 11k


You have only one sample?

ADD REPLY • link 11.5 years ago by Sean Davis 27k

No, I have 13 samples.

ADD REPLY • link 11.5 years ago by Nicolas Rosewick 11k


You are not alone; I have exactly the same issue. Cufflinks keeps "proposing" transcript models which I would merge without any hesitation. I looked a bit at the alignments, and it seems to decide to split models based on _very few_ splicing events (i.e. just one or two of these). Unfortunately, I don't know how to handle this either.
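One way to check how few reads support those splicing events is to look at the SJ.out.tab file that STAR writes next to each alignment. The sketch below is only an illustration, not something from the Cufflinks documentation: the helper names, the example rows and the cutoff of 2 uniquely mapped reads are all assumptions chosen to match the "one or two events" observation.

```python
# Column layout of STAR's SJ.out.tab (tab-separated, no header):
# 1 chrom, 2 intron start, 3 intron end, 4 strand (0/1/2), 5 intron motif,
# 6 annotated (0/1), 7 uniquely mapped reads, 8 multi-mapped reads, 9 max overhang
STRAND = {"0": ".", "1": "+", "2": "-"}


def parse_sj_line(line):
    """Turn one SJ.out.tab row into a small dict (hypothetical helper)."""
    cols = line.rstrip("\n").split("\t")
    return {
        "chrom": cols[0],
        "start": int(cols[1]),
        "end": int(cols[2]),
        "strand": STRAND[cols[3]],
        "annotated": cols[5] == "1",
        "unique_reads": int(cols[6]),
        "multi_reads": int(cols[7]),
    }


def weak_junctions(lines, max_unique_reads=2, unannotated_only=True):
    """Return junctions supported by at most `max_unique_reads` unique reads.

    The default cutoff of 2 is an assumption, not a Cufflinks setting.
    """
    weak = []
    for line in lines:
        if not line.strip():
            continue
        sj = parse_sj_line(line)
        if unannotated_only and sj["annotated"]:
            continue
        if sj["unique_reads"] <= max_unique_reads:
            weak.append(sj)
    return weak


# Made-up rows, for illustration only.
EXAMPLE_SJ = [
    "chr1\t201\t299\t1\t1\t1\t85\t3\t48",
    "chr1\t201\t319\t1\t1\t0\t2\t0\t21",
    "chr1\t501\t799\t2\t2\t0\t40\t1\t45",
]

for sj in weak_junctions(EXAMPLE_SJ):
    print(sj["chrom"], sj["start"], sj["end"], sj["strand"], sj["unique_reads"])
```

For a real run, pass an open file handle instead of the example list, e.g. `with open("SJ.out.tab") as fh: weak_junctions(fh)`, and compare the flagged introns with the places where the transcript models differ in IGV.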

ADD REPLY • link 11.5 years ago by Pavel Senin ★ 1.9k

I wouldn't say they're obviously the same; there are small differences between most of them. Often, most variation between transcripts occurs at the 3' and 5' UTRs. Have you looked at the actual start and stop coordinates in merged.gtf? Also, you can sometimes get a lot of spurious transcripts being retained when coverage is patchy, i.e. there isn't enough data to support merging transcripts. It also depends on what was in genes.gtf.
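To compare the coordinates systematically rather than by eye, something like the following rough sketch could list each transcript's start, end and intron chain, and group transcripts that share the same introns (those differ only at their ends). The function names and the example records are made up for illustration; it assumes a standard 9-column GTF with `gene_id` and `transcript_id` attributes on the exon lines, as cuffmerge output normally has.

```python
import re
from collections import defaultdict


def parse_attributes(field):
    """Parse the GTF attribute column (key "value"; pairs) into a dict."""
    return dict(re.findall(r'(\S+) "([^"]*)"', field))


def summarize(gtf_lines):
    """Return one summary row per transcript: span and intron chain."""
    exons = defaultdict(list)
    info = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        attrs = parse_attributes(cols[8])
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        exons[tid].append((int(cols[3]), int(cols[4])))
        info[tid] = (attrs.get("gene_id"), cols[0], cols[6])

    rows = []
    for tid, intervals in exons.items():
        intervals.sort()
        # An intron runs from the base after one exon to the base before the next.
        introns = tuple(
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:])
        )
        gene_id, chrom, strand = info[tid]
        rows.append({
            "transcript_id": tid,
            "gene_id": gene_id,
            "chrom": chrom,
            "strand": strand,
            "start": intervals[0][0],
            "end": intervals[-1][1],
            "introns": introns,
        })
    return rows


def group_by_intron_chain(rows):
    """Group transcript IDs that share gene, chromosome, strand and introns."""
    groups = defaultdict(list)
    for r in rows:
        key = (r["gene_id"], r["chrom"], r["strand"], r["introns"])
        groups[key].append(r["transcript_id"])
    return groups


# Made-up records, for illustration only.
EXAMPLE_GTF = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000001";',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000001";',
    'chr1\tCufflinks\texon\t150\t200\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000002";',
    'chr1\tCufflinks\texon\t300\t450\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000002";',
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000003";',
    'chr1\tCufflinks\texon\t320\t400\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000003";',
]

rows = summarize(EXAMPLE_GTF)
for r in rows:
    print(r["transcript_id"], r["start"], r["end"], r["introns"])
for key, tids in group_by_intron_chain(rows).items():
    print(key[0], key[3], sorted(tids))
```

In the example, the first two transcripts share one intron chain and differ only in their start and end, while the third uses a different splice site, which is the kind of small difference that can keep models separate.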

ADD REPLY • link 11.1 years ago by jeales ▴ 130

If you avoid the use of the guide GTF, I think that it may merge them together.
Another thought: it may struggle to merge transcripts that occupy exactly the same location in the genome but have different exon usage, as in your case.
Try Cuffcompare and fiddle around with the parameters relating to merge distance (bp).

ADD REPLY • link 7.0 years ago by Kevin Blighe 88k