I'm very sorry if this topic has been posted before; I couldn't find anything relevant. I'm merging assemblies with StringTie --merge, using a reference annotation (-G) and the list of GTF files from my samples:
stringtie --merge -G ref.gtf -o merged.gtf assembly_gtf_list.txt
For many known genes that lie next to each other, it assigns the same MSTRG ID. For example:
MSTRG.10092 ENSG00000135722 (FBXL8 gene)
MSTRG.10092 ENSG00000265690 (AC074143.2 gene)
MSTRG.10092 ENSG00000102878 (HSF4 gene)
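To see how many loci are affected, the merged file can be scanned for MSTRG IDs that cover more than one reference gene. The following is only a minimal sketch: it assumes the transcript lines in merged.gtf carry a ref_gene_id attribute (which StringTie writes for transcripts matching the -G annotation), and the function name shared_loci is hypothetical, made up for this illustration.

```python
import re
import sys
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def shared_loci(gtf_lines):
    """Return {merged gene_id: sorted reference gene IDs} for every
    merged gene_id that covers more than one reference gene.

    Assumption: transcript lines carry a ref_gene_id attribute.
    """
    genes = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript records are inspected
        attrs = dict(ATTR_RE.findall(fields[8]))
        ref = attrs.get("ref_gene_id")
        if ref and "gene_id" in attrs:
            genes[attrs["gene_id"]].add(ref)
    return {g: sorted(r) for g, r in genes.items() if len(r) > 1}

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py merged.gtf
    with open(sys.argv[1]) as fh:
        for gene_id, refs in sorted(shared_loci(fh).items()):
            print(gene_id, ",".join(refs), sep="\t")
```

Each output line is one MSTRG ID followed by the comma-separated reference gene IDs grouped under it. Novel transcripts without ref_gene_id are ignored in this sketch.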
I know that StringTie has an issue with novel isoforms, but these are all known genes, and it doesn't take their original gene IDs into account. I tried using gffcompare instead of stringtie --merge, but it doesn't seem to fix the problem. Are there any other options I can try?
Thank you in advance!
Hi Marina, I found the same issue in my stringtie --merge output: the same MSTRG ID was assigned to more than one gene. Did you fix the issue? Thanks!