lncRNA GFF file; however, we have encountered several difficulties. Specifically, there are many lines for what appears to be the same coding sequence; for instance, the first nine rows all start at position 61723. This means that htseq-count does not know which of these nine rows to match a particular read with. Furthermore, in the group column each entry starts with ID=STRG... rather than the traditional gene_id=..., which also confounds our approach because htseq-count cannot recognise which lines in the GFF file all belong to a single feature.
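To make the two symptoms concrete, here is a minimal Python sketch of how the file could be inspected outside Galaxy. It assumes a tab-separated, nine-column GFF whose last column uses `key=value;key=value` pairs. The function names and the three example rows are hypothetical, made up for illustration (only the start position 61723 and the `ID=STRG` prefix come from our file); they are not copied from our data.

```python
from collections import Counter, defaultdict


def parse_attributes(column9):
    """Split a 'key=value;key=value' attribute column into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def summarise_gff(lines):
    """Count attribute keys and group rows by (seqid, start position)."""
    key_counts = Counter()
    rows_by_start = defaultdict(list)
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF row
        attrs = parse_attributes(cols[8])
        key_counts.update(attrs.keys())
        # Record the feature type (column 3) and ID for each start (column 4).
        rows_by_start[(cols[0], int(cols[3]))].append((cols[2], attrs.get("ID")))
    return key_counts, rows_by_start


# Hypothetical rows standing in for the real file.
example = [
    "chr1\texample\ttranscript\t61723\t62500\t.\t+\t.\tID=STRG.1.1",
    "chr1\texample\texon\t61723\t61900\t.\t+\t.\tID=STRG.1.1.exon1;Parent=STRG.1.1",
    "chr1\texample\texon\t62100\t62500\t.\t+\t.\tID=STRG.1.1.exon2;Parent=STRG.1.1",
]

key_counts, rows_by_start = summarise_gff(example)
print(key_counts)
for (seqid, start), rows in rows_by_start.items():
    print(seqid, start, rows)
```

On a file like ours, a summary of this kind would show whether the rows sharing a start position are different feature types (for example a transcript and its first exon) and that `gene_id` never appears as an attribute key.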
How do I circumvent these problems? Are there other tools on Galaxy I should use first to clean the GFF file (see image), or do I need to use special settings or some other trick? Let me know if there is any other information you need, or if I should share my Galaxy history with you to clarify things.