8.6 years ago
onspotproductions
I created an assembly using Trinity, which gives each sequence a TR# identifier. When I run the FASTA file through getorf, it generates the ORFs but removes the identifiers. Is there an option in EMBOSS to keep the Trinity identifiers? Sample data is shown below.
>TR4|c0_g1_i1 len=258 path=[236:0-257] [-1, 236, -2]
GTCTGCATTCAGTAGAATTAGAAAACATCAGCCCTGTTTTCCCGATGTTTGATGAACATT
GTTTGGACCATATTCAATATATACATAGGCTGTAGCTTGCCTACATAGTTTTGTTGTCAC
ACTGGGCTATGACCATGAGCTTTCACATCTGACCTTGGACTTCACCCAGGTTTAGGGTCT
CCCCTACCCAGTCGTAGGTCTATCAAGCTTCAAACACGATATGTACATACAGGCAAAGAT
GAACACAAAAGGCTTTAG
>TR5|c0_g1_i1 len=266 path=[244:0-265] [-1, 244, -2]
AAAAAATCGAACGATCACTGCACTTCTCCACTTCTCTCCCTCACCCCCATTCACATACAC
AGGCATCATGGCTCCATCATTTGTCTCGGCTGGTCTTTCCAAGTAAACCTGACCACAGCT
GTGTTGGGCGGACATCCAACTCCTAAGACATGGTTTGGGAGTGATGATGGTTATTGGGTG
TGGCTCAACAAGTACAGACATAATGGGCGGATGCTGAACAGTTGAGAGAGTGGTCGGGAC
AGGTGATGAATATTGGGAGTGGCTGG
Have you tried replacing the spaces with "_" (and replacing |, :, and the brackets with "_" too)? That would convert the ID into a single string and should force EMBOSS to keep it intact.
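A minimal sketch of that substitution in Python, in case it helps. The function names `sanitize_header` and `sanitize_fasta` are made up for illustration, and the set of characters replaced (whitespace, |, :, brackets, parentheses, commas) is an assumption based on the sample headers above, so adjust it to your data. Whether EMBOSS then keeps the full string is something to check on a small test file first.

```python
import re

# Characters assumed to cause trouble in the sample headers:
# whitespace, |, :, square brackets, parentheses and commas.
_UNSAFE = re.compile(r"[\s|:\[\](),]+")


def sanitize_header(header: str) -> str:
    """Collapse a FASTA header line into a single whitespace-free token."""
    body = header[1:] if header.startswith(">") else header
    # Replace each run of unsafe characters with one underscore.
    cleaned = _UNSAFE.sub("_", body.strip())
    # Drop underscores left dangling at either end (e.g. from a closing bracket).
    return ">" + cleaned.strip("_")


def sanitize_fasta(lines):
    """Yield FASTA lines with headers sanitized and sequence lines unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        yield sanitize_header(line) if line.startswith(">") else line
```

To use it, open the Trinity FASTA file, pass the file handle to `sanitize_fasta`, write each yielded line plus a newline to a new file, and run getorf on that new file. Keeping the original file untouched means you can always map the cleaned IDs back to the Trinity ones.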