Dear all,
I have run into two problems while working with a Trinity-assembled FASTA file. After the Trinity assembly I got a FASTA file, but to get the unigenes, do I need to process it with "cd-hit" or with "get_longest_isoform_seq_per_trinity_gene.pl"? I am not sure which one I should use, or whether I need to use both. From what I found online, the principles behind the two methods do not seem very different...
My second question: I tried running "get_longest_isoform_seq_per_trinity_gene.pl" on the FASTA file, and the output sequences have very long IDs (the text after ">"), like the example below. Is there a script or tool that can trim these and keep only ">TRINITY_DN123217_c2_g1_i6"? Thanks in advance for any suggestions!
>TRINITY_DN123217_c2_g1_i6 len=17269 path=[17247:0-35 17260:36-164 17389:165-272 17497:273-768 17993:769-968 41567:969-992 18217:993-1157 18382:1158-1163 18388:1164-1172 41570:1173-1196 18421:1197-2360 19585:2361-2423 19648:2424-2428 19653:2429-3104 20329:3105-3164 20389:3165-3318 20543:3319-3334 20559:3335-3412 20637:3413-4064 21289:4065-4999 41566:5000-5023 22248:5024-5562 41571:5563-5586 22811:5587-6002 23227:6003-6249 23474:6250-6266 23491:6267-6587 23812:6588-6730 23955:6731-7264 24489:7265-7270 24495:7271-7453 24678:7454-8207 25432:8208-8318 25543:8319-8353 25578:8354-8449 25674:8450-8695 25920:8696-9492 26717:9493-9509 41568:9510-9533 26758:9534-9614 26839:9615-10127 27352:10128-10255 27480:10256-10577 27802:10578-10599 27824:10600-10977 41569:10978-11001 28226:11002-11010 28235:11011-11074 28299:11075-12083 29308:12084-13818 31043:13819-13911 31136:13912-14015 31240:14016-14632 31857:14633-14648 31873:14649-15575 32800:15576-15608 32833:15609-15609 32834:15610-15724 32949:15725-15752 32977:15753-16182 33407:16183-16199 33424:16200-17268] [-1, 17247, 17260, 17389, 17497, 17993, 41567, 18217, 18382, 18388, 41570, 18421, 19585, 19648, 19653, 20329, 20389, 20543, 20559, 20637, 21289, 41566, 22248, 41571, 22811, 23227, 23474, 23491, 23812, 23955, 24489, 24495, 24678, 25432, 25543, 25578, 25674, 25920, 26717, 41568, 26758, 26839, 27352, 27480, 27802, 27824, 41569, 28226, 28235, 28299, 29308, 31043, 31136, 31240, 31857, 31873, 32800, 32833, 32834, 32949, 32977, 33407, 33424, -2]
For the second question:
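One possible approach, as a minimal sketch rather than a definitive tool: keep only the part of each header up to the first whitespace, which for the example above leaves ">TRINITY_DN123217_c2_g1_i6". The function names and the file names in the usage note are hypothetical, and the sketch assumes the ID itself never contains spaces or tabs.

```python
import sys


def short_id(header_line):
    """Return the header up to the first whitespace, e.g. '>TRINITY_DN1_c0_g1_i1'."""
    parts = header_line.split(None, 1)
    return parts[0] if parts else header_line.rstrip("\n")


def strip_fasta_headers(lines):
    """Yield FASTA lines with header lines shortened; sequence lines pass through unchanged."""
    for line in lines:
        if line.startswith(">"):
            yield short_id(line) + "\n"
        else:
            yield line


def strip_fasta_file(in_path, out_path):
    """Write a copy of in_path to out_path with shortened headers."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(strip_fasta_headers(src))


if __name__ == "__main__":
    # Usage (hypothetical file names): python strip_headers.py in.fasta out.fasta
    strip_fasta_file(sys.argv[1], sys.argv[2])
```

It reads the input line by line, so it should cope with large assemblies without loading the whole file into memory.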
It worked!! Thank you!!