Hi everyone,
I'm a beginner learning sequence analysis by practicing. Could you suggest how I can view and work with a Trinity transcriptome assembly (Trinity.fasta) in bash?
$ less Trinity.fasta
>TRINITY_DN30765_c0_g1_i1 len=228 path=[0:0-227]
TGGCGAAGTTTAGAGCACGGTGTTATCGGTGCTAAAGCAGGTTTTATGGGTAGCATAGCTAAATCGCATAAATATATGCTACACATATGGCTTTCCATTGCCAACAACATGCTTACCACCTGCACTGGGTTCCACAGAAACTGGGTTCCACAGCTGTGTATTGGTCCAAGTGTTGT
AACATGCTTACCACCTGCACTGGGTTCCACAGAAATATGCAGTGTTATCTCTTTACATGCTTTCTGTGTATTGTGCGCGTTC
>TRINITY_DN30719_c0_g1_i1 len=202 path=[0:0-201]
CGCCGATATAAAAGATGGAGCACCCTGTATATGTATATACGTTCATGTCTTAATACAACTGTTGTTGTATACTTATATAAATACAAATCTGTTAATTCGTGGAATAGCAATTTACCACCCATGAATAAAGTGAATTGTTCTCAGTACCTTTGAAATACGTTTAAGTAATGTAATTTACATAA
>TRINITY_DN30753_c0_g1_i1 len=336 path=[0:0-335]
AAAAAATTAGCTTTATTTTTACTTTATGGTAATAGCTTTGGTGAAATATCGAAATTTTGACTTGAATTGTACCTATCAAACCATCTGAAATCGTACATTAGTACACAAGCAAATCATTTAGACTCTTTCGTCTATCTTCGGAACAAAAACTACCACTGCTTATATGTTTGGTTTTTAATGACGTGCTGGGACCATGTAATAAGGAGTT
I tried using 'grep' to search for the transcript that contains the sequence 'TTTGGTGAAATATCGAAA'.
$ grep "TTTGGTGAAATATCGAAA" Trinity.fasta
AAAAAATTAGCTTTATTTTTACTTTATGGTAATAGCTTTGGTGAAATATCGAAATTTTGACTTGAATTGTACCTATCAAACCATCTGAAATCGTACATTAGTACACAAGCAAATCATTTAGACTCTTTCGTCTATCTTCGGAACAAAAACTACCACTGCTTATATGTTTGGTTTTTAATGACGTGCTGGGACCATGTAATAAGGAGTT
However, the output does not include the transcript ID. How can I view the transcript name and the sequence together?
Alternatively, if I already know the transcript ID, how can I view its sequence?
Thank you so much
Try using a tool like seqkit:
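Here is a minimal sketch, assuming seqkit is installed and on your PATH. `seqkit grep -s -p` searches the sequences and `seqkit grep -p` matches the sequence ID, and both print the header together with the sequence. The demo.fasta file and the two awk helpers (fasta_grep_seq, fasta_get_id) are made up for illustration and are not part of Trinity or seqkit; the awk versions are only a fallback if seqkit is not available.

```shell
# Hypothetical demo file standing in for Trinity.fasta (sequences shortened and made up).
# The second record is wrapped over two lines on purpose.
cat > demo.fasta <<'EOF'
>TRINITY_DN30719_c0_g1_i1 len=20 path=[0:0-19]
CGCCGATATAAAAGATGGAG
>TRINITY_DN30753_c0_g1_i1 len=40 path=[0:0-39]
AAAAAATTAGCTTTGGTGAA
ATATCGAAATTTTGACTTGA
EOF

# Option 1: seqkit (this part is skipped if seqkit is not installed).
if command -v seqkit >/dev/null 2>&1; then
  # -s searches the sequence instead of the ID, -i ignores case.
  seqkit grep -s -i -p TTTGGTGAAATATCGAAA demo.fasta
  # Without -s, the pattern is matched against the sequence ID.
  seqkit grep -p TRINITY_DN30753_c0_g1_i1 demo.fasta
fi

# Option 2: plain awk fallback (helper names are illustrative).
# Print every record whose sequence contains the motif.
# Wrapped sequence lines are joined before searching; only the strand as written is checked.
fasta_grep_seq() {
  awk -v motif="$1" '
    function emit() {
      if (hdr != "" && index(toupper(seq), toupper(motif))) print hdr "\n" seq
    }
    /^>/ { emit(); hdr = $0; seq = ""; next }
    { seq = seq $0 }
    END { emit() }
  ' "$2"
}

# Print the record whose ID (first word of the header, without ">") equals the given ID.
fasta_get_id() {
  awk -v id="$1" '
    /^>/ { keep = (substr($1, 2) == id) }
    keep
  ' "$2"
}

fasta_grep_seq TTTGGTGAAATATCGAAA demo.fasta
fasta_get_id TRINITY_DN30753_c0_g1_i1 demo.fasta
```

On your own data, replace demo.fasta with Trinity.fasta. If every sequence sits on a single line, `grep -B 1 "TTTGGTGAAATATCGAAA" Trinity.fasta` also prints the header line above each hit, but it misses motifs that span a line break, which is why the sketch above joins the lines first.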