2.4 years ago
Neel
Hi, I want to extract the ORF_ID of every sequence (for example, LR590472.1_162 is the ORF_ID of one sequence/gene), together with its DNA sequence, from a txt file that also contains protein sequences. I want to do the same for all genes in each file, according to their file name.
ORF_ID Contig Start Stop Orientation Cut_Off Pass_Bitscore Best_Hit_Bitscore Best_Hit_ARO Best_Identities ARO Model_type SNPs_in_Best_Hit_ARO Other_SNPs Drug Class Resistance Mechanism AMR Gene Family Predicted_DNA Predicted_Protein CARD_Protein_Sequence Percentage Length of Reference Sequence ID Model_ID Nudged Note
**LR590472.1_162** # 174925 # 176028 # 1 # ID=1_162;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.708 LR590472.1_162 174925 176028 + Strict 650 726.472 TriA 99.73 3003679 protein homolog model n/a n/a disinfecting agents and antiseptics antibiotic efflux resistance-nodulation-cell division (RND) antibiotic efflux pump ATGGCGCTGCCCGCCATCCTGTGCGCCGGCCTGCTTGTCGGTTGCGGCGCCGAGCCGCCCGCCGAGGAACACGTCCGTGTGCTGGCGCAGACGGTGAAGATGGCCGAGTTCGCCTCGGCCACCTCGATCACCGGCGACATCCAGGCACGGGTACAGGCCGACCAGTCGTTCCGTGTCGGCGGCAAGATCGTCGAGCGCCTGGTCGATGTCGGCGACCACGTCGCGGCTGGCCAGGTGCTGGCGCGGCTCGACCCGCAGGACCAGCGCAGCAACGTGGAGAACGCCCAGGCGGCGGTCGCCGCGCAGCAGGCGCAGTCGAAGCTCGCCGACCTCAACTACCAGCGGCAGAAGGCGCTGCTGCCCAAGGGCTACACCAGCCAGAGCGAGTACGACCAGGCGCTGGCCTCGGTGCGCAGCGCGCAGAGTTCGCTGAAGGCCGCCCAGGCGCAGTTGGCCAACGCCCGCGACCTGCTTTCCTATACCGAGCTGCGTGCCTCCGACGCCGGGGTCATCACTGCCCGCCAGGCCGAGGTCGGCCAGGTGGTGCAGGCCACCGTGCCGATCTTCACCCTGGCCCGCGACGGCGAGCGCGACGCGGTGTTCAACGTCTACGAGTCGTTGTTCAGCCACGATGTCGACGGCCAGCGGATCACCGTCAGCCTGCTCGGCAAGCCGGAAGTCACCGCCAGCGGCAAGGTCCGCGAGATCACCCCGACGGTGGACGAGCGCAGCGGTACGCTGAAGGTCAAGGTCGGCCTAGACTCGGTGCCGGCGGAAATGAGCCTCGGCAGCGTGGTCAACGCCAGCGTCGCCGCGCCGGCCGCGCACAGCGTGGTGCTGCCCTGGTCGGCGCTGTCCAAGGTCGGCGAGCAGCCGGCGGTCTGGTTGCTCGACCAGCAAGGCAAGGCGCGTCTGCAACCGGTGCGGGTGGCACGCTACGCCAGCGAGAAGGTGGTCATCGACGGTGGCCTGGAGGCGGGCCAGACGGTGGTCACGGTGGGCGGCCAACTGCTCCATCCGGGCCAGGTGGTCGAGGTGGCCCAGCCGCCGCAGCCGACCCAGAGCACCGCCAGCCGCGACGCCGTGGGCGGAGGCCAGCCATGA MALPAILCAGLLVGCGAEPPAEEHVRVLAQTVKMAEFASATSITGDIQARVQADQSFRVGGKIVERLVDVGDHVAAGQVLARLDPQDQRSNVENAQAAVAAQQAQSKLADLNYQRQKALLPKGYTSQSEYDQALASVRSAQSSLKAAQAQLANARDLLSYTELRASDAGVITARQAEVGQVVQATVPIFTLARDGERDAVFNVYESLFSHDVDGQRITVSLLGKPEVTASGKVREITPTVDERSGTLKVKVGLDSVPAEMSLGSVVNASVAAPAAHSVVLPWSALSKVGEQPAVWLLDQQGKARLQPVRVARYASEKVVIDGGLEAGQTVVTVGGQLLHPGQVVEVAQPPQPTQSTASRDAVGGGQP 
MSDARGAFHSKGRWSRMALPAILCAGLLVGCGAEPPAEEHVRVLAQTVKMAEFASATSITGDIQARVQADQSFRVGGKIVERLVDVGDHVAAGQVLARLDPQDQRSNVENAQAAVAAQQAQSKLADLNYQRQKALLPKGYTSQSEYDQALASVRSAQSSLKAAQAQLANARDLLSYTELRASDAGVITARQAEVGQVVQATVPIFTLARDGERDAVFNVYESLFSHDVDGQRITVSLLGKPEVTASGKVREITPTVDERSGTLKVKVGLDSVPAEMSLGSVVNASVAAPAEHSVVLPWSALSKVGEQPAVWLLDQQGKARLQPVRVARYASEKVVIDGGLEAGQTVVTVGGQLLHPGQVVEVAQPPQPTQSTASRDAVGGGQP 95.82 gnl|BL_ORD_ID|2005|hsp_num:0 2192
**LR590472.1_163** # 176025 # 177095 # 1 # ID=1_163;partial=00;start_type=ATG;rbs_motif=GGAG;rbs_spacer=7bp;gc_cont=0.709 LR590472.1_163 176025 177095 + Strict 600 697.582 TriB 99.72 3003680 protein homolog model n/a n/a disinfecting agents and antiseptics antibiotic efflux resistance-nodulation-cell division (RND) antibiotic efflux pump ATGAAGCCGTTTTCCCTCGCCGGCCTGTTCGGCTTCGCCCTGCTCCTCTCCGGCTGCGGCGACGAGCCGCCGCCGGCACCGCCGCGGCCGGTGCTGACGGTGACCGTGAAGACCCTGAAGAACGACGACCTCGGTCGCTTCGCCGGGAGCATCCAGGCGCGCTACGAGAGCGTGCTCGGCTTCCGCACCAACGGACGGATCGCCTCGCGCCTGTTCGACGTCGGTGACTTCGTCGGCAAGGGCGCGCTGCTGGCGACCCTCGACCCCACCGACCAGCAGAACCAGTTGCGCGCCAGCCAGGGCGACCTGGCCAGCGCCGAGGCACAGTTGATCGACGCCCAGGCCAATGCCCGGCGCCAGGAAGAACTGTTCGCCCGCAGCGTCACCGCCCAGGCGCGCCTGGACGATGCGCGGACCCGCCTGAAGACCAGCCAGGCCAGCTTCGACCAGGCCAAAGCGACGGTGCAGCAGGCCAGGGACCAGCTTTCCTACACGCGCCTGGTGACCGATTTCGACGGCGTCATCACCACCTGGCACGCCGAGGCCGGGCAAGTGGTCAGCGCCGGCCAGGCGGTGGTCACCCTGGCCCGGCCCGAAGTGCGCGAGGCGGTCTTCGACCTGCCCACCGAGGTCGCCGAGAGCCTGCCGGCCGACGCGCGCTTCCTGGTCAGCGCCCAGCTCGACCCGCAGGCCAGGACCACCGGCAGCATCCGCGAGCTGGGTCCGCAGGCCGACGCCTCGACCCGCACCCGTCGCGTGCGCCTGAGCCTGGCGCAGACGCCGGAGGCGTTTCGCCTCGGTTCGACCATCCAGGTCCAGCTGAGCAGCGCCGGTAGCGTGCGCAGCGTGCTGCCGGCCAGCGTGCTGCTGGAGCGCGACGGCAAGACCCAGGTCTGGGTCGTCGATGGGAAACAGTCCAGCGTGGCCCTGCGCGAGGTACAGGTGCTCAGCCGCGACGAACGCCAGGTGGTGATCGGACAGGGCCTGGCCGACGGCGACCGGGTGGTCCGCGCCGGAGTCAACAGCCTCAAGCCCGGCCAGAAGATCAAACTCGACGAGGATGCGCGATGA MKPFSLAGLFGFALLLSGCGDEPPPAPPRPVLTVTVKTLKNDDLGRFAGSIQARYESVLGFRTNGRIASRLFDVGDFVGKGALLATLDPTDQQNQLRASQGDLASAEAQLIDAQANARRQEELFARSVTAQARLDDARTRLKTSQASFDQAKATVQQARDQLSYTRLVTDFDGVITTWHAEAGQVVSAGQAVVTLARPEVREAVFDLPTEVAESLPADARFLVSAQLDPQARTTGSIRELGPQADASTRTRRVRLSLAQTPEAFRLGSTIQVQLSSAGSVRSVLPASVLLERDGKTQVWVVDGKQSSVALREVQVLSRDERQVVIGQGLADGDRVVRAGVNSLKPGQKIKLDEDAR 
MKPFSLAGLFGFALLLSGCGDEPPPAPPRPVLTVTVKTLKNDDLGRFAGSIQARYESVLGFRTNGRIASRLFDVGDFVGKGALLATLDPTDQQNQLRASQGDLASAEAQLIDAQANARRQEELFARSVTAQARLDDARTRLKTSQASFDQAKAAVQQARDQLSYTRLVTDFDGVITTWHAEAGQVVSAGQAVVTLARPEVREAVFDLPTEVAESLPADARFLVSAQLDPQARTTGSIRELGPQADASTRTRRVRLSLAQTPEAFRLGSTIQVQLSSAGSVRSVLPASVLLERDGKTQVWVVDGKQSSVALREVQVLSRDERQVVIGQGLADGDRVVRAGVNSLKPGQKIKLDEDAR 100.00 gnl|BL_ORD_ID|2006|hsp_num:0 2193
Thank you!
Thank you so much, but it didn't separate all of the DNA and protein sequences. Please have a look at the file I shared on pastebin.com. Basically, I want to separate out the DNA sequence of every gene in this one file, and in the other 200 txt files as well.
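One way to approach this is sketched below in Python. It is only a minimal sketch and rests on a few assumptions that are not confirmed in the thread: that each txt file is tab-delimited, that its header row contains the columns `ORF_ID` and `Predicted_DNA` exactly as in the sample above, and that the identifier you want is the first whitespace-separated token of the `ORF_ID` field (e.g. `LR590472.1_162`). The function names, the `_dna.fasta` suffix, and the `filename|ORF_ID` header layout are made up for illustration.

```python
import csv
from pathlib import Path


def extract_dna_records(txt_path):
    """Yield (orf_id, dna) pairs from one tab-delimited table.

    Assumes a header row with 'ORF_ID' and 'Predicted_DNA' columns.
    """
    with open(txt_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            orf_field = (row.get("ORF_ID") or "").strip()
            dna = (row.get("Predicted_DNA") or "").strip()
            if not orf_field or not dna:
                continue  # skip rows without an ID or a DNA sequence
            # Keep only the first token, e.g. 'LR590472.1_162' from
            # 'LR590472.1_162 # 174925 # 176028 # 1 # ID=1_162;...'.
            # Stray '*' characters (forum bold markers) are stripped too.
            orf_id = orf_field.split()[0].strip("*")
            yield orf_id, dna


def write_fasta(records, fasta_path, prefix=""):
    """Write (orf_id, dna) pairs as FASTA; return how many were written."""
    count = 0
    with open(fasta_path, "w") as out:
        for orf_id, dna in records:
            out.write(f">{prefix}{orf_id}\n{dna}\n")
            count += 1
    return count


def convert_folder(in_dir, out_dir):
    """Convert every *.txt in in_dir to <name>_dna.fasta in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    totals = {}
    for txt_path in sorted(Path(in_dir).glob("*.txt")):
        fasta_path = out_dir / (txt_path.stem + "_dna.fasta")
        # Prefix each header with the file name so records stay traceable.
        totals[txt_path.name] = write_fasta(
            extract_dna_records(txt_path), fasta_path, prefix=txt_path.stem + "|"
        )
    return totals
```

Calling `convert_folder("rgi_txt", "dna_fasta")` (both folder names are placeholders) would write one FASTA file per input file and return a dictionary of sequence counts, which makes it easy to check that no file was silently skipped. If the files turn out to be space-delimited rather than tab-delimited, this will not work as written, because several column names and values contain spaces.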