2.4 years ago
Neel
Hi, I want to extract the DNA sequences from files that contain both protein and DNA sequences, selecting records by their header, across 100 txt files. I tried the command below, but it doesn't work. Please help if you know how to do this.
> grep -A 1 -wFf list.txt allheader > newfile2.fas
P_aeruginosa_152962.fna.txt >CP069198.1_3523 # 3844084 # 3845187 # -1 # ID=1_3523;partial=00;start_type=ATG;rbs_motif=GGA;rbs_spacer=9bp;gc_cont=0.710 CP069198.1_3523 3844084 3845187 - Perfect 690 726.087 MexJ 100.0 3003692 protein homolog model n/a n/a macrolide antibiotic; tetracycline antibiotic; disinfecting agents and antiseptics antibiotic efflux resistance-nodulation-cell division (RND) antibiotic efflux pump ATGTACCGCCATATCCCGCTCGTCGCCCTGTCCCTGTTTTCCTCCCTGTTCCTCGCCGCCTGCGGCAACGGCACGCCGCCGCCAGCCGCGGCGCGTCCGGCGATCGTCGTCCAGCCCCAGCCGGCGGGGGAGGTGAGCCAGGCCTTTCCCGGCGAGATCCGCGCCCGCCACGAGCCGGAGCTGGCCTTCCGCATCGGCGGCAAGGTCATCCGCCGGCTGGTGGAAGTCGGCGAGCGGGTAAAGAAGGACCAGCCCCTGGCCGAACTCGATCCCCAGGACGTGCGCCTGCAACTGGAGGCGGCGCGGGCCCAGGTCAGTGCCGCCGAGGCCAACTTGCAGACCGTGCGCGCCGAGTACCGGCGCTACCGCACCTTGCTCGACCGCAACCTGGTCAGCCATTCCCAGTTCGAGAACATCCAGAACAGCTACCGCGCCGGCGAGGCGCGGCTGAAGCAGATCCGCGCCGAATTCAACGTCGCCGACAACCAGGCCGGCTACGCCGTGCTGCGCTCGCCCCAGGATGGCGTGATCGCCAGCCGGCGCGTCGAGGTGGGCCAGGTGGTGGCGGCCGGACAGACGGTCTTCAGCCTGGCCGCCGACGGCGAACGCGAGGTGCTGATCGGCCTGCCGGAACACAGCTTCGAACGTTTCCGCATCGGCCAGCCGGTGTCGGTCGAACTCTGGTCGCAACGCGACAGACGCTTCGCCGGGCATATCCGCGAGCTCTCGCCCGCGGCCGATCCGCAATCGCGTACCTTCGCCGCCCGGGTGGCCTTCGACGACCGCGCGACTCCGGCCGAACTGGGCCAGAGCGCGCGGGTCTACGTCGCCGCCGCCGAGGCGGTGCCGTTATCGGTTCCCTTGTCGGCGCTGACCGCAGAGGCCGGCCAGGCGTTCGTCTGGGTGGTCGAGCCGGGCAGCTCGACCCTGCGCCGGCAGGCGGTGCGCACCGGTCCCTATGCCGAGGACCGGGTGCCGGTGCTCGAAGGCCTGAAGGCTGGCGACTGGGTGGTGGCCACCGGGGTCCAGGTGCTTCGCGAAGGGCAGCAGGTGCGTCCGATCGACCGGGCCAACCGCACGGTGAAACTGGCGGCCAAGGAGTAG MYRHIPLVALSLFSSLFLAACGNGTPPPAAARPAIVVQPQPAGEVSQAFPGEIRARHEPELAFRIGGKVIRRLVEVGERVKKDQPLAELDPQDVRLQLEAARAQVSAAEANLQTVRAEYRRYRTLLDRNLVSHSQFENIQNSYRAGEARLKQIRAEFNVADNQAGYAVLRSPQDGVIASRRVEVGQVVAAGQTVFSLAADGEREVLIGLPEHSFERFRIGQPVSVELWSQRDRRFAGHIRELSPAADPQSRTFAARVAFDDRATPAELGQSARVYVAAAEAVPLSVPLSALTAEAGQAFVWVVEPGSSTLRRQAVRTGPYAEDRVPVLEGLKAGDWVVATGVQVLREGQQVRPIDRANRTVKLAAKE 
MYRHIPLVALSLFSSLFLAACGNGTPPPAAARPAIVVQPQPAGEVSQAFPGEIRARHEPELAFRIGGKVIRRLVEVGERVKKDQPLAELDPQDVRLQLEAARAQVSAAEANLQTVRAEYRRYRTLLDRNLVSHSQFENIQNSYRAGEARLKQIRAEFNVADNQAGYAVLRSPQDGVIASRRVEVGQVVAAGQTVFSLAADGEREVLIGLPEHSFERFRIGQPVSVELWSQRDRRFAGHIRELSPAADPQSRTFAARVAFDDRATPAELGQSARVYVAAAEAVPLSVPLSALTAEAGQAFVWVVEPGSSTLRRQAVRTGPYAEDRVPVLEGLKAGDWVVATGVQVLREGQQVRPIDRANRTVKLAAKE 100.00 gnl|BL_ORD_ID|2017|hsp_num:0 2205
Thank you!
Can you post an example input from one text file? Please don't post an image of the data.
Hi, I wasn't able to find any option for uploading an input file.
Appears to be a duplicate of How to extract ORF_ID(LR590472.1_162 ) from large number of txt file
@Neel if your file is tab-separated, then you should be able to use awk to extract the columns you need. You can put a sample of the file at pastebin.com and provide a link.
Hi, here is a link to the input file on pastebin.com: https://pastebin.com/FPmVS9ti
Thank you
I posted an answer in your last thread. Check that.
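For anyone landing here, a rough sketch of the awk suggestion above. It assumes each record sits on a single line with whitespace- or tab-separated fields, as the pasted sample suggests; in that case grep -A 1 returns the whole record plus the following line, not just the DNA. The demo file sample1.txt, its contents, and the output name dna_only.fas are made up for illustration, and the 20-letter threshold for recognising a DNA field is an arbitrary assumption, not something taken from the real files.

```shell
#!/bin/sh
# Hypothetical demo data: a tiny stand-in for one of the 100 result files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# IDs to extract, one per line (stand-in for the real list.txt).
printf '%s\n' 'CP069198.1_3523' > list.txt

# Two one-line records: header, annotation, DNA, protein (shortened, made up).
printf '>CP069198.1_3523 # 1 # 30 # -1\tMexJ\tATGTACCGCCATATCCCGCTCGTCGCCTAG\tMYRHIPLVA\n' > sample1.txt
printf '>CP069198.1_0001 # 5 # 40 # 1\tOther\tATGAAACCCGGGTTTAAACCCGGGTTTTGA\tMKPGFKPGF\n' >> sample1.txt

awk '
    # First file (list.txt): remember every wanted ID.
    NR == FNR { wanted[$1] = 1; next }
    {
        id = ""; dna = ""
        for (i = 1; i <= NF; i++) {
            # The field starting with ">" gives the record ID.
            if (id == "" && substr($i, 1, 1) == ">")
                id = substr($i, 2)
            # The first long field made only of nucleotide letters is taken as the DNA.
            else if (dna == "" && length($i) >= 20 && $i ~ /^[ACGTNacgtn]+$/)
                dna = $i
        }
        # Write a two-line FASTA entry for wanted records only.
        if (id != "" && dna != "" && (id in wanted))
            print ">" id "\n" dna
    }
' list.txt sample1.txt > dna_only.fas

cat dna_only.fas
```

On the real data, the last argument could be replaced by the list of result files (for example *.txt, provided list.txt stays first and is not matched again by the pattern). If the DNA column position is known and fixed, printing that column directly with awk -F'\t' would be simpler than scanning every field.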