4.3 years ago
matt81rd
Hi, I am trying to search through a file for a specific list of words. If one of those words is found, I want to add a new line underneath it containing the phrase /colour = 1 (I don't want to remove the original word I am searching for).
An extract of the file for context and format:
> LOCUS contig_2_pilon_pilon 5558986 bp DNA linear BCT
> 16-JUN-2020 DEFINITION Escherichia coli O157:H7 strain (270078)
> ACCESSION VERSION KEYWORDS . SOURCE Escherichia coli 270078
> ORGANISM Escherichia coli 270078
> Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
> Escherichia. COMMENT Annotated using prokka 1.14.6 from
> https://github.com/tseemann/prokka. FEATURES Location/Qualifiers
> source 1..5558986
> /organism="Escherichia coli 270078"
> /mol_type="genomic DNA"
> /strain="strain"
> /db_xref="taxon:562"
> CDS 61523..61744
> /gene="pspD"
> /locus_tag="JCCJNNLA_00057"
> /inference="ab initio prediction:Prodigal:002006"
> /inference="similar to AA sequence:RefSeq:EG10779-MONOMER"
> /codon_start=1
> /transl_table=11
> /product="peripheral inner membrane heat-shock protein"
> /translation="MNTRWQQAGQKVKPGFKLAGKLVLLTALRYGPAGVAGWAIKSVA
> RRPLKMLLAVALEPLLSRAANKLAQRYKR"
Here is one of the lists of words I am looking for throughout the file:
regulation_list=["anti-repressor","anti-termination","antirepressor","antitermination","antiterminator","anti-terminator","cold-shock","cold shock","heat-shock","heat shock","regulation","regulator","regulatory","helicase","antibiotic resistance","repressor","zinc","sensor","dipeptidase","deacetylase","5-dehydrogenase","glucosamine kinase","glucosamine-kinase","dna-binding","dna binding","methylase","sulfurtransferase","acetyltransferase","control","ATP-binding","ATP binding","Cro","Ren protein","CII","inhibitor","activator","derepression","protein Sxy","sensing","sensor","Tir chaperone","Tir-cytoskeleton","Tir cytoskeleton","Tir protein","EspD"]
As you can see, the extract contains one of the phrases I am looking for (heat-shock, in the /product line), and I want to add a new line underneath it with the phrase /colour = 1
Any help would be great!
If there are not too many files to process, you can open them in a genome browser (Apollo, Artemis, GenomeView, ...) and change the color of the feature there. Afterwards you can save the file again.
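If there are too many files to edit by hand, a short script may be easier. The following is only a minimal sketch, not a tested pipeline: the function name `add_colour` and the file names in the usage comment are made up for illustration, the keyword list is a shortened copy of the one in the question, and case-insensitive substring matching is an assumption (the list has "dna-binding" in lower case, while annotations often write "DNA-binding").

```python
import re

# Shortened copy of the list from the question; paste the full list here.
regulation_list = ["heat-shock", "heat shock", "regulator", "repressor", "zinc"]


def add_colour(lines, keywords, colour=1):
    """Return a new list of lines with a '/colour = N' line added after
    every line that contains one of the keywords (hypothetical helper)."""
    # One pattern for all keywords; re.escape keeps "-" and "." literal.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    out = []
    for line in lines:
        # Make sure the line ends with a newline so the added line
        # really lands underneath it.
        if not line.endswith("\n"):
            line += "\n"
        out.append(line)  # the original line is always kept
        if pattern.search(line):
            # Reuse the leading whitespace of the matching line.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(f"{indent}/colour = {colour}\n")
    return out


# Example usage (file names are placeholders):
# with open("input.gbk") as handle:
#     new_lines = add_colour(handle.readlines(), regulation_list)
# with open("output.gbk", "w") as handle:
#     handle.writelines(new_lines)
```

Two caveats: substring matching can over-match (for example "Cro" would also hit "microcin"), so restricting the search to /product lines or using word boundaries may be safer. Also, if a matching qualifier wraps over several lines, the new line is inserted after the first of them, which would split the qualifier; such cases need extra handling.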