Getting proteins from genomic interval of a genbank file
6.0 years ago
fhsantanna ▴ 620

I am trying to extract all proteins that fall within a given sequence interval of a GenBank file. Here is the Python script:

#!/usr/bin/env python

#Script for getting proteins in a range from a genbank file
#Usage: python get_proteins.py file start:end

import sys
from Bio import SeqIO

rec = SeqIO.read(sys.argv[1], 'genbank')
feats = [feat for feat in rec.features if feat.type == "CDS"]
start, end = sys.argv[2].split(':')

desired = set(range(int(start),int(end),1))

for f in feats:
    span = set(range(f.location._start.position, f.location._end.position))
    if span & desired:
        print("%s,%s,%d:%d\n%s\n" % (
                    f.qualifiers['locus_tag'][0],
                    f.qualifiers['product'][0],
                    f.location._start.position,
                    f.location._end.position,
                    f.qualifiers["translation"][0]))

There are two problems. First, for some regions containing pseudogenes, printing f.qualifiers["translation"][0] breaks the script:

$ python get_proteins.py ../gbk/NZ_AOLM01000025.1.gbk 92686:113021 > island_proteins.faa
Traceback (most recent call last):
  File "get_proteins.py", line 24, in <module>
    f.qualifiers["translation"][0]))
KeyError: 'translation'

I know that I need to use "try", but I have not managed to get it working.

The second problem is that some genbank files give the following error:

$ python get_proteins.py ../gbk/NZ_CM001555.1.gbk 530979:551299 > island_proteins.faa
Traceback (most recent call last):
  File "get_proteins.py", line 17, in <module>
    span = set(range(f.location._start.position, f.location._end.position))
AttributeError: 'CompoundLocation' object has no attribute '_start'

I do not understand this second error.

Thank you.

genbank proteins cds biopython • 2.5k views

I think I have solved the problem:

#!/usr/bin/env python

# This script is a modification of the script found on Peter Cock's site (http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/python/genbank2fasta/).
# Usage: python gbk2faa.py INPUT_GENBANK start:end OUTPUT_FASTA

import sys
from Bio import SeqIO

input_handle  = open(sys.argv[1], "r")
output_handle = open(sys.argv[3], "w")

desired_interval = sys.argv[2].split(':')
start = int(desired_interval[0])
end = int(desired_interval[1])
#desired_interval = set(range(int(start),int(end),1))

for seq_record in SeqIO.parse(input_handle, "genbank"):
    print("Dealing with GenBank record %s" % seq_record.id)
    # Slicing a SeqRecord keeps only the features that lie entirely within start:end.
    for seq_feature in seq_record[start:end].features:
        if seq_feature.type != "CDS":
            continue
        try:  # Without "try", it crashes when it finds a CDS without translation (pseudogene).
            assert len(seq_feature.qualifiers['translation']) == 1
            output_handle.write(">%s,%s,%s,%s\n%s\n" % (
                seq_feature.qualifiers['locus_tag'][0],
                seq_feature.qualifiers['product'][0],
                seq_record.id,
                seq_record.description,
                seq_feature.qualifiers['translation'][0]))
        except (KeyError, AssertionError):
            # Skip CDS features that lack a qualifier or have more than one translation.
            continue

output_handle.close()
input_handle.close()
print("Done")
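
For reference, the AttributeError in the question comes from features whose location is a join(...): Biopython parses those into a CompoundLocation, which does not have the private _start/_end attributes, while the public location.start and location.end exist on both location types. Below is a minimal sketch of an overlap-based selection that uses only those public attributes and skips CDS features without a translation. The names overlaps, cds_in_interval and main are hypothetical, and unlike slicing the record, this version also keeps CDSs that only partly overlap the interval.

```python
def overlaps(feat_start, feat_end, start, end):
    """True if half-open intervals [feat_start, feat_end) and [start, end) share a base."""
    return feat_start < end and start < feat_end


def cds_in_interval(features, start, end):
    """Yield (locus_tag, product, start, end, translation) for CDSs overlapping start:end.

    Each feature only needs .type, .location.start, .location.end and .qualifiers,
    which Biopython SeqFeature objects provide.
    """
    for feat in features:
        if feat.type != "CDS":
            continue
        f_start, f_end = int(feat.location.start), int(feat.location.end)
        if not overlaps(f_start, f_end, start, end):
            continue
        translation = feat.qualifiers.get("translation")
        if not translation:
            # Pseudogenes usually carry no /translation qualifier; skip them.
            continue
        yield (
            feat.qualifiers.get("locus_tag", ["unknown"])[0],
            feat.qualifiers.get("product", ["unknown"])[0],
            f_start,
            f_end,
            translation[0],
        )


def main(path, interval):
    """Print matching proteins in FASTA-like form; requires Biopython."""
    from Bio import SeqIO  # imported lazily so the helpers above work without it

    start, end = (int(x) for x in interval.split(":"))
    for rec in SeqIO.parse(path, "genbank"):
        for tag, product, f_start, f_end, prot in cds_in_interval(rec.features, start, end):
            print(">%s,%s,%d:%d\n%s" % (tag, product, f_start, f_end, prot))
```

It could be called as main("NZ_CM001555.1.gbk", "530979:551299"). For a CompoundLocation, start and end span the whole joined feature, so a feature that wraps around the origin of a circular record may match more broadly than expected.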
6.0 years ago
Joe 22k

I've built something around that exact code before, for this problem: https://github.com/jrjhealey/bioinfo-tools/blob/master/get_spans.py

It requires that the file be a single contiguous genbank (for now). It prints the features, but it should be easy to alter the output to get fasta sequences or similar via Biopython.
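
As a rough illustration of the "alter the output to get fasta" step, here is a minimal sketch that formats (header, sequence) pairs as FASTA. It is plain Python rather than the linked script's code, the function names fasta_record and write_fasta are hypothetical, and the 60-residue line width is an arbitrary choice; Biopython's SeqIO.write with the "fasta" format would be an alternative if the proteins are wrapped in SeqRecord objects.

```python
def fasta_record(header, sequence, width=60):
    """Format one FASTA record, wrapping the sequence at `width` residues per line."""
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">%s\n%s\n" % (header, "\n".join(chunks))


def write_fasta(rows, handle, width=60):
    """Write (header, sequence) pairs to an open text handle; return the number written."""
    count = 0
    for header, sequence in rows:
        handle.write(fasta_record(header, sequence, width))
        count += 1
    return count
```

The header would typically be built from the locus_tag and product qualifiers, and the sequence from the translation qualifier of each selected CDS.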

6.0 years ago

Use XSLT to convert the GenBank XML to TSV, then filter the TSV output with awk:

wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&rettype=gb&retmode=xml&id=NZ_AOLM01000025"   | xsltproc --novalid biostar378467.xsl - > out.tsv

biostar378467.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:text>#locus_tag&#9;product&#9;from&#9;to&#9;translation
</xsl:text>
<xsl:apply-templates select="GBSet/GBSeq/GBSeq_feature-table/GBFeature[GBFeature_quals/GBQualifier/GBQualifier_name/text()='translation']"/>
</xsl:template>
<xsl:template match="GBFeature">
<xsl:value-of select="GBFeature_quals/GBQualifier[GBQualifier_name/text()='locus_tag']/GBQualifier_value/text()"/>
<xsl:text>&#9;</xsl:text>
<xsl:value-of select="GBFeature_quals/GBQualifier[GBQualifier_name/text()='product']/GBQualifier_value/text()"/>
<xsl:text>&#9;</xsl:text>
<xsl:value-of select="GBFeature_intervals/GBInterval/GBInterval_from"/>
<xsl:text>&#9;</xsl:text>
<xsl:value-of select="GBFeature_intervals/GBInterval/GBInterval_to"/>
<xsl:text>&#9;</xsl:text>
<xsl:value-of select="GBFeature_quals/GBQualifier[GBQualifier_name/text()='translation']/GBQualifier_value/text()"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>

out.tsv:

#locus_tag product from to translation
C441_RS14675 ZIP family metal transporter 1209 400 MTAAEELFVSLAGSDPVMQGLLGGVVIAGMNMLGALLILVWRDPSKRSLDTLLGFAAGVMLAASFTSLILPGIEAAGGNPIPVLAGFVIGVVVLDQADLWIPHVHILVTGKTRADAPETDKKMASVILFIVAITIHNMPEGLAVGVGFGSGDLGTAIPLMLAIGIQNIPEGLAVSIAAVNAGLRNTTYATFAGIRAGLVEIPLAVFGAWAVQYAAALLPYAMGFAAGAMLFVISDEIVPETHSNGHERVATFGTMLGVVVMLYLDVTLG
C441_RS14680 hypothetical protein 1746 1318 MDVRDAVDTDAEALAAVADAPIRAMRRLIRDRTVRLATAETGSDPHADAPADDEPVLGFVGFDVRDGVLHVTRLGGTETAVRRLLEEPLRFASAEDLPVEVLVLESETPLCDAVEHAGFDRVGHGPRFEGSPTVRYVLDETG
C441_RS14685 aspartate kinase 1934 3112 MRVVAKFGGTSLGSGDRINRAADSIAAAVEHGHEIAVVASAMGSTTDDLLDEIKFEADDRDRAEIVSMGERTSVRMLKAALAARGVNALFVEPGTDEWPVITNDLGEVDVEATRERAAKLAAELDGVVPVITGFLAQNHDGEITTLGRGGSDTSAVMLGNYMDADEVVIVTDVEGVMTGDPRVVEGARNVGRITVDELRNLSFRGAEVVAPSALSYKDAALDVRVVHYQHGDLLTGGTLIEGEFHNLIDMQEEPLACLTVAGRAIRNRPGILADLSAALREEDINVDSVASGMDSITFYVLEDDSDRAEAVLHDRVIADDALSSVTVEDDIAVVRVTGGELPNRPGVILDIVQPLSDAGINIHDAITSATSVAIFVAWDDREETLGIIQNEF
C441_RS14690 tryptophanase 3348 4694 MKSYKAKMVEPIELPSREEREAALERAGYNAFNLDARDVYIDLLTDSGTGTMSAEQWAAMIRGDEAYAGSESFARLADSVRDVMGFEHVVPTHQGRGAENVLYGVLLEDGDVVPNNSHFDTTRAHVVNQGAEPVDCPSPASRDPNSTETFKGNFDIDAGYALVEEVGADAIPVVVLTITNNSVAGQPVSMANIRATAEFARDIDAMFVIDACRFAENAHFIKTNEAGYENHSVAEIARAQFEHADAITMSGKKDALVNIGGFAAMRDETVFEHAKQRAILYEGFPTYGGLSGRDIEAMAVGLREAVTPPYVTDRVEQVAELGDLLVEAGVPVYQPTGGHAVYLDAGEVFPHVPKDEYPGQQLVCALYLEGGVRGVELGGFAFPGTDRPDLVRLALPRRTYSREHLEHVAETAAKVMASAGEYGGLEIVEEPPMKELRHFSARLEPVSN
C441_RS14695 metallophosphoesterase family protein 5384 4725 MRLGVISDVHGNLPALEAVLESMPPVDRLVCAGDVVGYNPWPEECVSELAERDVPTVSGNHDRAVTSGTGFRFNSMAAAGVDYARDALSPASMEWLSALPNTRHVADERVKLAHGHPSDPDRYTYPDDFSPSLLDDEDLLVLGHTHVQGHRIYDDGIVLNPGSVGQPRDGDPRAAYAVADLDAMEIEAERVNYDIDRVAERVREVGLPERLASRLYEGK
C441_RS14700 IMP cyclohydrolase 6005 5433 MYVGRFVVVGPGIGAYRVSSRSFPNRQVVDRDDALTVGPTPDAPETDNPYIAYNCVREGGDYVVVGNGSQVDPVAEKLDLGYPPRDALAEILLALDYEKDDYDTPRIAGVVGDDEAFIGIVRRDALVVEAVTEPTLVATYEEDDPRAFDLDAETAADAARELYDHEFEHAVCAAAATVGDAVETAFYNGE
C441_RS14710 KEOPS complex Cgi121-like subunit 6866 6372 MRLVEAEATVANLDSFIAAVGEIADETGATVQAFDARYVVDREHLERATELADRAIARGNEIARDRAVEILLYASGRRQINRAFEIGVSEGTLPVVILVDGGDEAEAEAALFDRLDLEPAETLGDYDEALVRDVFDVGEAELRVADGDLPALVRERVALLAVDR
C441_RS14715 ATP-dependent DNA helicase 9352 6863 MRTADLTGLPTGIPEALRDEGIEELYPPQAEAVEAGLTDGESLVAAVPTASGKTLVAELAMLSSVARGGKALYIVPLRALASEKKAEFERWEEYGIDVGVSTGNYESDGEWLSSRDIIVATSEKVDSLVRNNAAWMDQLTCVVADEVHLVDDRHRGPTLEVTLAKLRRLNPNLQVVALSATVGNAGVVADWLDAKLVESDWRPIDLKMGVHYGNAVSFADGSQREVPVGRGERQTPALVADALEGDGEGDQGSSLVFVNSRRNAESAARRMAGVTERYLTGDERSDLAELAAEIRDVSDTETSEDLAKAVAKGAAFHHAGLAAEHRTLVEDAFRDRLIKCICATPTLAAGVNTPSRRVVVRDWQRYDGDYGGMKPLDVLEVHQMMGRAGRPGLDPYGEAVLLAKDADARDELFERYIWADPEDVRSKLAAEPALRTHLLATVASGFAHTREGLLEFLDQTLYATQTDDPERLGQVTDRVLDYLEVNGFVEFEGETIRATPVGHTVSRLYLDPMSAAEIIDGLEWAADHRTEKLRALAGETPEKPERDRSDSDDEPGGFQRASEMVADDGGGGAGDDGDGADDTDGGGGDADAFETDRTYPTPLGLYHLVCRTPDMYRLYLKSGDRETYTELCYEREPEFLGRVPSEYEDVAFEDWLSALKTAKLLEDWVGEVDEDRITERYGVGPGDIRGKVETAEWLLGAAERLASELDLDSVYAVREAKKRVEYGVREELLDLAGVRGVGRKRARRLFEAGVESRADLREAEKSRILAALRGRRKTAQNILEAAGRKNPSMDEVDEDDAPDDAVPEDAGFETAKERADQQASLGDFE
C441_RS14720 ferredoxin 9517 9762 MKVEYDRDTCIGMFQCVDEWEGFEKNVDDGKADLVDAEETDDGVFVREVPEDAEFDAKFAARVCPVEAIRILDDDGEQLVP
C441_RS14725 hypothetical protein 10014 10859 MFETRTLPSDLESVRDEYAPGALVLDVAGDFDTVPPEAAENLGLVVESLSPAAYPAEWLPDDAPQQLRRYASSDFTIGMPGDGTVTWSRQTTPPVVLVKYRAKGTPDDFLDFLVAEAFVQAGTDEIPEQFLPFFGERYRDLAAATPLGPSETYQVAAALYEGWVGLHTREAFASWEGRYDRLHEAWVDAGGRLDDRLENLPRLVALGRLSFAEATEFACSAVKHERDLPAPFSALDTAAYRDHGPSYAVKWAEKTFEQLAADDDAESGGDADPTADDADSA
C441_RS14730 hypothetical protein 11459 10881 MDYELAIEDAPETIPGGTGILLLHPSIGETDRIDTDFFKVDTDHFLVISTRTTAREVEQKLEHYEVDESSATILDTLSIERGYSRRSSDNIHYVASPDDLDAIVEKTRQFLEGHDGKLRLSVDSVTEMAYYADEDGAFEATKQILELLDEFDAVGLFHLSKEVHDQETLDRFRELFDGVVDLNEDGTVTTEF
C441_RS14735 DUF2391 family protein 12011 11553 MVGVRRRFALADTAQQVVGGFLLAGPFVVTEEVWVLARSMSFAQALLTLCIVLAVGYGALYKADDRDPDREREVGGIPVRFISLISVSYLSVFILALAFDAPGTFLSDVSGQSLVTVLGYEVDLAVLRITLKATSVGAVFSVIGAATADSLF
C441_RS14740 class I SAM-dependent methyltransferase 12138 12818 MSVRDEFDAWAADGRDKGMEDRHWHTAKHALARMPVEEGDTVVDLGTGSGYALRALRDTKGIGRGFGLDGSPEMVQNARAYTDTDDLSFLVGDFDDLPFEDDSVDHVWSMEAFYYAADPHHTLEEVARVLRPGGTFYCAVNYYEENVHSHEWQDLISIDMTRWSHAEYREAFRDAGLHVAEQDSIADLDIDIPAAAEFPTDDWETREAMVERYRTFGTLLTVGVAP
C441_RS14745 ABC transporter ATP-binding protein 13985 12849 MRLELDAVSKRYGSATALDSVSLSVDDGEFFTLVGPSGCGKTTTLRCIAGFEAPTEGAVRFDGESMAGVAPESRGVGVVFQNYALFPHLSVGENVAYGLRFTDPPGGGSRDDRVAELLDLVDLAGFEDRDPDSLSGGQQQRVALARALAPGPDLLLLDEPMSALDARLRDRLRRQVKHIQSELGVTTVYVTHDQSEALAVSDRVAVLNRGRVEQVGDPRELYHRPRTRFVAEFLGENNVLDAVVESRPETGGIRVGVGDAVFTLAEGRRVLGRAGAVSGERAENDAKDGSRPARPEPGDELTFCVRPEQLRVGAGTNQIRGTVVDTEFQGATTRVRLDWGETELVVAVDDGGEQSDAGFEIGTTMEVGFDPAAAHIVE
C441_RS14750 iron ABC transporter permease 15822 14101 MSRVSASVRRAARALEQRLLTLVAALTAVVLLVLFYYPVATVFADAVLADGRLTVEPIAAVLTSEFYLVDIIWFTAKQAFYSTLASLALGLPAAWLFARFEFRGRETLRSLTILPFVMPSIMVAIGFVATFGRNGTLNRALSLVGLPPVELLFTLEAIIVAHAFYNAPLVARVVTAAWESVDARTVETARSLGAGPRRAFRDVVLPQLLPSIGVGATLTFIFTFASFPIVLALGGFQLATIEVFVYSRVRDLAYAEAASLAVVETAVSITLTAVYLRYEASQRSAGGAANPLPRRSVLPSDWTPKSTLRTVGIAGYALVVGLVFVVPIASMVLASVTGGDGGFTLANYAFLAERQATGASFQVKPLPAILNSLGFAAGTLVVAVPMGVTMAVLTTRRYRGRGLIDVLSMAPFAVSGIVVGLGLLRGLVFGVDVLGTRIRVTGALAIVAAHAVGAYPFVTRNVAPLFARLDGRLVESARSLGATRTRALVDIELPLVWTGVVAGAAFAVAISIGEFDSTIILAEGAGSYTMPVAVERFLGRRLGPATAMGCLLLLVTSASFLVVDRFGGRWGEL
C441_RS14755 thiamine ABC transporter substrate-binding protein 16974 15832 MRRRSFLKAAGAGGVSALLAGCAGTGGDEATTEASGDATETTTTTQGTTTGGESPTLTVGTYGSFVDAPSSSPGPWLKEAFESEFDATLEWQTPDSGVNYYIERALRGVESGADLYVGLDAQMLVRIDENLDDALFSPAEGLSRLGDVNEQLNFDPQGRAVPYDTGYISLVYDETYGDGDGDFVAPETFDGLLESEYAGALLAQNPTSSATGQAFLLHTIDAKGEDGYLDYWASLKENDVRVLGNWEDSYNAYSNGEAPMVVSYSTDQVYAAESGEDMARHQIRFLNDEGYANPEGMAPFADASNPDLAAEFMDFVLRPEVQAEIAVRNVQFPATTTAELPEEFAQYAQEPPEAVTFSYDRLQNNLSDWTDAWAREFASK
C441_RS14760 AI-2E family transporter 17103 18185 MTASRPYVFGGVLALFALLAAVMLANVLATVFFAITVAYLLVPLRRGLEARGASRWVASLAATVVAAVGVVVVLAPLFVILFLRLSDILELAALLPDVVTVEFLGMVETVTLDDVVAVGLGLLQSVGRAAATAAPVVLIKLTLFGFLVFALLLSGDAVGRTLLALVPADYRDAATALNERARETLFAIYVLQAATAVGTFAIGLVVFWALGYDYVVTLATVAAVLQFIPIVGPSVLLAAMAAYHVAVGDLVAAALVVAVGGFAVAWLPDILIRPRLAKETADLPGSLYFVGFVGGLLSLGPVGIIAGPLVVALLVESVDLLGAELRVDGGGGVGGGGDDGDESDGTAGTEGGDAADSLQS
C441_RS14765 sulfurtransferase 18971 18195 MVDVVSSDWLADRLGDVRVVDVRDGWEFDGIGHLPGAVSIPFDSFRSADGDVGMLPGRDAWTDLLSGAGIAADDDVVAYDDTHGVFAARFLVTALLYGHDPDRLHLLDGDFSAWNRERETTTEATEVAETVYEITEPAETPLVDFEAVEAALDDPETVTVDTRDPAEYEEGHLPGAVNVDWRELVDDETRGLKPRDELDSILEAAGVTPDRRVVLYCNTARRISHTYVVLSHLGYDDVAFYEGSLTEWEARDGAVVED
C441_RS14770 sulfurtransferase 19223 20083 MSNSDYAKDVLVSADWVESHLDEFQSDDAAYRLVEVDVDTEAYDESHAPGAIGFNWESQLQDQTTRDVLTKEDFEDLLGSHGISEDSTVVLYGDNSNWFAAYTYWQFKYYGHEDVHLMNGGRDYWVDNDYPTTDEVPSFPERDYTAKGPFEDIRAYRDDVEKAVDKGLPLVDVRSPEEFSGEILAPPGLQETAQRGGHIPGASNISWAATVNDDGTFKSADELRDLYADQGIEGDESTIAYCRIGERSSIAWFALHELLGYENVTNYDGSWTEWGNLVGAPVEKGN
C441_RS14775 short-chain fatty acid transporter 21579 20137 MTNAIRQAAERSSKLVEQYLPDAFLFAIILTGVAFVLALVSVAPGEGTGVVGHAGNLLLDGWYGGFWNLLSFGMQMTLILMTGYALAQTKPVDWLLTRLARVPNTERGAAAMVPVVAAAASFVHWGLGLVVGALFARKIATEMRGIDFPIVVAGAYSGFVVWHGGLAGSIPLLLNTEDNFLIEAGILDTTFGTGGTIFTVANLALVVAVGFLFLPALFALMYPTDETKKTPIDPAEFEAATDGGEQTTSGWASTAVPEDASLATRIEHSLGIGVAIGLVGLLAVALYFFEGVQNGTMPWNNLNLNIVNFGFLFLGLLFHGTPKAYIEAIVEAVENVWGIILQFPFYAGIMGIMAYAPEGSVSLATQIAQGMVAVAPDGTLPAFAFFTAGLVNFFVPSGGGEWAVIGETLVTAAKASGESIPRVAVAASWGDAWTNMIQPFWAIPLLSISGLSVRDIMGYCVMVLLGAGVLVAVGISVLPM
C441_RS14780 hypothetical protein 21766 22365 MTELLDTLRDDHETPLSRLGSSKALYAVTGGEMDGDAVRAAAAAEAAAAADLFDWWADDEPNDEAAALFSDLADTARDHADTVDAEPDGSKPNVYDVLAEFETTDGRLGGALARALVSLKTVEQMVGFFVGDADPMAANDFRTLKSDLNDQLDALGAAVSDLVEDDAVAREAADAVVEAAYDEYVETLEGMGVKPKNVC
C441_RS14785 hypothetical protein 22772 22503 MTRAELERASNLLKEAAEATDGDVQERLYEQSDQLAKLATREQGPDHGRLARHMTVLHDLAEDLDGDVAETVREARSEVLEYRKGVPGV
C441_RS14790 excinuclease ABC subunit UvrB 22861 24963 MSESGPLSADRPDADREFRVDAPFDPAGDQPEAIEALARGFREGADVQTLLGVTGSGKTNTVSWVVEEIQKPTLVLAHNKTLAAQLYEEFKGLFPDNAVEYFVSYYDYYQPEAYIEQTDTYIDKDMSINEEIDRLRHSATRSLLTRDDVIVVASVSAIYGLGDPKNYTDLSLRLEVGQGMDRDELLGALVDLNYERNDVDFRQGTFRVRGDTVEVFPMYGRYAVRIEFWGDEIDRMLKLDPLAGEVKSSEPAVLVHPAEHYSIPEAQLEGAISEIEELMEQRVKHFQRQGDLVAAQRIEERTTFDIEMLRETGHCSGIENYSVHLSDREPGDAPYTLLDYFPDDFLTVIDESHVTLPQIKGQYAGDKSRKDSLVENGFRLPTAYDNRPLTFEEFEETVGQALFVSATPGDYEREHSDQIVEQIVRPTHLVDPRVEVTEATGQVDDLMARIDERVDRDERVLVTTLTKRMAEDLTEYLEEAGVDVAYMHDETDTLERHELIRSLRLGDIDVLVGINLLREGLDIPEVSLVAILDADQQGFLRSETTLVQTMGRAARNVNGEVVLYADEMTDAMEAAISETQRRRRIQQAFNEEHGYTPTTIEKEVGETNLPGSKTDTRGVSGDEPADADEAVEQIAFLEDRMQEAADNLEFELAADIRDRIQTLRREFDVDALEDGVAPEYPDEHDGDGDGDGGLAPPDEF
C441_RS14795 NUDIX hydrolase 24970 25389 MSRPPTLTFAAGGLLRRDDGRLCLVHRPRYGDWSLPKGKLEPGETLVETAVREVREETRCAVDCGGFAGRYEYRVPDDAGTRSGPKGVFVWHMRVAAEHPFEPGGEVDARRWVTPAEALDRLTYETERALVRRAFELNE
C441_RS19135 hypothetical protein 25589 25783 MVRGFEHSVLDSSGDHRTPVHPVCDAVASSGDGEHARVGAAGDSANGLFSTRHSTRVHADATAY
C441_RS14800 sulfite exporter TauE/SafE family protein 25798 26880 MSSQSLSHVQKSFLKYQHILVFIAPLLFLASVLTMAPTPSGAGMEYWLQYWWLFPVFLTGATIVNTVGISGSALFVPFLIFIFPIFAHPLDSSTLVKVGLISEAFGLSSSAVAFIQYGLVDRRLALTLVGGSIPFVVGGALLSFVIPDVVFHALLGIALLAASYLLFTADLGHEDHGSSGSDHAAATDGGVSASLPNDPGKLGPAGVNTADDGTVTRVDREGDDYTYTRGGYLRRFANYSVGGMFQGLAGFGIGELGIISMLGTKVPVRVAIGTNHIVVALTAILASLVHVFGGGLVGGHSLSLATTPWNMVVFTVPATVLGGQIAPYVSNALETSVIKNFVGVLFAVISVALFLMALGI
C441_RS14805 universal stress protein 26949 27389 MYDHILLPTDGSDATDATIEHAATLAETYGATVHVLSVADSRNRFESPSAGIAPDVWEKSELERAESAADAAIEALPDGVETERTVVEGVPHSTIVDYAADGDIDLVVMATHGRTGLDHYLVGSVTERVVRQSDAPVLTVRAADEE
C441_RS14810 acyl-CoA thioesterase 27903 27409 MQQQSSVGTRSLSASRAEMSEILMPNDTNNLGRALGGRILEWMDVCGAIAGRRFAERQVVTASMDHVDFLAPIDVGDVVTVEAYVFDTGRTSMDIKVDVTAERPSEGDRRETLTSFFTFVALDDDEKPVPVPDLVCDSQEERELRDSALQKRREHRAALAEEFE
C441_RS14815 hypothetical protein 28053 28328 MSARTADRWWDADTRLSSAFGGTFPKVPTLVRPDGWGISRDPNSTHRAVSRLTSASRLAEPRRPDDGTHVGRPRRPATTLYRPRCAERVTD
C441_RS14820 SPFH/Band 7/PHB domain protein 29552 28350 MDSLFYQLGKLSASSSSSSSGAARVDSEPGRIPRWLGPLAVAVVVFGLAVVVFPVTPLTLAGYVALALAVAAVYDAVEIVQAYEKRTLTVFGDYKGILEPGLNVVPPFVSKTYRFDMRTQTLDVPSQEAITEDNSPVTADAVVYIRVMDPERAFLQVDNYRRAVSLLAQTTLRAALGDMELDDTLARRDHINARIRRELDEPTDEWGVRVESVEVREVKPSKDVENAMEQQTSAERRRRAMILEAQGKRRSAVEKAQGDKQSNIIRAQGEKQSQILEAQGDAISTVLRARAAESMGERAIIDKGMETLANIGTSPSTTYVLPQELTSLLGRYGKGLSGSDIQQAAGLESKAFDEETRELLGLDDISEILGELDGIETADISADAVDIEIEDGGVETERVE
C441_RS14825 hypothetical protein 29645 29839 MRTNRGRIDVEDLLKIILLLVLVWLVLEIIGEVLGLFGALLGPLQPLLGLVVAALIVLWLLDRI
C441_RS14830 2'-5' RNA ligase family protein 29895 30392 MYSLNVPVPGRVARLASDLFPYLASFDRVRDRHTLVCKRFEDDDLDRLREQLRHALAGQPAFEARVTDIRFFEDPPRGAAPVVYLAVESPGLLDLHRALADEFGAIEGIEGDDYVPHVTLARGGSVADARAVASQPLDPIEWTVSQLDVYDSSFRETAASISLPA
C441_RS14835 cytochrome P450 31820 30444 MSSTPPGPKGLPLFGASRQYARDPFTFLTAVADAYGDVVHFDLGPLDTYMLTNPADVETVLVSEASKFRKPQFQDRAIGDLLGDGLLMSEGATWKKQRRLAQPAFDVRRISTMAGMMTDRTASMLSSWGDGDVVDVQLEMARLTVEIIVDAMFGTDLDDERVRRVQENLEPLGARFEPDPLRFLTPDWAPTRENRQYKEALSELESLVWDIVEERRGTEYGETPASSVSAGATGEEGPMDLLSILLRAYDEGEQTEKNLRDELMTMLLAGHDTTALTLTYAWYLLSQHPEAEAKLHRELDEVLDGRTPTFEDVRELEYTERVLNEAMRLYPPVYVMFREPKVDVRLGGYRVPAGSAIMLPQWVVHRSDRWWDDPLSFDPDRWAPERTGDRPRFAYFPFGGGPRHCIGKHLSLLEGRLILGTVAQRYELDYVRDEPFSLRGSLTMHPEEPMGMRLRARD
C441_RS14840 ATP-dependent DNA helicase 34103 31899 MAHANGAPDDGDADGAWRFFPYDEPYPNQEAAMAGIADALDDERNVLLEGATGTGKTISALVPALSYAREHDKTVVITTNVHQQMRQFVEDARAITREEAIRAVVFRGKSSMCHIDVGFQECQTLRDTTRSIVEKESDKAELSEQAQSLLDGMREGESGAADARSAVTDELDALDDELEELKEGNYCEHYYNNLTRNTDQFFQWLFDDVRTPDEIFEYAGKQNLCGYELLKEGMEGIDLVVCNYHHLLDPMIREQFFRWLDRDPDDVITVFDEAHNIESAARDHASRALTENTLESAMNELEDEDDSRAESARNVIGTFLDALRDSYEEAFGFGEREQVGENWYDLSIANQGRRDDLTMEFLQSYEGRGIDVEVELALQLGKQLDEQYEDAYKNGEATTRKECQTLQAANFIADWTEMGGELGRHPMLSVRRDGGTDEIYGRAELYTCIPREVTRELFEEVHASVLMSATLRPFDVTEGTLGLENPVTMAYGLEYPEENRRTFSVSLPALFSSERDDPGTQETIEEMLADAAAFTPGNTLVFFPSYAEAERHHERLRTNPDVDAELFLDEPGVRAEELRREFVGKDGAVLLTSLWGTLAEGVSFDGDDARTVVVVGVPYPHLSERLEAVQAAYDRAYRGKRDAGWRYAVEIPTIRKTRQALGRVIRAPDDFGVRVLADKRYTRESSSMGKYGVRGSFPVEERAEMVDIAPNKLKFAMLNFYADRDAYDGDPPRP
C441_RS14845 ester cyclase 34254 34718 MDPIPTAEMKATVRHVEEAVWNGRDPAAFDELATDDFVMHDPMGDRGVDGAREMVQAVLDGTPDLEFTVDDLFAAGDRVAVRYTLEGTNEAPSYLTAEPTGNHWQSSGITIYRFEGDRIAEQWDAFDYFGTMRQLGLIPSEAETDAEGSAGAPA
C441_RS14850 ornithine carbamoyltransferase 35631 34732 MLETTHFTDIDDISASELDRVLTHAADIKSGDDETQLTRATLAMLFEKPSTRTRVSFETGMTELGGHALFLGPEDIQLGHGEPLSDTARVLGRYGDAIMARLFEHEDLLEIAEHSDVPVINGLTDDAHPCQTLADLLTIREHVGDFDEVSAAWVGDGNNVGQSFVLGCAMAGIDLTVATPPAYGVDDDVLEKADELGSAPTITTDPEEAVSDADVVYTDVWISMGQEDQRHEKLQAFEGFQLNEELLSGTDAKVMHCLPAHRGEEITGDVLEGERSLVWDQAENRLHAQKGLIVELLEE
C441_RS14855 [LysW]-lysine hydrolase 36843 35701 MNAEIEREDDEADVENAGAADAETDASAVSDDGTDALDEGEWAEARHLLYDMVSTPSVSGEEEAAAEVLKAFFEAHDREVWIDEIGNVRAPADDAVLLTSHIDTVPGDVPVKIEDGVLWGRGSVDATGPLCSMAAAAVETGVSFVGVVGEETSSRGAWHLVEDREEPDAVVNGEPSGWDGVTLGYRGFLSGTYISTSELGHSSRPEENAIQSAVAWWSRVADFFDEERDGVFDTVTTKPVRFDGGPSDDGLAVEATVDVQFRVPPRYTIDDVREVAESELTRGGVHWNKPIPPVMMSPRTDVARAFRVAIRNVGGVKPRLLRKTGTSDMNIFAGTWDCPMATYGPGDSDLDHAPNEHLDLAEFDSAIDVLVDVCERLADD
C441_RS14860 acetylornithine/succinylornithine family transaminase 37976 36840 MSGFVFNEKPIAIESGEGPYLYSGDGTEYLDFGASYAVAALGHSHPAVTSAIQEQAAKLTYVQASYPVEVRTELYEKLATLAPGDISNVWLCNSGTEANEAAMKFARSATGREKIVATKRAFHGRTLGSLALTWKQKYKKPYEPVAGGVEFVSYGDEAELAEAVDDETAAVFLEPIQGEGGINPATAEYLQTARDLTEDAGAALVFDEIQTGIGRTGSLWACENAGVVPDILTSAKGIANGLPLGATLCADWIADGAASHGSTFSGGPVVCAAANATLDTIVEEDLPGHAAAVGDYLTTELEAAVEEHDLPVREVRGDGLMVGVEVKRGANRTLKHLALSEQLLALPAGRTVVRFLPPLVIEEEHADRAVDAMTNVLS
C441_RS14865 acetylglutamate/acetylaminoadipate kinase 38905 37973 MTGYTREELLAAHEQLVDNEANLLTDGGKEPPVVVKIGGAKAVDPKGAVSDVAHLVANGTDVVVVHGGSTAVDETLEELGEEPTYVESPSGVSGRFTDERTMEVFSMVMPGKLNTDLTALFREAGVDALGLSGVDGGLLTGPRKSAVRVVEDGKKKIKRGDHSGKITSVNATLLETLLDGGYTPIVTVPMLADDGVPVNADADRAAAAVAGALGAKLVVLTDVKGVYADPDDESTLIETADTPEGFSALESAAEGFMTKKVMAAKEALDGGAAEVVVSDANLNDPIVTALNGGGTHVTPGALVEAEGAEQ
C441_RS14870 N-acetyl-gamma-glutamyl-phosphate reductase 40006 38963 MSDRLTAGVVGGSGFTGGELLRLLDGHPNFDVEQATSRSYERKTVGHVHPNLRHLDLRFTSPEDLESVDVLFTATPHGVSMEHIDAFQDAADTVVDLSADFRLSEAAQYDEWYDGHVCPEYLEQSEYALPELNRENLPGADLIAAGGCNATATILGLKPLFDAGILSGDEQVVVDVKVGSSEGGAGASKASSHAERSGIVRPYAPTGHRHEAEIEEYLGLSVSFTVHAVDMVRGAAATCHVFPDGPVSKGDMWKAFRGSYGDEPFMRTVAGGGGVYRYPEPKSVAGTNFGEVGFEIDPGNRRLVVFSAIDNMMKGSAGQAVHAANIALGLEETAGLDFTGFHPIGSP
C441_RS14875 lysine biosynthesis protein LysX 40878 40003 MHVGLLYSRIRRDEKLLLNELRDRGHEVTKIDVRKEQFDLTEPPESFDGLDVVVDRCLATSRSIYITRFLQSYGIPVVNSHETADICADKAKNSLALADAGVPTPNTKVAFTVESAMEIVEEFGYPCVLKPVVGSWGRLMAKIDSESAAEAILEHKSTLGNYEHKVFYIQEFVEKPGRDIRVLAVDGEPVAAMVRSSDHWLTNAAKGASVDEFELDDRAKELVKQASDAVGGGLLGVDLMETGDDYTVHEVNHTVEFKALNDAVETDVPATVVDWLEAKVDGEQSLAEVSA
C441_RS14880 lysine biosynthesis protein LysW 41059 40880 MSDTITAEDPLSGEEIELPSDVEVGEIIDSPATGAELEVVSLDPVTLEEAPELEEDWGE
C441_RS14885 argininosuccinate lyase 42807 41341 MAGEDGDSEGVIRRDRFSGGPARGFMSSLAADERIFEADLAVDRAHVVMLAAQDIIEAEVASEILAALDEVEAAGHDALSGGEDVHEAIEAAVIDIVGPDGGKMHTARSRNDEVATCIRYRLREDVLSALDAALALRESLLETAAEHTETVMPGYTHLQPAQPTTVAHFLCSYERAVARDCARLMCAYERINQSPLGSAAFAGTPFDVNRELVADLLGFDRVMENSMDASATRDFLAETLSALTTHAVTLSGLAEDLVIFSNKGLVELSDDYSSTSSIMPQKKNPDTMELVRAVAGDAVGELTGLLTTLKGLPRAYNRDLQRAHKHAFRTVDDVAEAAAVAAGAVGSATWPEGELAAAAGDGFSTATGVADLLAMAGLPFRTAHEVVAEAAARSEGTPDVATLDAVATDVLGESLFTHVTAEAVEAALDPTESVASRDSVGGPAPAAVEATLSTARDELSDDAAAVADARDSLAAAAEGLDEEVSSYV
C441_RS14890 argininosuccinate synthase 44026 42809 MKKVALAFSGGLDTTVCVPILEEEYGYDEVVGVTVDVGQPEEEFEEAYETAEALGLEHHVVDAKQEFAQLCLDSVCANADYQGYPLGTALARPVIAKAILELAEEQGCDGIAHGCTGKGNDQLRFEAVWRASDLEVIAPVREMGMTREWEIEYAAEKDLPVQGGNEGVWSIDTNLWSRSIEGGNLEDPGYVPPEDIYEWTDQPSDETELIEIAFEDGYPVAVDDEELEPLELISLLNEKAGKHGVGRTDMMEDRMLGLKVRENYEHPAATTLLNAHEALEGLVLTKEERDFKATVDNEWSQKAYEGLIDAPLVGALNAFVEKTQERVTGTVTIKFEGGQARPVGRESEYAAYSESAASFNTETVDGIEQADATGVAKYHGFQARLANQSAKKQKPELAADGGSDE
C441_RS14895 cation:proton antiporter 45763 44549 MAAALLEFGYLFAVLAVVGAVALRLGLSVIPLYVVGGVVAGPYVAGRFGLPYVPNGEVVTILAELGIVLLLFFLGLEFSLDRLRASGAKIGRAGVIDLAVNLPIGVAIGLVLGWSPVEALLLGGIVYISSSAIVTKTLIDLGWIADPESEPILGTLVFEDLAIAVYLAVVTSLVLGGDGGVAAIGRSLAIAFGFLGLLFVAVQYGTALFARVLDVENQEAFVLRALAVVVPISGAALALGVSEAVAAFFVGMGFSTSGHRERLEHLLVSVRDVFAAVFFFWIGLGTDPTLLAAAAVPLAIAVVVSTPSKVLSGYLGGRAYDLSAHRSLRVGVGMVPRGEFSLVIAALAAAGTTPVMREIIPAFAVGYVFVMSALGTVLMQRSDLVERLVFRGGAADGDASTESA
C441_RS14900 TrkA-C domain-containing protein 46248 45763 MTVYESDLPGVGKKHEVELGDGSRLVIVTHNTGKREVFRRASADSDSEKLFELTDKLARQVGTLLEGAYFQPVQTETIETLLGDNTLIEWVEVGADSDIAGKTLGESDLRQATGASVIAIERGDEVITSPGGDAMVEAGDTLVVIGPKTACRDFVALVKGT
C441_RS14905 hypothetical protein 46894 46292 MGRPSTAEVKRRLVHASGSGMPLLYLLGLVEWRTLGYLFVFLAAVVSVLELLRLFGGLQWAVYDELTREYEQDNVAGYALYVYSQTAVALVFGPHIAVPGMLMLTIGDPISGLMGSAPVGELKSARTLAAMFAVCFILAAPFVIPVSGVAAGGLAAAAGAAGATLADGAKPVVAGYVIDDNLSIPPVACTAIAVTLWLLA
C441_RS14910 glycine--tRNA ligase 48685 46901 MSEGDESAALTELAKRRGFFFGSSEAYGGVGGFYTYGPQGAALKSNVEEAWRERFAVQEGNLEIEAPTIMPEPVFEASGHLDGFDDMLVECAECGESHRADHLIEDNSELEDAETLSPEEAAEKIADLGLVCPTCGADLAGQSVEDFNLMFETNIGPGSSTPGYLRPETAQGIFVEFPRIKEYARNSLPFGVTQVGRAYRNEISPRKSIVRTREFTQAELEHFIDPERDEADLSAVEDVEVLLYPATEQEADDGDYVETTIGEAVEDGIIGNAWLGYFLGIAQEWYETVGVDMDRFRFRQHLAGERAHYSSDCWDAESEVDGDWIEIAGFSYRSDYDLSKHGEYGDDDFTVFQQYDEPKTVERAVVDPDMATLGPEFGARAADVAEALETLAERDPDAFDADEVTLDVDGEEVTVDTDVANFSVETQTEAGEHITPHVVEPSFGIDRTVYTLLAHAYETDEVDGEARSYLSLSPSVAPTNVGVFPLVSNVDELVDLADDVAEELRAAGFAVVYDDSGSIGRRYRRQDEVGTPFCITIDRDGLEGDGANTVTIRERDSGKQVRLPVDDLVGTLVGLRAGTASFDDVLDDNEVVEA
C441_RS14915 CBS domain-containing protein 49532 48678 MKVADAMTRGEEVVTVSLPGTRDDVLEYLQERGFSSVPVVKETDEGTKYRGLISREDLIEHPDEDQLAVLVREVPTASADDDLEAVAATMVSEGARRIPVVDGDAIEGILTVTDVVRAIARGDIDGETEVGDLATRDVNTTYVDVPLHIVEREIFYANVPYAVVLDDEASLAGIVTEVDIIEVARVVEGEAGTGDSVANQDDEWMWEGIKAVGNRYIPTRNVEIPAEPVSKFMSDDLVSVATRVTAKDAAQTMIREDIEQIPLVSGDELIGVVRDVNLLEALYE
C441_RS14920 hypothetical protein 50255 49665 MAELGDDETPEVERIAGEADRRREAWGATLDDMAAMADDFEDEGWTTVRIAAGDSGPFGPSSGKGDEGAFGLAYVIPGDKAETVAELFEATTFPEYEVYRAENDGRVYIVTALFSPETETAVFIAGAWDLREALECATVAVEEGRMYSYLQKLDGTIVGVVEHDDPEKFFPNLDAIRRYAPNGGDGDGDGETDDGN
C441_RS14925 hypothetical protein 50762 50316 MEFDPRRFGVKAIDAVAYAVALTGVVFVVTAVVSTLAGNGLPGAKWLMFFLGFAMFGYSSLKVRPKAAWKRGDGSGGTGGTDGGLFSGSDEPVGVEKLVGDALDAILPPRLRPATGDERPSSGVKLLLASVCILAASYAMEVVFEINY
C441_RS14930 ABC transporter ATP-binding protein 52099 50765 MSTTPTEEERPTTARGETILEVNDLKTYYEDGGLLGSNPVKAVDGVSFDIQRGETLGLVGESGCGKSTLGRTLMRLEEATAGEVKLNGTDITTLSGSDLKQFRKDVQMVFQDPDSSLNERMTVGEIVREPLDVHDWKTPSERRERVRELLETVGLQEQHYYRYPHQFSGGQRQRIGIARALALEPEFIILDEPVSALDVSVQAKILNLLEDLQNEFGLTYLFIAHDLAVVRHICDRVAVMYLGNMMEIGPADELFDEPANPYTHALLSSIPEPDPTVERDRITLRGTPPSPRDPPAGCPFSTRCPVKIRPAAYRDMDPDVWERIEIFREVLRERSRADPSLSDRVRKLLGRESHRAGMDEIIPELFGDLDVPSDVMTHVREAADYAENDDEDAARTYLREEFGSVCDHERPEQLSVGDRGRLSLCHRHDAEYEEPATVFETLVR
C441_RS14935 ABC transporter ATP-binding protein 53175 52096 MSTEQARSTRTGGEPLLSVENLRTSFYTDKEVIRAVDGISFDIFRGETVGIVGESGSGKSVTARSIMRLVENPGRIENGRIMYDGEDLLDKSPKQMRSIRGGSIAMVFQDPLTSLNPVYTVGNQIKEALRLHRGLSGSKATKEAIELLEAVGIPDAHRRVREYPHQFSGGMRQRAVIAMALACDPELLICDEPTTALDVTIQAQILELLEELQEERDLGIMFITHDMGVIAEIADRVNVMYAGEIVESAPVVELFESPKHPYTQGLLNSIPGQGLDENERLATIEGDVPTPNEDPTYCRFAPRCPKAFAECDTVHPEPVDVGDGAGDHRASCLLYPEDLPTEEAVAVHREGGQRGGDTR
C441_RS14940 ABC transporter permease 54611 53172 MSTTTETEIPLRQRVAENPQPALIWAAVGAVLIGVELGAILQVVGAVAGVVVNLLPGDPGASAVQSLQAAFNAVPTLVSRDVIPNQGYWNGTGYENTFLGLSPAVAWFIRVAFVYAYAFAVLAWAWNGYTRFRRHYRRVDWTPRDDVVNRFRGHSWGKFGLVIVITFVVMAVFAPALGPTTMERNILQPYDYEVTYWSEDTQSTETVLVGEANLGSGSQGAGDENVAPLTYDDYGRFHPFGTATNGKDLFTQVVFGARISLTIAVVAMGIAGLIGLGLAMITAYYKGLADLATVLVSDSVQALPVIMVLILMLVIFQNTWVRELYDGAVLIILIFSVVYWPFIWRSIRGPALQVSEEDWIDAAKSFGQTPTQVMRKHMAPYVFSYMLIYASLSLGGIIISVAGLSYLGLGITAPTPEWGRLIANGQQYVASPSWHISLVPGLLITLVVTGLNAFGDGIRDAIDPQADTGDEAAATGGGA
C441_RS14945 ABC transporter permease 55667 54618 MSRWQYFLRRVLMSIPVVIFGTTITFALIRLGPLDPVSAILGTQYNPQAAEQIRTNLGLNQPLWSQYLDFMYELFTFQLGQSWVIAPGTTAYELIEIYAPRTIWLGFWSVLIALFVGIPLGFYAGLNPNTPSDYVASFGGIVWRAMPNFWLAIMLVTALSQLGTWTNGLFTWQTWIVRTNVVTPPALGFFQSPVEQFLADPGGWTESFIRATKQIAPAALVLGSASMGNEMRIGRTAVLETINSNYVETARAKGVSGRSLVWKHIFRNALIPLVPIITGEAFLLLGGSVFVETVFAINGLGWLFFNAAINGDLPLIGTLMFIFILILVGTNILQDFLYTIIDPRVGYDG
C441_RS14950 ABC transporter substrate-binding protein 57705 55879 MPDTNKLSRRRFLKATGGAATAAALAGCTGGDGEETTTESGGGETETTQEDTGTELSGSVFNRILDGTITTMDPVAATDTSSGILIQQVFDCLMSYQNALPTVENELAADYTVSDDFTTYTFELADATYHNGDQVTASDFIYAWERLAASENSRRAYFILDSVGVEHEEDDEGNYVPGTLGLEVGESESELVVNLAEPFHDTLEMFAYTSFAALPEGILGDIEEYDGEMEYTEFASNNPIGAGPFEFAFWEQGTAAAVSKYDDYYGQVAQVDNVRWQVIEDDTARYNYAMNENADYFGLPTAQYDPGLVQVESTDEYGREVGQYGPVRNGKTLNYVGVPTLSIYYVGFNMEKVPKAVRQAFAYVLNQDQMVSEVFKGRGSPAELFTPPTIFPGGAQAASDLVSSDYPYSAGETDIEAARQVMEDAGYGPDNQYEIQWTQYNNNAWEEMASILRDQLASAHINMQIQKADFSTLLERGRNGQLEAYTLGWIADWPAPDNFLQLLNPPQTDTSEQGPISYVNWTAENGDAYQQATDAYQQVVDNPAPTDEAQQVRNEAYVDIETANWEDVAMLPVYNQKEETFWYDTVEIEPFGGMGPSRQKLNNVTLNR
C441_RS14955 DNA-binding protein 58182 58601 MNARAVTVSEEYLARLEHGADWREEIEEFCARKGIESAWFNAMGAVQDAELWFYDQTDQEYQSVTFDEPLEVAACVGNVALLDGEPFAHTHAILSRRSGQALAGHLDSATVFAGELNLRAFEEPLERDHDAVTDLDLWL
C441_RS14960 DNA polymerase II large subunit 58604 63700 MREEETRYFRRIEARLDEAFELAEAAKATGYDPKTEVEIPVAKDMADRVENILGIDGVAERVRELEGEMSREEAALELVTDFVDGNVGDYDSREGKVEGAVRTAVALLTEGVVAAPIEGIDRVEILENDDGTEFVNVYYAGPIRSAGGTAQALSVLVADYARSLLDIDEYKARSDEVERYVEEVNLYDKETGLQYSPKDKESRFIAENMPIMLDGEATGDEEVSGYRDLERVDTNSARGGMCLVMAEGIALKAPKIQRYTRQLDEVDWPWLQDLIDGTIGKDDGKAAKADDAGDDADEADEADADPDADADDAESDAPDGPTRVEPATKFLRDLIAGRPVFGHPSAPGGFRLRYGRARNHGFATAGVHPATMHIVDDFIATGTQIKTERPGKAGGVVPVDSIEGPTVRLANGDVRRIDDPEEAEELQNGVEKILDLGEYLVNFGEFVENNHPLAPASYVFEWWIQDFEATEANVQALRDDPAVDLEDPTVEDALSWAAEFDAPLHPVYTYLWHDISVERFDALADAVAAGEVVAAEADGGTTAPLEHDNEPERGLEGTLVLDNTPEIREALEHLLVAHRQTDETLRVPVWRPLARSLGLTDDRERTWELDDLSEHARTWDGGDNAVEAVNEVAPFTVRERAPTRIGNRMGRPEKSERRDLSPAVHTLFPIGEAGGSQRDVGDAARHRGESGKRGQISVRLGRRKCPDCGAFGFKSKCPDCGGHTEPHYECDDCGTVLEPDESGRVYCERCDWDVESAEWQDIDLNTEYRDALERVGERESSFQILKGVKGLTSANKTPEPIEKGVLRAKHDVSSFKDGTVRYDMTDLPVTAVRPEELDVTADHFRELGYETDIDGEPLRFDDQLVELKVQDIVLSNGAAQHMMQTADFVDDLLEQFYGLERFYEIEERDDLIGELVFGMAPHTSAAVVGRVVGFTTAAVGYAHPYFHAAKRRNCFHPETKVWYRDETDRWRYDEVRTLVEERLDEPETDDFGTLVQELDGDVYVPSLTRSGRETIKPVEAVSKHVAQNHLVRIETRGGRELTVTPDHTVIRADKGGFTRIPAHELDEGDALPSPKRVDIDADPAQFDLLAEFLHGESIPAEDLMVRGLGADRIRSLFDEHTEANGYLKPVAERLGRSESTVYNWVSRDSVPASVFVEVLGDVETVVDVLPRELSLGVRRDTATVSRVLDIDESFGTVLGYYAAEGFTRASHGNLYQTTICIPDELARKRILDTVSEALGVDAFEENEWKVTVSSRLVQSLFADVIGCGSRAEDKRVPDSVLTGPEPVLRSFLSAYFSGDGSASSDRVEIRAHTVSDDLKRDLVAALKRFGIASKTYSERRTPQTGAVAEFYDGDEVPTFDSWVLKLTSENAVRFAEEVGFHPPRKSEALASALDDTGVRSQRLFSDGGDTWLDEVVSVEYLESDIDHTYSLTVEDTNSLVANDLHVAQCDGDEDCVMLLMDGLLNFSKEYLPDKRGGQMDAPLVMSSRIDPSEIDDEAHNMDIVRQYPREFYEATLRMEDPDDWEDEVTIAEEYLGTDREYTGFDHTHDTTDIAAGPDLSAYKTLGSMMDKMDAQLFLARKLRAVDETDVAERVIEYHFLPDLIGNLRAFSRQETRCLDCGEKYRRMPLSGDCRECGGRVNLTVHQGSVNKYMDTAIQVAEEFDCRDYTKQRLEVLEKSLESVFENDKNKQSGIADFM
C441_RS14965 hypothetical protein 66642 66175 MTTYAVLSPPDGSTLHSRERRFAADLCDQLLYLTPGSLDEEPDGEIGDLTPTELANAVYTSTRGRDWNEDDEMYILVEPALLDIETRNAFRRSLRVVFQRFDPEVCFPYDVLEDVGKRAAWLTDSLEAGEIVRPGGVRRQSVTGVDEGQTDLTSF
C441_RS14970 hypothetical protein 67367 66699 MRVRQRQLLYGTVFGLATFLVGWLITYVLTPSDLLTEFPRWKVTLWVFLSAHFVSISGLQLGGLSSAFTQVDLITQIPTLRSLRVVPILLTALGGVMMVEAMNYTTRFKYLIQNSGALLTGYLAAGLLAFVISEAQPGVALIIVLAVLLAGGAYIGGTVTQRFTAGLPVFAVTSLGGVVLIGLLVVLGGLVVLQSIAPLVGVSLVGVTVGAVLAWTARNVPS
C441_RS14975 MarR family transcriptional regulator 67657 67433 MPHELNRADKRILRALENGVRNPSWLAEQLDYSRQYVHQRLQLLVAAEYVNNLGHGLYELEELPNGLGQDSKDQ
C441_RS19140 HNH endonuclease 67761 68234 MGTIVWIMATASTKSMIERPFPKVKDRPTSDVPAIARDGESGMYEVIECPECGSEAFAYLQLRYACRECKVGSRILQKEVNYGGRWERVREWILERDNDQCQRCGDSSVGLQVHHKEKLVWFESIQEANTPENLISLCEDCHGEVEEQPEMACLPVM
C441_RS14980 hypothetical protein 68904 70679 MLRGIEYRSHIVSELSDRPNTRRELGYGTFNGTRRCPHQTNFNEAWNDRFGPDLRAFVTEVVAYVREWAYDNNRLIETTELLVPDERDETQELTKDQIRRIVNEMVGYITPNYSFNREGGVSHDKNVFFKLLAHCALTSSSVHGGGDTFEWQQIDDDDPPHGRTVLDLVKSLSAGEMLSMYADAIDGVVEALDKQVGLYDSPVPLAIDTTTIESDAKWKRVTIPSIADPEYHGWSDRKKKDTYAYIEEEGIECLVGEDPEKIRKARFHDDEKREAAERINDVVRYVHGTKSGDEFKYAWEFGAAAIAHPSCPLIYAMEPLERKDELEDHVERFIERGQELVTVSEVYMDSAYSQVAVQQLFHYGNAFRTEDQFNIPYVMNIREEDSVKKAVLQEKPGRGDITTRMDESEGDISVVKNYRQHSQEHGYGATTLVGLPKRDYENGGVVDVKDPVTDRVAFSTSRTDIDAENAIELLRGPDSESFSDPVRRRKAKHERGYTNRWLIEIGFEKTKDFLAFTKSGHGGVRLFYVLYASLLFDVWMTVDRAIKHEEYELGFDVETYDEDDSVVYATSPRISADVLSTIVANYLRPVT
C441_RS14985 hypothetical protein 70999 71286 MADETEPTVGMTVYAEDGSKLGSIRGFDEDGFYVTIREGLAGMSIEHERAGHEFGEAELMWRCSDCGEMGDLDELPDTCPNCGAEREQLYYWTED
C441_RS14990 histidinol dehydrogenase 71423 72691 MDIQSLDALDESELASVLDRDAGIDEIRADVRGIVEEVRDRGDDALREFSEKFDGVEVDTLDITARTERAAEAIEPELREAIEAAIDNVRAFHERQVRTDWTDSFGGARLGRQFRPLRRIGAYVPGGSATYPSTSIMTVVPAKVAGVEEVVVVTPPAEELNPVTLAAIGLAGADEVYSVGGAQAIGALAYGTETIPPVEKVVGPGNRWVAAAKAEVRGDVAIDMIAGPTEVLVVADETASPRLVAAELVGQAEHDPHSSVVAVTPDESLARAVADEVEAQVPDRARPEIIREALANETSGVFVAPSMDEVTAFAEAYAVEHLVVMTADDEATVDAIDSAGSVFVGEYSPVAAGDYASGTNHVLPTGAGAKTTGGLSVDSFVRSRTVQHLDRDSLDSLSDTIVTLAELEGLEAHAESVRKRFE
C441_RS14995 SDR family oxidoreductase 72773 73552 MSVLDEFTLHGKTAIVTGSSKGIGRALAVALAEAGADICLVNRSEREGRLAAEEIAAETGVETLAVPADVTDEDDVEAMVEATLDAFGSIDILVNNAGIARTAPAHEMSLETWNEVLQTNLTGVFLCTKHAGKAMIDGGGGTIVNMASMSAFVANYPQEEVVYHASKGGVVSFTRQLASEWAKYDVRANAMAPGYIRTEMVDELLAENPDMESAWLSEMLMEEMAPPEDLGGTVVYLASDASSYMTGETVVIDGGYTVR
C441_RS15005 extracellular solute-binding protein 76044 74863 MRKHTNSQPSARATGRSRRRFISAAGATALAGLAGCTGGGGSDGDSGGDASSGTTTGTEDTTPPKPDSITVRAWGGAWQENLDKHIAQAFTDETGIEVEYDNSTEEEMQGKIRTAITQDRTPPVNVNWSLSKTSYRSYQMGLMEPLDTEVAPNLEGLLGASKPEVEDAEWPFVNLYSYTYALTYNTDLVSSEPTSWSVWWDDEWENSIGLYPGGHGITPLIAKMTGTELGPVEEMTPVWDEYTALKPNVGTIGDDSHLTQNLRQGEVAMSVMLPANIINAQDDGAPVDYTIPEEGARAGRDTMWTPTNQDERYVYWGQKFIDTAANAENLGPWSTELGVAPLHADATIPDWMRDSVAFPTSEEQFNQMITVPLDLLIEHQAAWESRVNEMMQA
C441_RS15010 DNA-binding protein 76432 77082 MYEATLRIDHQSPYADVTLGKDVHVEMWCNQYCDLVYVSGSDIETPVETFDDTVGIQEIVYKDEEAVLITDSCLLDYRDNLLEGYLQPHQCLSLPPLTYSDVALFARVLALTEEQLSGVYHSISDSHRVTVEAKREIRSIAPDIPILMLDSALPTLSDGQQRALSLAVEMGYYEIPRGSTTGEIADEMGVSRRTFEEHLRRAENKIIKNLLRYFLT
C441_RS15015 ABC transporter ATP-binding protein 77181 78272 MTDSSETVVRIDGVTKEYGTLRAVDDVTIPITDGEFITILGPSGAGKTTLLHMIAGFQKPTSGEIYIDGRPVSDEPPYERDIGLVFQSHALFPHMTVKENIAFPLKMRREHTDDIDRKVEDVLELVRLPVDYADKPVDELSGGQQQRVAFARAIVYEPTLLLLDEPLSSLDKKLREEMRAELTRIHEETELTIVHVTHNQTEALSMADRIAVINGGGLEQLDSAQDIYASPNTPFVADFIGNTTLLSGSVAGVDGDVATVSVADGTVRVPAATVDGLSDVVVAIRAEQVALDADCDNTYPATVEQVSFEGDRTQYHVFVPAFGETVRLIDQSADARVVHDRGDSVTVGWNVGDGFVYPDEDDE
C441_RS15020 C-terminal binding protein 79304 78345 MLDPDWFGDVESERAHFERLLGSSVVVDAVDCTDAEIPDAVGAADLLLSHDTGVSAESMDATGCSVVSRYATGIDGIDVEAATDRGVRVTRVPTYCNDEVGMHAVSLALALVRGLPTYDAAAADGRWHWADAMPIRAVSELTFGLLAFGNKGRSAGEKALALGFDVCAYDPYVDDGDIEAAGVRPVGFEALLAEADVLSIHAPLTDETADMIDADAIDALDDDAIVVNTGRGRVVDEGALLDALRGDRLRGAGLDVLRAEPPNPENPLLDREDVIVTPHAAWYSTRTAEKLRRLGTETAVAAYRGDAVEGLVNPEALDR
C441_RS15025 pyridoxal phosphate-dependent aminotransferase 80547 79396 MMKRPGLSGLSTRMPQSGIREVFDAAQSYDDLADLSIGEPDFATPEPIAAAVSEAVGTGASSYTETVGRPELRAALAEKLAVENGIDAAPESEIIVTPGAMGALFAAVNVLCDPGDEVLVPEPYWPNYHGHVASANARLAPVSTDEAFVPTPDAVEAAVSDDTTAILLNSPGNPTGAVIPPDRLRALDDVAAEHGLWVIADETYEDLVYDGATHHSLASDGDRFDRIVTVHSFSKSYAMTGWRIGYASAPERVIDAMRVLQEHTVSSVPEPAQVAASAALANRYVVGELREAFAERRRLVLDRLAAIDGIDPGTPRGAFYVFADVSAHTTDSRAFVERLMSEAGVASVPGAVFGAAGEGYVRFSYAADVETLTTAMDRLERAL
C441_RS15030 RidA family protein 81045 80602 MTRYAINPPELKDARSIGYNHAIIDSQQFYMAGQVAMDADSNVVGDDIETQARKAYENVGILLDAIGMTYADVAKVTTHIVDAKEHYYDGYKEVYLETFEEPYPCHTVLGHDQLANEDYLVEVEVELPLTEADVDAIEPDGDVVREL
C441_RS15035 Zn-dependent hydrolase 82354 81107 MNIDTDRLRRDLQENAAFGDVDADGGVGRTVLTGSEADRGVRERFVERLEAAGLDVRVDAVGNIFGRWTPPSCDPSAAPVAFGSHLDSVPRGGIFDGPLGTYGALEGVRAMQEADVAPARPVEVVSFTEEEGGRFGVGTLGSSVATGKMGVDEALALEDDDGVTLRDHLDRVGFAGSDAVDPAEWDAWMELHIEQGTRLTGANAGVGVVDSITGITNCEVAVTGEADHAGSTPMYERSDALAAASEFVLDLERAAEELATTSEAAVGTAGKGTIEPNARNIVPAEVRLQLDIRDVEHETMDRLVERCRTSLARLERNRGVETSMSRYRDSPPSRMADRCLAAADAAAATRSVDALRLHSAAMHDTANLAAVTDAGLLFAPSRDGVSHSPREWTDWDDCATATGVLAETVRSLASD
C441_RS15040 lactate utilization protein B/C 82469 82975 MSDTDTQLFASSLAAIDVPLTRTDTAGFASTIDSLVTEPAVGVPLHLEGVSLDDTVVQTPPTPRLLREAETGVTGVGHAIAEHGTLVIDSDDAGTEPVSLYPPTHLAVVRESDILPDVASATASVGERFAAGGSAVFATGVSSTGDMGALVEGVHGPKHVEVVVLTDQ
C441_RS15045 4Fe-4S dicluster domain-containing protein 82972 85182 MSAQSERRRKARKLRELLDAEGDAVYENVTHLNQGRYDAIERFDEFETLRTEARSIKEAAIENLPSLIETLRESVEANGGSVYVADDAADANRYIRDVAEANEAETLVKSKSMTTEEIELNESLEAAGVDVWETDLGEFVLQVADESPSHIVGPSLHKSREEVAALFNDVFDPEVPFETAEELTEFARDYLGDRIREADVGMTGANFVLADSGTITIVTNEGNARKSAVTPDTHVAVTGVEKILPSAAELEPFVNLISRAATGQDISQYVSMLTPPVETPTIDFENPDEPLGSGDAEREFHLVLIDNGRLAMRDDDHLRETLYCIRCGACANSCGNFQHVGGHAFGGETYTGGIATGWEAGVEGLDSAAEFNDLCTGCSRCVNACPVKIDIPWINTVVRDRLNGEATPSDLEFLVEGLTPDAEGTVTPQKRLFGNFGTLARLGSATAPLSNWVADLGPVRSLMDRYLGVAPDRELPTFQRETLVDWFQSRRPRAPNATATRKVVLYPDAYTNYVLVDRGKAAVRTLEALGAHVELATPIESGRAPLSQGMVATATRQAEAVASDLAPYLDSSFDVVVVEPSDLAMMRREYGKLLDDETHARLSEHSYDVMEYVFGLLSNGADPAALAAPDAPVAYHSHCQQRTLGVETYTEAVLDDLGFDVVTSDVECCGMAGSFGYKSEYYELSMDVGAELADQFADAGDRVVVASGTSCTEQLTDLLARDVTHPVELVAPDEAP
C441_RS15050 carbohydrate kinase family protein 85285 86187 MADVLVLGDAIVDAVFAGLDRYPDRGEELVAPSFELRPGGSAGYASLGLAALGSRPAAVTYVGDDVLSTHWRTFLDARGVDTSRVVEQPDATVSVAAGFLFESDRSFVTHRGAMTDGAIPAVTPDGFDALFVAGLSQAPYLWGDDAVAIAREFSARGRPVFLDTNWSPGDWLSTAEALLPHVDTLFANDAEARRLSGCDDLVAAAETLVADGADACVVKAGDRGCLVADGQSEATWVETEPAASVDACGAGDFFDAGFIHARFAGADRPAAAAVANRCARAAITNFELRAKLDAIADLSE
C441_RS15055 ABC transporter permease 87039 86230 MSGLLRRFGTDHPYLAAFTLFEFVFLILPSVIILVVSLGANQIISFPPTELSLRWYASLIQEPGYLEPFFNSVMVAVFCTALAIPIGVMTALGLNRYDVRFEHGIQIYLLLPFTVPLVVSGFILLIIFGRLGWIAELWAVGLALTIINIPFMIWSVSSSVNAFDSTLEDAAQSLGAEEIQTFRYVTLPALMPGVISGSFLMFMLALNEFIVSLIITTTATETLPVAIYGAIRGNISPQIAAVASIYVIIAIVAIVVADRVVGLERFLHS
C441_RS15060 ABC transporter permease 87941 87036 MDGLIAAKNRLPLDTSSVADRFGYVLIAPSILLVSFLAVGMVMLAWYSVLTYDTIEIFVYELTLDNWMRLVETGAFHTVFFRTLLYSALVTVGSVGLAVPYAYLVVRTRRPVFRYVLLFGLFVPFFTGVIIRAYGWLIVLGKNGLLNWGLNALGIETVSIIGTPAAIVVGLLQYLMPFAVLMLTPAIASIDSDLERAAQNCGANNWETFRHVVLPLSRPGITAATIVVFTLSMANYAIPNLLGGGTNSFVANFIYSKVFDMMNYPFAAVLCIILVLVASVFVFAVFKLYGTGTLGVEAGEQ
C441_RS15065 ketopantoate reductase family protein 88284 89180 MRILVFGAGSLGTLVGGLLAPVHDVTLVARDPHAARVSSAGLDIVGAESAHVSPAATTTETGHDADLALVTVKSFDTAAAADALADCDVGAVLSLQNGLTEETLASRLDAPVLAGTATYGARLVEPGRIECTGVGRIVLGALDGGPDPLAERVGKAFRDAGLNALVATDMPRRRWEKLAVNAGINAVTALSRVENGALAGDDASELAHRAARETARVARLERVSLPNRVAREALDRVVEKTAANRSSMLQDVAAEKRTEVDAINGAVVDIAAEHGFEVPTNRTLAALLRAWERGAGLR
C441_RS15070 sulfatase 90761 89220 MTAESPENVLFVVMDTVRKDHLTPYGHDRPTTPGLDRFADEATVFEQAVAPAPWTLPVHASLFTGMYPSQHGADQENPYLEGATTLAETLSAAGYDTACYSSNAWITPYTHLTDGFAEQDNFFEVMPGEFLSGPLAKAWKTMNDSDALRTLADKLVSLGNTAHEYLAGGEGADSKTPAVIDQTIDFVDGSEEFFAFVNLMDAHLPYHPPEEYKERFAPGVDSTEVCQNSKEYNAGAHEIGDDEWEDIRGLYDAEIAHIDDQLTRLFDHLKETDRWDDTMVVVCADHGELHGEHGLYGHEFCLYDPLINVPLMVKHPGLDADRRDDQVELVDLYHTVLDSLDVEGGEPANPGDDAVGLDRTRSLLSADYREFAQASNDDPGQRRDGEYAFVEYSRPVVELKQLEEKASSAGITLPEDSRFYSRMRAARRTDAKYVRIDRIPDEAYRIDADPEESENVVDSDDEAIAETEAALTEFEAAIGGAWTDALDTDVSDDSVDQMDDEAQDRLRDLGYLE
C441_RS15075 hypothetical protein 91216 90839 MSTETQDGEDDLKERVVNFLRRNFPQIQMHGGSAAIRDLDRETGEVTVLLGGACSGCGISPMTIQAIKTRMVKEIPEINEVHADTGMGGDGGMGGDSRGGSPSFPGDTSDDTTDIEDDQGPQAPF
C441_RS15080 hypothetical protein 91578 91261 MTDFDPEKFEDKYANYFPELQRAYKNAFETMNDEFDSSLIHGVDQQILNESEPFYEDGDFSVRLPENPAERLTAVEIDDDELEAVLDRYVDEIEGELRRVFGVEA
C441_RS15085 hypothetical protein 91957 91736 MATANSNEAEYDYPVGYEPEVQTETVELDRRTVERLDALREDDESYDELITELASIFAASELSAARVDSPLIE
C441_RS15095 polyphosphate kinase 2 93191 94021 MSTDNPRYNDEGVLKKKQYKRELNRLQEELVKLQYWIKEHDLRVCVVFEGRDAAGKGGVIKRITRRLNPRVARVVALGKPTERERGQWYFQRYVEQLPTEGEMVLFDRSWYNRAGVERVMGFCTDEEYEEFLRTCPEFERMLVRSGIVLVKYWFSISDEEQERRFQKRNEDPKRRWKLSPMDLEARSRWADYSEAKDAMFEHTDIDEAPWHVVNADVKRHARLNCISHLLDQIDYEDLTPDPIELPSRQDDSGYERPPIDSQNWVPARFGSNPVDD
C441_RS15100 DHH family phosphoesterase 95448 94018 MDGPVPELTEHAAACAARLRDADRVLLASHIDADGLTSAGIASTALERAEIPFETVFCRQLDAAAVADIAATDYDTVLFTDFGSGQLDIIADHEAAGDFHPVIADHHQPAEGVETDHHLNPLRFGLNGASELSGAGASYVLARALEPEGRDNRDLAALAVVGAVGDMQDTNGELRGANAGIVDEGVEAGVVEEGTDLRIYGRQTRALPKLLQYATDVRIPGISNNEAGAIEFLTELDVPCRDDDGEWKRWVDLTGDERQTLASALMRRAIASGVPASRIDDLVGTTYVLSREREGTELRDVSEFSTLLNATARYDRADVGLAVCLGERDAALDRARRLLRNHRKNLSNGLQWVKEHGVRVEDNLQWFDAGDEIRETIVGIVAGMAVGTDATRSGIPVLAFADTEGGEVKVSSRGSYVMVRDGLDLSAVMREASRAVGGDGGGHDVAAGATIPKGEVEAFIAEADRIVGEQLAKNPD
C441_RS15105 MFS transporter 96804 95611 MTGSEAEDGTRDGSWRAVGSVAGWQTAASLCYYAIFAATGFVRDAFSVSESLVGLFLTAGLLGYTVFLFPSGAAVDGYGEKPVMVVGLLALSVALVGVTFAPPSYALLLVTVALLGAAYSTAMPASNRGIVAAAPAGSKNLAMGLKQVGVTVGSGASSLVVTGVAVVAAWQVGFWAIAVFAAGYALLFATRYRGNPGTGRLERPRLAGLGGNRAYVALVAAALFIGASIFSMLGYTVLYVQDVVGTGPALAGGVLAATQVTGSVGRIGAGSLADRLGGPRGAATVALVQLAGSVALFSLLVGAGGSFALTIAVFVALGLTIHGSTGVFYSCLSGVVDDGDIGAATAGGQTALNVGGLVAPPLFGFVVESTGYGAGWALVAGSTVLATVLLFVVRRRI
C441_RS15110 uroporphyrinogen-III synthase 97549 96806 MSQQVRAAVFRPDDDRIERAVELLDSLGATPVPDPMLAIEPTGATPEAGEFVVLTSKTGVELLDEAGWEPGDAVLCCIGPATADAAREAGWTVDRVPDEYTSAGLVDHLASDVDGATVEVARSDHGSAVLTDGLRDAGADVHETVLYRLVRPEGAGKSAELAAAGDLEAALFTSSLTVEHFLDAAAERGVREEAIAGLNDAVVGAIGPPTRDTAESHGIDVTVVPDEATFEALAVAAVETAAPTYHE
C441_RS15115 uroporphyrinogen-III C-methyltransferase 98400 97546 MTGDDRSNGDSGESAEGVGTVHLVGSGPGDPELLTMKAARLIDEADVVLHDKLPGPEILGMIPAEKREDVGKRAGGEWTPQEYTNNRLVELAREGKDVVRLKGGDPFVFGRGGEEMEHLADNGIPFEVVPGITSAVAGPGSAGIPVTHRDHVSSVSFVTGHEDPTKDESAVDWDALAATGGTLVVLMGVGKLPLYTAELREAGVDGDTPVALIERATWPDMRVATGTLDTIVDVRDEADIEPPAITVIGEVAATRDRVKQFLQGGGADAVAEADATAGGSGGDE
C441_RS15120 hydroxymethylbilane synthase 99493 98402 MTTTLRLATRGSALARAQAASVQGALASRRLDVELVEVETTGDRIQDELIHRLGKTGAFVRALDERVLDGEVDAAVHSMKDMPTEKPADLVVAGVPERAPAGDVLVTAEGHDIDDLPEGAVVGTSSLRRQAQLLNYREDLEVEPLRGNVDTRIEKLLATHLQREHEARVENDKERQEKKGKTDHDEAFDETADEWFEGLTELERQALGRKVETEYDAIVLAEAGLKRSGLTEKVNYERLPRTTFVPAPGQGAIAVTAADAEVIDLIRDKLDHPRTRVETTVERTILAELGGGCIAPVGVHALLQGEHVHVDVQVLSTDGSESIKTSRDLPVGNHAHAAREFAAALRDRGAGQLVEAAREEAEE
C441_RS15125 hypothetical protein 99600 99989 MPAFDSAVRQVGDLVVVALLLFGLTSVVAPLDVFLSSVGVEPPWFAGLAAAALVGLVLLLARPLRLRLVARVWGVGLVVTALWIPLLILLELRGNPVGVLVSWAVCLGVGVALTYPPLWRAAEARLRVE
C441_RS15130 hypothetical protein 100035 100310 MRERTPVARAGSPRRSACRTVFGLRGDDSNVNRSLHPALVFGVAFVVALPLGFIFAPDPTGVAPLFLTAGLTVVIGLPAYLGLSRATGPES
C441_RS15135 glutamate-1-semialdehyde 2,1-aminomutase 101682 100345 MNHDESRALYDRALSVLAGGVNSSVRATQPYPFFVERGDGAHVIDADGNRYVDYVMGYGPLLYGHDAPEPVQSAVQKHAAAGPMYGAPTEIEVEHAEFVERHVPSVEMLRFVNSGTEATVSAVRLARGVTGRDKIVVMQGGYHGAQESTLVEGGPGGAKPSTPGIPSSFADHTIPVPFNDEETVREVFEEHGEDIAAVLTEPILANTGIVHPVDGYLEALRDVTEDHGALLIFDEVITGFRVGGLQCAQGKFGVTPDLTTFGKIIGGGFPVGAVGGKAELVEQFTPGGDVFQSGTFSGHPVTMAAGHEYLKYAAENDVYGHVNRLGEKLRAGITDILEDQAPEYTVVGTDSMFKTVFTRHGAAERADACADGCAQVESCPNYDACPKTGADVSNAETDRWERVFWQEMKDHGVFLTANQFESQFVSYAHTDEDIEETLEAYKAAL
C441_RS15140 CPBP family intramembrane metalloprotease 102528 101821 MVTVAVLGGLAVALFGMAAVGLLADRYLAEPDGLRADLLVSDANKWAVFALLCGYVLVVEGRPLSSMTGRSLAPLAFVAVVGGGVFVLFVANAVTTPVFDRLGVGGLDEGMAGLASLSVRHRLFVAGTAGITEEVLFHGYAIERLLELTGSPLLAGGVSFAAFTASHAVGWERGAVARIAVPALLTTVMYLLVRDVVALVCIHALNDAVGLLLAGSVEDAGDAEGADAAADGTAR
C441_RS15145 P-II family nitrogen regulator 103054 102683 MSDSDLPNDGDIKMVMAVIRPDKLSDVKTSLAEIGAPSLTVTNVSGRGSQPAKKSQWRGEEYTVDLHQKVKIECVVADIPADDVVEAIADAAQTGEKGDGKVFTLPVESAVQVRTGKTGRDAV
C441_RS15150 ammonium transporter 104428 103058 MLTPLQTDLASVVEGVNLVWVLTVTFLIFFMHAGFAMLEAGQVRAKNVANQLTKNLLTWSIGVIVFFLLGAAVSSIVAAFTGGPATTISGAFMSLYAPEASATTAWVDWLFGAVFAMTAATIVSGAVAGRARLRAYLTYTILIAGVIYPVVVGFTWGGGFLSALGFHDFAGGMIVHGMGGIAGLTAAWIIGPRMDRFKPDGTANVIPGHSITFAVLGTLILAFGWYGFNVGTAAAPLAYADGGVSLGSFAYVGRVALVTTLGMAAGAIGAGGVAMYKTGKVDTLYVANGVLAGLVGITAIANDIVWPGALVVGLLAGAQLPVVFEFVEKRLQIDDVCAVFPVHGSAGVLGTLLYPVFAVPLWHDGASFVSLAVPQVIGVGVIAIWTFVATAVVFGGFRAVGQVRVSSDHEREGLDTAEHGVDTYPEFGSPDADTGIRTDGSGVPSGDGFMTTGRED
C441_RS15155 P-II family nitrogen regulator 105173 104808 MSDEADEQGIKMVMAIIRPDKLSDVKTSLAEAGAPSLTVTNVSGRGSQPAKKGQWRGEEYTVDLHQKVKVECVVADIPADDVVEAIADAAHTGEKGDGKIFVLPVESAVQVRTGKEGKPAV
C441_RS15160 ammonium transporter 106588 105170 MGWSVVIPLQVDPGVIAQGVNYVWILVVSFLIFFMQPGFALLEAGQVRAKNVGNVLMKNMTDWALGVLVYFLVGAGVATIVGGLTSSGGFDVAAAFSYIGDSGAWIDWLFGAVFAMTAATIVSGAVAERMDFRAYIVFAATITGFIYPVVQGITWSGGLLSSGGYIGAALGVGYLDFAGATVVHMCGGVAGLVGAKMVGPRKGRFDANGNSQPIPGHSMLLAVLGTLILAFGWYGFNVGTQATVLATTESGGLEFMGAALGRVALVTTLGMGAGAVAAMIVSTSYQGKPDPLWMANGLLAGLVAVTGAVPHVTWWGGLILGALGGAIVLPAYRWTVDSLKIDDVCGVFAVHGVAGAVGTALIPVFAVSGFSGTQLVMQVVGVVVIAAWTIIASAIVFAIADAVFGLRVSEEEEEEGLDAGEHGVSVYPEFVGDAGPDGALGSPTATDGGSDVRTDGGVVKDETVEANGGDSQ
C441_RS15165 porphobilinogen synthase 107938 106955 MEFTDRPRRLRTDGIRPLVSETTLDATDLVAPVFVDATTDERVPIESMPGHERVPVAEAADRVAEVREAGVEAVIVFGIPESKDPEGSRAYARNGVVQEAVRAITAETDAYVITDVCLCEYTDHGHCGVLEDHAEDDPTLTVENGPTLDLLAETAVSHAEAGADMVAPSASMDGMVGAIRAALDDAGHDEIPIMSYAAKYESAFYGPFRDAADGAPAFGDRRHYQMDPANAREALREVALDVEQGADVLMVKPGLPYLDIVRAIREGYDHPVAAYNVSGEYAMLHAAAEKGWLDLDAVARESLLSMKRAGADLIITYFAERVARQLD
C441_RS15170 hypothetical protein 108042 108554 MSAFTAQLPHFLRDLLASDLALVALFFVFVLEGAMLLYVAPSELLVPGALALVGDRLLLPILGVAVLGATVGQVGLFLVAKRGGREYLLSRSWFRVSEDSLDRFDGWFDRWGPVVVPLSNAMLFTRGMLTVPAGLSGMSAKRFLALSALGTVAFESALAALYVFGGQVLA
C441_RS15175 PHP domain-containing protein 109669 108875 MDDSPPVADLHLHTTASDGRLTIDALPDAARAGGVDVVAVTDHDRYHPDLDAPVVERDGVTVVRGIELRVDAGDRRVDLLGYGLRERPALVEVVERLQRNRVERAREIVAAVEAETGVDLDIDIHDGVGRPHVARAVDASEAPYDYQGAFDELIGADGPCYVARELPTFDDGVSALRDACDVVGLAHPFRYRDPAGALELCADLDAVERYYPYGFDVDCDLVEEAIERYGLLATGGSDAHDETLGRAGPPETDFARFAAAVDGL
C441_RS15180 hypothetical protein 109802 110086 MRLNGLGDAIDRHEYPISSTDFAQRYGDKVIELQNGQETVAQILARLGDETYTCPQDVRDALFTGVGHEAIGRRYYSDRDPSPLGENGPEMVSF
C441_RS15185 hypothetical protein 111239 110232 MARPLRFRYAPGRWDESRITRDVFQPLDANLGAEMGAPWYAPPDGYEARRFDMDNGDTALFAWSDDHAYWIGNTETPSSLWRTDKEGFAEAPFEVSRWAQRELIAELFDQSPWLKPYPHLSWFFLPVFLSKDGRETTREFFRDHAAGFPDATREEALAFYESFFATGVLDEYREVMAGKLGTSEYFDPIRMAAAMGEFDVAYLLDEAGYDITPEIAVTTGHSIDFRAENTPAGGALIEVTRPLPPNRRSVSNPIAAIRDTAQTKTNGEGQLAEHGGGVTLFVDCSSFPDDDWAAIMGERPDVRHRPAVVFRLRPSGEVEGYSKGSVPVDLPWLSD
C441_RS15190 hypothetical protein 112309 111614 MNDAPEYGETWVYESIVGALPGVNIGEAGAIALQIAVFEVAVLVFAWVYDLWAAAVAGTAAIVVAAAGSVVMLRMGEWTRAARVPAPYRRLLFGSSIEVVLGVLAFVALVTHLFVFDPRQPGTPLMTTLFGPEPPVVVVYLTLLVLWDLCYRIGTSWWAAVVGLWGAVRYRFDGPTAKTVRRVNLLNVGFAVAQVLLLPFVADQPVLLLAVGGHVVAVTVVSVTAAALTRT
C441_RS15195 NAD-dependent epimerase/dehydratase family protein 113333 112395 MRVLVTGATGFVGRHLVPLLLDAGHDVVVFVRDAARYDGPPGADVVEGDIFEPATLDPAMAGVDAAYYLIHSMHAGGDFEARDRLAARNFVDAAESAGVGRIVYLGGLGEDRDRLSAHLRSRREVEHILASGAPALTTLRAAIIVGAGSASFEMVRQLATRLPVMVTPQWVETRCQPIAIADVVAYLAGVLDHPETAGETYEIGGPDVFTYREMLQRVGAQLGHAPRIVPVPVLTPRLSSYWIGLVTDVDTGVARPLIEGLKNPVVVGDDRIRDIVEVEETPFDAAVERALREERGTRENAAEPEPEPTTPT
C441_RS15200 zinc/iron-chelating domain-containing protein 114329 113445 MRVDCEGCAGCCIDWRPVAPVALDHERRGPRAPLDDTYNLVPLTRDEIRDFVAAGLGDSLTPRLWEAPPGEGVEIDGVELAAIDGNPAFFVGMRKPPKPVAPFGLERTWLRACAFLDPETLQCRIHDTEFYPDECAEYPGHNLVLEQETECERVERHHGGERLLDDAAPDDLHGLLLGPHALGAKLFVHPEPERLAGTIEHLETRDLTPEDRAEFVGVAVGSHPGSTEVDEERASRARAKTLESESWAGETVAAWNAVAGRLGSAAADAPDPDEVEVARGAPETPGWDAVRGGD
C441_RS15205 hypothetical protein 114447 114650 MTSEPCDACGKGVRIAGGIGDLWNFPTSSSGGMTLELVDGSEHFLCFDCMERLPGDREPTAEDVAAL
C441_RS15210 hypothetical protein 114879 115085 MDSEPWADYAKMSTESGGGILRRRWATLSKKFSSLSTWEKLSRAREVLKTAYWLLKVLIALSTLVVLL
C441_RS15215 ATP-dependent DNA helicase 115419 117329 MDPSRIIDSFPAPSFRGAQERALRDIRDAFADGNDVVLVRAPTGSGKSLLARAIAGSAATVDETSPAQATDAYYTTPQVSQLDDVAEDDLLSDLKIIRGKSNYNCIVCGEEDTPVDRAPCARKRGFDCSVRHRCPYFSDRAIASNRQIAAMTLAYFMQTAGSDVFRKRDVVVVDEAHGLAEWAEMYATIDLKPRTVPVWDDIGVPDVAAAGDPVERASRFAEALVGVCKREKDELLTKSELTPEEAARRDRLQELIGELQWFVEDYRDPQSPTTWVVDQHDGEGSPIAIKPLDPAKYLKHTVWDRGNKFALLSATILNKAAFCRSVGLDPSKVALVDVEHTFPVENRPLYDVTQGKMTYEHRDETLPKIARLAVRLMAKHPDEKGLIHCHSYAIQAELRRRLAEMGLGNRVRGHDRDDRNVELETWKATDRPKVFLSVKMEEALDLKGDLCRWQVLCKAPYLNTNDSRVARRLEDGQWAWYRRAALRTVIQACGRVVRAPDDYGDTYVADSSLLQLFDKTRTDMPDWFAAQVDRMSEPDLPEFDPVAAGGRGGNAGDAVGGTANLGGSTSRSSGASGAGSGSRRDSTRGSGSSGSRAGGSGGASAGGSGGRKSGGSADGSDDGGRSNHPLSDVWGE
C441_RS15220 hypothetical protein 117736 117930 MHTNALATAMTLYRSGALTLSQAAARSGYSEDDLLLALQRHGVPVHEDDAPAASLSADRPASAD
C441_RS15225 nonsense mediated mRNA decay protein 119197 118067 MSESRDFCPRCGDPVPERPEPLPGMPRERDDVLCDSCYFEDFDLVDAPDEVQVRVCAQCGAIHRGNRWVDVGARDYTDIAIEEVSNSLGVHLNAEEVRWGVEPEQVDENNIRMHCHFSGVVRGTPLEESIVVPVSIARETCQRCGRIAGGYFASLVQVRADDRTPTTEEAETAIEIAESYIAEREEKGDRDAFISSIKRTPNGPNIKISTNKMGSGVAKQIRERFGGTIEEHPTLVTEDGDGNEVYRVTFVARLPRYTAGDVIDPEDGDGPVLVRSNRGRLKGTRLTSGDPYESEFEEGERPEAEDLGSVEDAEETTVVTVEDEHAVQVLDPETYEAKTIPRPDYFDPDAAMVPVLKSRSGLHILPEEVVDDEDEE
C441_RS15230 phosphoribosyltransferase 119940 119302 MFADREDAGRRLADLLDEREETADLVLAIPRGGLPVGREVADRLRAPLDVVVASKVGAPDNPELAVGAVAGDGSAWWNEDLLSYLDVGDDYLDREREREAEAAREKVSLYRGGDPLPDVAEKRVVVVDDGVATGATARACLRQVVAGDAERVVLAVPVGPPHTLSELEAECDAVVAVESPEAFGAVGAFYRDFAQVSDEEAASYLRDGAGEH
C441_RS15235 zinc metalloprotease HtpX 120065 120946 MNWKPDWGLRGRMALTMFLLFALYIVFAGVLFAYFQSLAVMAGFMGVFLFAQFFFSDKIALYSMGASVVDEDDGPQARKLHAMVGRLSQQADLPKPKVAIADTRVPNAFATGRSQKSSAVCVTTGLMDTLDDDELEGVIAHELAHVKNRDVMVMTIASFLSSIAFLIVRWGWFFGGDRDRQNMPVIVAILASLVVWIISYLLIRALSRYREYAADRGAAVITGRPSALASALLKISGRMDNVPKRDMRDTSEMNAFFIIPIKSDFIGRLFSTHPSTENRVERLRDMEREMETV
C441_RS15240 hypothetical protein 120953 121558 MGLFDSFRAMLGISAESDATTKADPEDLFGMSTAYLTMEADLRFDSADEAALCFSSVDSTDFADTVDAVEDILTAGSEETGTEFRRHTDGHGYNWVVLADDDPEDLVTSVHFAADEFIERGYGSRLLAALFGFERDGDRAYWIYSFRRGAYYPFAPQGTSTRNQSLEFKLESVLDGELEIEDDESYWYPLWPSTPNGHPWD
C441_RS15245 DNA repair and recombination protein RadA 121797 122828 MAEDDLESLPGVGPATADKLTDTGYDSYQSIAVASPGELSNKADIGSSTASDIINAARDAADVGGFETGSMVLERRQQIGKLSWQIDEVDELLGGGLETQSITEVYGEFGAGKSQITHQLAVNVQLPPEQGGLGGGCIFIDSEDTFRPERIDDMVRGLEDDVLEATLEDRGIEGSVDDEETMQALVDDVLDKIHVAKAFNSNHQILLAEKAKELAGEHEDTEWPVRLLCVDSLTAHFRAEYVGRGELAERQQKLNKHLHDLMRIGDLFNTGILVTNQVASNPDSYFGDPTQPIGGNILGHTSTFRIYLRKSKGDKRIVRLVDAPNLADGEAIMRVQDAGLKPE
C441_RS15250 NAD(P)/FAD-dependent oxidoreductase 123020 124165 MTHRIVVVGGGTGGTVVSNRLAEELESEIDAGDVEVTLVSDDTKHVYKPVFLYVPFGVAEPEDGVRDLRDLVDERVNIVTNRVRRVETDEKRLYCQDGDEVLDYDTLVLATGAKLDPDAVPGFREGAHHFYSAEGAERLRDALAEFDGGRLVLSVVGTPHMCPAAPLEFTFIVDDWLREQGLREDTDLVYTYPMERSHGKPEVAEWADPILAERGVRVETDFVVDAIDPDERVVSAKSGAEVDYDLLVGIPPHRGDELIVDSGLGDAGWVDVDQRTLRATAAEDVYAIGDTAALPIPKAGSAAHFQAFAVAERIAAEVRGRTPTKRYDGKTVCFVETGLDEASFVSFDYETPATMRQPSKPIHWAKHAYNESYWLTARGML
C441_RS15255 DUF1641 domain-containing protein 124176 124589 MQDHESTTTTEQQVPDELVRAIENNPEEVALLVERLGLVNDLIDVLELGVGALDDEMVRSLARTGTSLAEVADDASDPDTVAGMKRLLRAVGDAEDAEATPVGAVGLLRATRDPEVKAGLGYLVALAAALGAGTDAE
C441_RS15260 SUF system NifU family Fe-S cluster assembly protein 125064 124636 MGIGSDMYRQQIMDHYKNPRNHGRLESPTFTHTGENPSCGDTITMDVQLADDGETIERVAFSGDGCAISQASASMLTQRLPGTTLSELEAMDTDDITEMLGVDISPMRIKCAVLARQVAQDGAKIHDGEIEIEQTKTEDGDD
C441_RS15265 mechanosensitive ion channel 125724 125137 MRQSLTGFLVETLRETVAEFGAAVQDAAPKVLTAVVFLALAYVGIRALLFVVRGVLDGLYPEEQDLVVELGVAVAGVFLWFGAALALLNIVGMTEIAASLGTATGFIALGVSYALSNMIADTVAGVYLLRDPDFNPGDRVKADPVTGTVSSIELRKTRFENDEGDTVVVANRDVEKKWTKYDAPAAEDASASDAT
C441_RS15270 cysteine desulfurase 127079 125799 MRVQESYPVDVDAIRADFPILQRKVGGDISTPGEHDDDDTPLVYLDNAATSHTPEQVVDAIVDYYHGYNSNVHRGIHHLSQEASVAYENAHDRVAEFIGASGGREEVVFTKNTTESMNLVAYAWGLDELGPGDSVVMTEMEHHASLVTWQQIAKKTGAEVRYIRVDDDGRLDMEHAKELIDDSTKMVSVVHVSNTLGTVNPVSELAEMAHEVGAYAFVDGAQSAPTRPVDVEAIDADFFAFSGHKMCGPTGIGALYGKRDILDEMQPYLYGGEMIRSVTYEDSTWEDLPWKFEAGTPPIAQGVGFAAAVDYLDDIGMENVQAHEELLAEYAYDRLSEFDDIEIYGPPGDDRGGLVAFNLDSVHAHDLSSILNEQGVAIRAGDHCTQPLHDKLGVAASTRASFYIYNTRAEIDALVDGIDAARELFA
C441_RS15275 PAS domain-containing sensor histidine kinase 127248 128294 MSNATSSSLAFSSLDALPTQLAILDEEGVIVYTNRAWREFGNEHGYQGDSSSVGMNYLGVCDLSGATDATTASEGIRAVIDGDREAFSFEYPCETPDERLWFTMRATRFVDGGDTYVQIAHLDITDRKRAELEAEEKAERLQNLARMLSHDLRNPLSVAVGYVDSLLDEVVDADGLEQVAGALDRMDDIITDGLVLARHDTVGELSTVDLETRAATAWAHVDTGSAELVVADSFEFRADPSLLGHVFENLFRNSVEHGANDGGLTVTVGVLGGDGDATGFYVEDDGVGIAAEARDEVFEAGYTTGEEGTGLGLSIVAQAVRAHDWDVAVTEAERGGARFEITDVERPA
C441_RS15280 DUF424 family protein 128641 128339 MLLRERETPEGLLVSVCDPDCIGETYTDDGVSLDVTEDFYGGDEAETADEDAVVDSLTRATVANIVGEESVSVAIDAGLVDEETVLEVGGTLHAQLLWLR
C441_RS15285 tetratricopeptide repeat protein 129382 128642 MTDDGKRPHRFSEGQGFDEDYSDFSLDPPELSVDPTKVDPVDSRVLTDILDQRNIDPEDVDIEKLLDVALSYMQINRFEEATGAFERVARFAGDDDDIAQEAWVNKGAAHGQLEEWDEAIGSYQEALKLDDDSEHAATAHTNLAYALWEAGQTADALDHAERAVELDPRFPQAWYNRGFFLHERGLNEDAVNAFDNAIRLGMRTPGVHEEKARALEELGRDEEAEQAQQQADDLREQAEAELVEEY
C441_RS15290 histidine phosphatase family protein 129474 130088 MSAPPTTVVSFVRHAHAPSVPDAERARGLSRRGRRDAARVTARLADLADVVATSPAERARATVEGVADAADAPLIVDDDLRERELAESPVEEFDEAVEHLWANPNASHPGGESHAEAQARGVAAVGRLVEAYPDRHVVVGTHGTLMALVFNAYDPRYGREFWTGLTTPDVYEVTFVDGEAFSIARTWTPEADDRADADRDPAER
C441_RS15295 RNA 2',3'-cyclic phosphodiesterase 130115 130669 MRCFLAVDLPDSLAAGVAAVQDRLSDADGLRFTDPERAHVTLKFLGEVSPDRIAEVEDAVESAVEDAGVGPFDASVGGLGVFPSLDYIRVVWVGVGDGAAELTQLHEAIERETTALGFDPENHDFTPHVTLARMDDARGKVLVRRVVEDESSTVGSFRVREVRLKKSDLGPDGPEYETVTRFSL
C441_RS15300 DMT family transporter 131617 130676 MSRAKTAVAFLALSAVWGSAFLATDIALRTVPPAFLGAVRFDVAALLLFAVAVARGDRVIPAVRDEWRPILAGGAFSIGAHHALLFSGQVYVPGSVASVLLGIIPLATPTLTRLTATRERLSPHRVVGLVLGFLGLVVIANPDPGNLLSSNLVGAVLVFGSAVAFALGAVLTHDSETGMSLLAVQAWMMLVGAVTLHVTSAALPWESAADAAWTQTTLVAAGYLAVVAGAGGFLLYFWLLDRVGPIEVSLLEYVIPLFAALADWTVLGRVPTRATVAGFALIFAGFLLFKRDVIRGELRRVVDRRRGRRPTDD
C441_RS15305 50S ribosomal protein L39e 131790 131942 MGKKSKSKKKRLAKLERQNSRVPAWVMLKTDMEVTRNPKRRNWRRSNTDE
C441_RS15310 50S ribosomal protein L31e 131946 132224 MSANDFEERVVTVPLRDVQAVPAHERAGRSMTLIREHLAKHFKVDAENVRLDTQINEDIWAHGRQSPPSKFRVRAARFDEDGESVVEAEPAE
C441_RS15315 translation initiation factor IF-6 132228 132893 MLRASFAGSSYIGVFARATDDALLVRPDADDSLAEQMAEELDVPLVKTTVGGSGTVGALATGNENGLLVSSRATSREKEALTDAVDLPVYELPGRINAAGNVVLANDYGAYVHADLSDEAVAAVEEALEVPVERGDLADVRTVGMAAVANNRGVLCHPKSREPELEALEALLDVRADIGTINYGGPLVGSGLVANDEGYIVGEDTTGPELGRIEDALDYID
C441_RS15320 50S ribosomal protein L18a 132975 133151 MSQFIITGSFTSRGVVHEFTKTVEAPNENVAQERAFSLIGSEHGIKRTKVELNEVSAA
C441_RS15325 prefoldin subunit alpha 133151 133621 MGGGQQQLQQLSQELQAIDEEIESLESEVSDLNTEKNEIDEAVEAIETLETGSTVQVPLGGDAYLRAEVQDIDEIVVSLGANYAAEQEQSTAIETLRRKQDALDEEIASVRGQIEELEEESDEIEEQAMQAQQQMQQQQMQQMQQMQGEGGDGDDE
C441_RS15330 signal recognition particle-docking protein FtsY 133648 135006 MFDGLKKKLNRFRNDVEETAEEKAEAAADEAEADADAEAESAPAGAEHAAVEPEASEPADVDSESDAVGDADAGSEADAVGDAPADAASASAAVESESDSEAAATPEPDSEVDADTDTDADAGDGPAADEAEPRESLASDAAKAALTEEDDDDSSGPGRLRRAAAFATGKVVIEEEDLEDPLWELEMALLQSDVEMQVAEEILETIREKLIGETRKQVQSTGQLVSEALHDALYEVISVGQFDFDQRIAEADKPVTLIFTGINGVGKTTTIAKLARYFEKQGYSTVLANGDTYRAGANEQIREHAEALGKKLIAHEQGGDPAAVIYDGVEYAEAHDIDIVLGDTAGRLHTSNDLMAQLEKIDRVVGPDLTLFVDEAVAGQDAVERARQFNDAAAIDGAILTKADADSNGGAAISIAYVTGKPILFLGVGQGYDHIEKFDPEQMVNRLLGEDE
C441_RS15335 LysE family translocator 135659 135042 MTTLAFGLDVATYLAFCGAAVALILAPGPDTMYVLARGLDGRGPGVRSAFGIATGVLFHTLLATVGAAALFRAVPEAAAALKYVGAAYLGYLGVAALRNDEFDPAVETERGASFRRGVLVNALNPKVALFFLAFLPGFAGQGAGSGVRMASLGATYAALTALYLSVVALGADRLGARLAETRVASALNYVGAGTMLLLGVVLVLE
C441_RS15340 D-2-hydroxyacid dehydrogenase 136667 135693 MTTILVLDRPTHGIPASEYAAALRERLPDATVRHPKTSAETLDAARDATVITGTSLPDDVLDAADELRLFAGATAGYDHLPLEALRERGVVVTNASGVHGPNIAEHVLGWLLMITRRLDEGLRRQRRREWRRFQSYGELQGSTATVVGLGAIGRAVVERLDAFGVETVGVRYTPEKGGPTDEVLGFDDLEPALVRTDFLVVACPLTDETRGLIDSRALEALPTHAVLVNVARGGVVDTDALVSNLRDNRLRAAALDVTEPEPLPEDHPLWGFENVYITPHVSGHTPHYWTRVADILAENVERLAAEPEDETPALRNQVVPSPNP
C441_RS15345 signal recognition particle protein 136758 138152 MVLDNLGSSLRGSLDKLRGKSRLDEDDVQEIVKEIQRSLLSADVDVSLVMDLSDSIKTRALEEEPPGGTNARDHVLKIVYEELVDLIGESTEIPLESQTIMLAGLQGSGKTTTSAKMAWWFSKKGLRPAVIQTDTFRPGAYDQAKQMCERAEVDFYGDPDCDDPVQIAREGLEATEDADVHIVDTAGRHALEDDLIDEIEEIEGVVQPDLNLLVLDAAIGQGAKEQAQQFDESIGIGGVVITKLDGTAKGGGALTAVNETDSSIAFLGMGETVQDIERFEPNGFISRLLGMGDLKQLSERVERAMSETQAEDEDWDPEEMMKGNFTLKDMQKQMEAMDKMGPLDQVLDMIPGFGGGIKDQLPDDAMDVTKDRMRAFEVIMDSMTEAELENPRKVGASRVKRIAQGSGQDEETIQELLEQHRMMEQTIKQFQNMGDGDMQRMMKKLQNQGGGGGGMGGLGGMGPF
C441_RS15350 divalent cation transporter domain-containing protein 138254 138835 MTVRDVAVEAYKEALPALAASLVGGLIAGVVLGGMRAELRAVPGLLVLVPALLATRGNVYGSLGARIATALHQGLIEPRVTGGDERLRAAATAALANGVLTSTFAAVVAFFLLTFLGSRVAPLPILVGVAIVAGLLSGIVLTVVVVSVVFAGYRRGRNPDTIVGPVVTTTGDVFGVLFLLIAVRTVLAVAGVF
C441_RS15355 divalent cation transporter 138836 139396 MPTEWTVRAITRAMLPVLLVLTLVELGSGLVLGSFEAQLLRYPSLLVLVPVTIGTAGNLGSVLAARLSTSFHLGTLSFSPRDDELAGNAVATVALAVTVFPVIGAGAWVATLLVSGDTSLALQKVVLIALSSGISLAVLAVAVTFSATYVAYRFGLDPDDVVIPVVTNVCDVLGVVVLFGVAQVLV
C441_RS15360 hypothetical protein 139474 140523 MQSSAVLPVLFEVLPRVARIAAYIAVGVFAANLVVAFGLVERIAGLSRYLTSPANLPDEVGTAIVTTAASTTAGYGMLAEFRESGVLDDRATLVAVTINTFFGFVQHIFTFYWPVLIPILGREVGFMYVGARAAIALAITATGVVAGAVLLSDRNTTPVAVAETDGSGGAVDAAAADGAGGGDAADSSDAADSSDAADTDDTLRETVEDAARSTWDKLRRIVPRLAVVYVAVTLLLRTTDLESFAALASPLTNLVGLPGAAVPVVVAFAFDTTTGAATIAPAVGETFTPKQAVATMLIGGIISFAVSTFKRSIPFQYGIWGPEFGSKVIAVNTGLKIVFIGVAVALLVA
C441_RS15365 hypothetical protein 140982 140566 MIYSVDVRIEVPVRDTEVTDRVGDAVENIFPGVELDHEPGKLVGETHELERFSERLHEQAILDTARREFAKRRDDDGFSFALKKQAAFKGVVNFSVGNPDELGDIDVHVTVRDPSVDEVIDYIAPPTEDGRPVDPNGR
C441_RS15370 flagellar hook-basal body complex protein FliE 141554 140985 MRVIGTVGLPGSGKGELAEVARNAGVPVVTMGDVIREECRERGLDPATDHGKIATALREEEGDDAIAARSLPMVEDLLESNDVVVVDGLRSGVELDRFREAFGDDFVLVSVEAPFETRAERLLDRDRDDSDTDVEALKKRERRELDFGMGDAMDRADVVIQNTGTLAEYRETITRLFEEGPEALAEADA
C441_RS15375 hypothetical protein 141683 142060 MSQRSFVVRALWFVFVGWWATPAVVNVAWFLNATVIGIPLGVALINLVPTVLSLKEPKTRLDPDSGRGQRSLVVRAVYFVFVGWWLSWLWANVAVLFTLTIIGLPVGIWMLNRLPAVTSLYRFDG

ADD COMMENT
0

Thank you very much. But for 100,000 files this is not feasible.

ADD REPLY
0

"But for 100,000 files this is not feasible."

A loop?

ADD REPLY
0

Sorry, you are right. I did not pay attention. Maybe your solution is faster than mine. I'll test it. Thank you again.

ADD REPLY
0

Could you provide your awk code, please?

ADD REPLY
1

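Something like this (not tested), assuming the file is tab-separated with the start and end coordinates in columns 3 and 4, as in the table above:

awk -F '\t' -v B=123 -v E=456 '{x1=int($3);x2=int($4);b=x1<x2?x1:x2;e=x1<x2?x2:x1; if (!(b>int(E) || e<int(B))) print; }' input.tsv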
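If you would rather stay in Python, here is a minimal sketch of the same overlap test using only the standard library. It assumes the layout of the table above (tab-separated: locus tag, product, start, end, translation); the function name `overlapping_rows` and the column defaults are made up for illustration, and the translations in the sample are truncated.

```python
import csv
import io


def overlapping_rows(handle, begin, end, start_col=2, end_col=3):
    """Yield TSV rows whose coordinate span overlaps [begin, end].

    start_col and end_col are zero-based column indexes (an assumption
    based on the table layout shown above).
    """
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) <= max(start_col, end_col):
            continue  # skip blank or short lines
        x1, x2 = int(row[start_col]), int(row[end_col])
        # Reverse-strand rows list the larger coordinate first, so normalise.
        lo, hi = min(x1, x2), max(x1, x2)
        if not (lo > end or hi < begin):
            yield row


# Three rows copied from the table above (translations truncated).
SAMPLE = (
    "C441_RS15085\thypothetical protein\t91957\t91736\tMATANSNEAE\n"
    "C441_RS15095\tpolyphosphate kinase 2\t93191\t94021\tMSTDNPRYND\n"
    "C441_RS15100\tDHH family phosphoesterase\t95448\t94018\tMDGPVPELTE\n"
)

hits = [row[0] for row in overlapping_rows(io.StringIO(SAMPLE), 92686, 113021)]
print(hits)  # ['C441_RS15095', 'C441_RS15100']
```

For many files, the same function can be called inside a loop over the file paths, opening each one with `open(path, newline="")`.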
ADD REPLY
