In the GenBank record view on NCBI, the "Change region shown" option lets you extract the features in a specific region into a gbk file.
In addition, the positions of the extracted features are renumbered so that they count from 1.
EX) raw.gbk - gene A position: 345612..352112
select.gbk - gene A position: 1..6501
Can I do the same thing with my own gbk file in Python? Or could you point me to a tool that does this? I'm asking because I can't find one no matter how much I search.
Thank you for reading. Have a nice day.
Is there a way to do this by supplying only two txt files, one with the file's ID/name and the other with the start and stop of each gene? For example, I want to create gbk files for 20 genes of interest from a large set of complete genome gbk files, each of which contains those genes.
Not by default. You will need a way of feeding the efetch command the three variables. For example, if you had a file with the three fields separated by tabs, then you could do
This will produce a separate file for each interval.
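For anyone who would rather stay in Python, here is a rough sketch of the same idea. It is not the command from the answer above. It calls the E-utilities efetch endpoint directly with the `seq_start` and `seq_stop` parameters, and it assumes a tab-separated file with accession, start and stop on each line. The file name `intervals.tsv`, the function names and the output naming scheme are all made up for illustration.

```python
import csv
import time
import urllib.parse
import urllib.request

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, start, stop):
    """Build an efetch URL for one interval (1-based, inclusive coordinates)."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "gb",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
    }
    return EFETCH + "?" + urllib.parse.urlencode(params)


def read_intervals(path):
    """Yield (accession, start, stop) tuples from a tab-separated file."""
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 3:  # skip blank or incomplete lines
                yield row[0], int(row[1]), int(row[2])


def fetch_all(path):
    """Download one GenBank file per interval listed in `path`."""
    for acc, start, stop in read_intervals(path):
        with urllib.request.urlopen(efetch_url(acc, start, stop)) as resp:
            text = resp.read().decode()
        # Hypothetical naming scheme: accession_start_stop.gbk
        with open(f"{acc}_{start}_{stop}.gbk", "w") as out:
            out.write(text)
        time.sleep(0.4)  # stay under NCBI's 3 requests/second limit without an API key


# Example call (needs network access and an existing intervals.tsv):
# fetch_all("intervals.tsv")
```

As with the web interface, the records that come back should have their feature positions counted from 1 within the requested region.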
Hi Genomax Sir, I have 60 genes of interest that are present in every genome, and I have 300 E. coli genomes. I want to create 300 gbk files that contain only those 60 genes; the rest of the genes are not required. Is it possible to do that? Please reply, it would be very helpful.
Thanks a lot for your time!
It is possible, as I show above. You will need three pieces of information per interval you want to retrieve: accession, start and stop.
Hi Genomax Sir, first of all, thank you for your time. I have tried the script above and it works, but it extracts only one gene per gbk file. I want each gbk file to contain all of my genes of interest, i.e. all 60 genes should be present in each of the 300 gbk files. Also, is it possible to make the gbk files from local NCBI-annotated gbk files instead of fetching the data from NCBI?
Thank you!
Then you need to make files for those 60 genes with coordinates for each genome.
If you have a properly formatted GenBank file locally then you may be able to use this (or modify as needed): Slicing Genbank File by 'gene_id' range
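If Biopython is available, slicing a `SeqRecord` is another way to do this locally. The sketch below is not the script from the linked post, just a minimal illustration under a few assumptions: each input gbk holds a single record (a complete genome, not a multi-contig draft), you already have the start/stop coordinates for each genome, and the function and file names are made up. Slicing keeps only the features that lie entirely inside the region and shifts their positions so they count from 1. All regions for one genome are written into a single multi-record gbk file.

```python
def to_slice(start, stop):
    """Convert 1-based inclusive GenBank coordinates to a 0-based Python slice."""
    return slice(start - 1, stop)


def extract_regions(gbk_in, regions, gbk_out):
    """Write one sub-record per (start, stop) region into a single GenBank file.

    Returns the number of records written.
    """
    # Imported here so that to_slice() stays usable without Biopython installed.
    from Bio import SeqIO

    record = SeqIO.read(gbk_in, "genbank")  # raises ValueError unless exactly one record
    subs = []
    for start, stop in regions:
        sub = record[to_slice(start, stop)]  # features fully inside the region are kept
        sub.description = f"{record.description} [{start}..{stop}]"
        # The GenBank writer needs a molecule type; fall back to DNA if it is missing.
        sub.annotations.setdefault(
            "molecule_type", record.annotations.get("molecule_type", "DNA")
        )
        subs.append(sub)
    return SeqIO.write(subs, gbk_out, "genbank")


# Example call with hypothetical file names and coordinates:
# extract_regions("raw.gbk", [(345612, 352112)], "select.gbk")
```

For 300 genomes you would loop over the gbk files and pass each one its own list of 60 coordinate pairs, since the positions of the same genes will differ between genomes.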
Thank you, Genomax sir, from the bottom of my heart. I appreciate all you have done; your help means a lot to me. Seriously, I had been stuck for the past 2 weeks, but now it is all fine. Thank you again, sir, for your time.