I think this post is relevant to many bioinformaticians who are submitting to GenBank using tbl2asn.
I've been following the guidelines here.
I've successfully installed tbl2asn on my Mac and have been using it through the terminal.
The directions say to create 3 files: template.sbt, table.tbl, and fasta.fsa
My fasta format headers look like:
>TCONS_00001810 [organism=Mus musculus] [strain=C57BL/6J] [chromosome=1] olfactory receptor 1415 (Olfr1415) mRNA, complete cds
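To make the structure of that header explicit, here is a minimal Python sketch that splits it into the sequence ID, the bracketed [key=value] source modifiers, and the remaining title. The function names (parse_modifiers, split_defline, differing) are hypothetical helpers, not part of tbl2asn. The second string is the -j argument copied from the command quoted later in this post, so the two sets of modifiers can be compared.

```python
import re

# Matches bracketed source modifiers such as [organism=Mus musculus].
MODIFIER_RE = re.compile(r"\[([^=\[\]]+)=([^\[\]]*)\]")


def parse_modifiers(text):
    """Return a dict of the [key=value] modifiers found in text."""
    return {key.strip(): value.strip() for key, value in MODIFIER_RE.findall(text)}


def split_defline(defline):
    """Split a FASTA definition line into (seq_id, modifiers, title)."""
    body = defline.lstrip(">").strip()
    seq_id, _, rest = body.partition(" ")
    modifiers = parse_modifiers(rest)
    # Whatever is left after removing the modifiers is treated as the title.
    title = MODIFIER_RE.sub("", rest).strip()
    return seq_id, modifiers, title


def differing(a, b):
    """Return the keys present in both dicts whose values differ."""
    return {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}


# Header copied from this post.
defline = (">TCONS_00001810 [organism=Mus musculus] [strain=C57BL/6J] "
           "[chromosome=1] olfactory receptor 1415 (Olfr1415) mRNA, complete cds")
# -j argument copied from the command in this post.
j_argument = "[organism=Mus musculus] [strain=C57BL:6J]"

seq_id, modifiers, title = split_defline(defline)
print(seq_id)
print(modifiers)
print(title)
print(differing(modifiers, parse_modifiers(j_argument)))
```

The last line prints any modifier whose value differs between the header and the -j argument. Whether such a difference affects tbl2asn's output is not something this sketch can tell you; it only makes the two strings easy to compare.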
The corresponding data in the table file looks like:
>Feature TCONS_00001810
1 3422 mRNA
1 186 5'UTR
187 1122 CDS
1123 3422 3'UTR
1 176 exon
177 2079 exon
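Since each block in the table is tied to a sequence through the ID on its ">Feature" line, one quick sanity check is to confirm that every ID in the table also appears in the FASTA file and vice versa. The following is a minimal Python sketch under two assumptions: the sequence ID is the first whitespace-delimited token after ">" in the FASTA header, and each table block starts with ">Feature" followed by the ID. The function names are hypothetical, and the sequence "ACGT" is a placeholder, not real data.

```python
def fasta_ids(fasta_text):
    """Collect sequence IDs: the first token after '>' on each header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def table_ids(table_text):
    """Collect IDs from the '>Feature <id>' lines of a feature table."""
    ids = []
    for line in table_text.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == ">Feature":
            ids.append(tokens[1])
    return ids


def compare_ids(fasta_text, table_text):
    """Return (IDs only in the FASTA, IDs only in the table), both sorted."""
    in_fasta = set(fasta_ids(fasta_text))
    in_table = set(table_ids(table_text))
    return sorted(in_fasta - in_table), sorted(in_table - in_fasta)


# Excerpts modelled on this post; the sequence line is a placeholder.
example_fasta = ">TCONS_00001810 [organism=Mus musculus] [strain=C57BL/6J]\nACGT\n"
example_table = ">Feature TCONS_00001810\n1 3422 mRNA\n187 1122 CDS\n"

only_fasta, only_table = compare_ids(example_fasta, example_table)
print("Only in FASTA:", only_fasta)
print("Only in table:", only_table)
```

To run it on real files, read them with open("fasta.fsa").read() and open("table.tbl").read() and pass the two strings to compare_ids.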
The template file isn't a text file so I can't provide an example . . .
In terminal, I've opened up tbl2asn and I know it's working because when I do the command:
tbl2asn -
it gives me all of the different commands that I can use.
When I run the command below, it runs and creates a file called errorsummary.val in the directory containing template.sbt, table.tbl, and fasta.fsa. However, this file is empty (zero bytes). It should also create a .sqn file, which combines the 3 preliminary files I've described earlier.
tbl2asn -t template.sbt -p . -j "[organism=Mus musculus] [strain=C57BL:6J]" -V vb -a s
The documentation explains -t, -p, -j, -V, and -a:
-p specifies the path for the table and sequence files [required]
-t specifies the template file (including the path) [required]
-j allows the addition of source qualifiers that will be the same for each submission
Example: -j "[organism=Saccharomyces cerevisiae] [strain=S288C]"
-V is a verification command. When used in conjunction with v (strongly suggested), it tells the computer to run a validation step to ensure that there are no errors in your submission. This validation step will generate a report (with suffix .val) for each .fsa file and place it in the same directory that houses the data files and tables used in the submission. If you add a b command (optional) following the v command, the computer will generate a GenBank flat file (.gbf) of your submission and deposit it in the same directory that houses the data files and tables used in the submission. Note that .gbf files are not suitable for submission. They are only for viewing the file in GenBank flat file format.
-a used in conjunction with the s command instructs tbl2asn to read multiple FASTA components in one file as a set of unrelated sequences. This creates a single file of multiple submissions.
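Because the quoted documentation says reports are generated per .fsa file, it can help to list, for each .fsa file in the working directory, which companion files actually exist. The sketch below is a minimal Python check, assuming that tbl2asn pairs input and output files by base name (for example sample.fsa with sample.tbl and sample.sqn). That pairing is my reading of the tbl2asn documentation and should be verified there. The function name report_submission_files is hypothetical.

```python
from pathlib import Path

# Suffixes checked next to each .fsa file: the feature table (input)
# and the .sqn, .val, and .gbf files (outputs named in the documentation).
COMPANION_SUFFIXES = (".tbl", ".sqn", ".val", ".gbf")


def report_submission_files(directory="."):
    """For each .fsa file, report which same-base-name companion files exist."""
    report = {}
    for fsa in sorted(Path(directory).glob("*.fsa")):
        report[fsa.name] = {
            suffix: fsa.with_suffix(suffix).exists()
            for suffix in COMPANION_SUFFIXES
        }
    return report


if __name__ == "__main__":
    for name, found in report_submission_files(".").items():
        print(name, found)
```

With the file names used in this post (fasta.fsa and table.tbl), the report would show no .tbl next to fasta.fsa, because the two files do not share a base name. If the base-name assumption holds, that mismatch is worth ruling out.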
Why does the program run without reporting any errors and create errorsummary.val, but not create the .sqn file?
How can I get this to work? I feel like I'm very close.
I've already established a working directory which is where all of those files are located.
I've tried putting the directory location after -p, and removing the -j modifiers and the [optional] commands. I still can't get it to work.
Please help. This should be useful for anyone submitting multiple sequences at a time to GenBank.
What are the contents of the errorsummary.val file? Is it empty, or does it give any information about what's going on?
It's completely empty. Do you have any idea how to submit this to GenBank? What other reasons could there be for why it isn't working?
Hello,
I am trying to prepare files for a TSA submission. I prepared the .fsa and .sbt files. I was wondering if you know how to prepare the .tbl file that contains the annotation?
Thanks in advance
Federico
Also, I too am getting an empty .val file.
Did you ever resolve this issue? I'm having the same problem with a whole genome dataset, and I can't figure out what's going wrong.