I got the consensus sequences from a couple of my bacterial isolates. I generated the VCF files, and one has about 83 SNPs and the other over a thousand. I would like to check if these SNPs have affected the coding potential of the strains.
Is there a program to visualize the ORFs of these isolates? I know of Artemis, but the map I get is a bit raw: I can't see all the ORFs at once, and it is not very informative. In the figure below, I can see only a portion of the genome, and I don't know how to set the view to show the whole genome:
(Also, how can I load the annotations?)
Is there a way to count the ORFs, to see whether some of them are missing relative to the reference genome? I know of Augustus, but I get a textual output like this:
# ----- prediction on sequence number 1 (length = 4598762, name = K-12_Ho) -----
#
# Predicted genes for sequence number 1 on both strands
# start gene g1
K-12_Ho AUGUSTUS gene 70 1404 1 + . g1
K-12_Ho AUGUSTUS transcript 70 1404 1 + . g1.t1
K-12_Ho AUGUSTUS start_codon 70 72 . + 0 transcript_id "g1.t1"; gene_id "g1";
K-12_Ho AUGUSTUS single 70 1404 1 + 0 transcript_id "g1.t1"; gene_id "g1";
K-12_Ho AUGUSTUS CDS 70 1404 1 + 0 transcript_id "g1.t1"; gene_id "g1";
K-12_Ho AUGUSTUS stop_codon 1402 1404 . + 0 transcript_id "g1.t1"; gene_id "g1";
# coding sequence = [atgtggatac...
that is difficult to handle (I'll have to write a script to extract statistics from it). Is there something more straightforward?
Thank you
Why not use IGV, if you have the GTF/GFF files and the reference these were aligned to?
That is going to be the case with any visualization program, is it not?
I was hoping for something a bit more advanced than IGV, but I reckon you are right: classic is better.
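On the Augustus output being hard to handle: the script does not need to be long. Below is a minimal sketch in Python, assuming the output has the nine whitespace-separated columns shown in the question (sequence, source, feature, start, end, score, strand, frame, attributes) and that comment lines start with "#". The function name summarize_augustus_gff is made up for illustration and is not part of Augustus.

```python
from collections import Counter


def summarize_augustus_gff(lines):
    """Count feature types and collect gene lengths from Augustus GFF/GTF lines.

    Assumes nine whitespace-separated columns, as in the output quoted above.
    Returns (feature_counts, gene_lengths), where gene_lengths maps the gene ID
    found in the last column of each "gene" line to its length in bp.
    """
    feature_counts = Counter()
    gene_lengths = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and Augustus comment lines ("# start gene g1", etc.)
        if not line or line.startswith("#"):
            continue
        # Split into at most 9 fields so the attribute column stays in one piece
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        feature, start, end, attributes = fields[2], fields[3], fields[4], fields[8]
        feature_counts[feature] += 1
        if feature == "gene":
            # GFF coordinates are 1-based and inclusive
            gene_lengths[attributes] = int(end) - int(start) + 1
    return feature_counts, gene_lengths
```

To use it, open the Augustus output (the file name here is only a placeholder, e.g. isolate1.gff), pass the file handle to summarize_augustus_gff, and compare feature_counts["gene"] between each isolate and the reference. Genes whose lengths differ from the reference are candidates for SNPs that introduced or removed a stop codon.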