Classify Phage Taxonomy Scaffold Data (Genome Fractions)
10 months ago

Hi,

I am classifying phages by taxonomy using https://github.com/KennthShang/PhaGCN. My problem is that my FASTA files contain fragments (contigs) of the genome instead of the whole genome.

Each FASTA file holds the contigs of one genome. For example, Genome1.fasta looks like this:

>k141_291006
TCG...
>k141_386008
TCG....

PhaGCN classifies the contigs of this phage genome as:

k141_291006,19687,Casjensviridae,0.17071722 
k141_386008,108404,Herelleviridae,1.0

So the two contigs of the same genome receive different classifications. Here one of them has probability 1.0, but in other cases no contig has a clearly higher probability than the rest. The examples for this tool all use whole genomes, so each genome gets a single classification.

  1. Can I just concatenate the sequences and classify the result as an entire genome?
  2. Should I align the contigs against each other with a multiple sequence alignment? I thought of aligning against a reference, but since I do not know which family they belong to, finding an accurate reference genome is hard, so that would not be a robust method.
  3. Should I take the most probable classification, and if the probabilities are all very similar (for example 0.5 and 0.4), build a consensus?
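
To make option 3 concrete, here is a minimal sketch of the kind of consensus I have in mind. It is only an illustration, not something PhaGCN provides: `consensus_family` and `min_share` are names I made up, and I am assuming the output columns are contig name, contig length, predicted family, and score. Each contig votes for its family with weight length × score, and the genome is left unclassified if the winning family holds less than `min_share` of the total weight.

```python
import csv
import io
from collections import defaultdict


def consensus_family(rows, min_share=0.5):
    """Length- and score-weighted vote over per-contig predictions.

    rows: iterable of (contig, length, family, score) records.
    Column meaning is an assumption about the PhaGCN output shown above.
    Returns (family or None, share of total weight held by the winner).
    """
    weights = defaultdict(float)
    for _contig, length, family, score in rows:
        weights[family] += int(length) * float(score)

    total = sum(weights.values())
    if total == 0:
        return None, 0.0

    family, weight = max(weights.items(), key=lambda item: item[1])
    share = weight / total
    # Hypothetical cut-off: report no call when the vote is too close.
    return (family if share >= min_share else None), share


# The two predictions quoted above for Genome1.fasta.
predictions = """k141_291006,19687,Casjensviridae,0.17071722
k141_386008,108404,Herelleviridae,1.0
"""

rows = [row for row in csv.reader(io.StringIO(predictions)) if row]
family, share = consensus_family(rows)
print(family, round(share, 3))
```

For the example above this picks Herelleviridae, because the long, high-scoring contig dominates the vote. Whether weighting by length and score is biologically sensible is exactly what I am unsure about.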

Best Regards and Thank You

Phages PhaBOX Bacteria Genome PhaGCN • 269 views