Hi, I'm a computer science student and I'm having some trouble understanding one of SynMap's outputs. This is the first time I'm using this tool, so my questions may be very basic. The output I'm interested in is the DAGChainer output in genomic coordinates. I've pasted a small sample of the file below:
#1 5438.0 a61341_CP009354.1 b61370_CP018835.1 f 114 Mean Ks: 26.8037 Mean Kn: 0.1808
#Ks Kn a<db_genome_id>_<chr> chr1||start1||stop1||name1||strand1||type1||db_feature_id1||genome_order1||percent_id1 start1 stop1 b<db_genome_id>_<chr> chr2||start2||stop2||name2||strand2||type2||db_feature_id2||genome_order2||percent_id2 start2 stop2 eval block_score GEVO_link
3.3861 0.1466 a61341_CP009354.1 CP009354.1||2533113||2533958||cds-AIW14852.1||-1||CDS||3462596172||2299||71.04 2533958 2533113 b61370_CP018835.1 CP018835.1||999503||1000348||cds-ASA55073.1||-1||CDS||3463773201||923||71.04 1000348 999503 1.200000e-161 50
8.9127 0.0410 a61341_CP009354.1 CP009354.1||2534066||2534647||cds-AIW14853.1||-1||CDS||3462596173||2300||77.41 2534647 2534066 b61370_CP018835.1 CP018835.1||1000376||1000951||cds-ASA55074.1||-1||CDS||3463773202||924||77.41 1000951 1000376 5.000000e-143 100
Furthermore, there are other headers as follows (I omitted the actual data):
#1 2729.0 a61341_CP009354.1 b61370_CP018835.1 r 60 Mean Ks: 29.3942 Mean Kn: 0.1974
#1 367.0 a61341_CP009354.1 b61370_CP018836.1 f 8 Mean Ks: 42.0974 Mean Kn: 0.2392
#1 450.0 a61341_CP009355.1 b61370_CP018835.1 f 9 Mean Ks: 26.1880 Mean Kn: 0.1268
#1 391.0 a61341_CP009355.1 b61370_CP018835.1 r 9 Mean Ks: 38.5121 Mean Kn: 0.2237
#1 558.0 a61341_CP009355.1 b61370_CP018836.1 f 12 Mean Ks: 31.9494 Mean Kn: 0.2921
#1 464.0 a61341_CP009355.1 b61370_CP018836.1 r 10 Mean Ks: 58.8696 Mean Kn: 0.3136
I've listed only the headers that repeat the number #1; there are others (#2, #3, etc.). Although I've already read the documentation for this particular output, I still don't understand what it means. As far as I know, each header line starting with # indicates synteny between the two chromosomes. The docs say that these can be repeated, and I don't understand what the repeats mean. I'm probably missing something because I lack some biological knowledge. Also, could you explain what each row means? The file itself already has a description of the columns, but I didn't quite get what each row represents.
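To make the question more concrete, here is a minimal Python sketch of how I currently read the file. It rests on my own assumptions, not on anything the docs confirm: that fields are separated by whitespace, that the block header fields are (in order) block number, score, chromosome a, chromosome b, orientation (f or r), and number of gene pairs, and that every data row belongs to the most recent block header. The function and key names are ones I made up for illustration.

```python
FEATURE_KEYS = ["chr", "start", "stop", "name", "strand", "type",
                "db_feature_id", "genome_order", "percent_id"]


def to_float(text):
    """Return a float, or None if the field is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_feature(field):
    """Split one '||'-delimited feature field into a dict of strings."""
    return dict(zip(FEATURE_KEYS, field.split("||")))


def parse_blocks(lines):
    """Group data rows under the block header that precedes them."""
    blocks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line[1:].split()
            # The column description line starts with "#Ks"; skip it.
            if len(parts) < 6 or not parts[0].isdigit():
                continue
            blocks.append({
                "number": int(parts[0]),      # assumed: block number
                "score": float(parts[1]),     # assumed: block score
                "chr_a": parts[2],
                "chr_b": parts[3],
                "orientation": parts[4],      # assumed: f = forward, r = reverse
                "n_pairs": int(parts[5]),     # assumed: number of gene pairs
                "pairs": [],
            })
            continue
        fields = line.split()
        if not blocks or len(fields) < 12:
            continue  # row without a header, or an unexpected layout
        blocks[-1]["pairs"].append({
            "ks": to_float(fields[0]),
            "kn": to_float(fields[1]),
            "feature_a": parse_feature(fields[3]),
            "feature_b": parse_feature(fields[7]),
            "evalue": to_float(fields[10]),
            "block_score": to_float(fields[11]),
        })
    return blocks
```

I call it as `blocks = parse_blocks(open(path))`, where `path` is the downloaded file, and then look at `blocks[i]["pairs"]`. If this grouping is wrong, that is probably where my misunderstanding is.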
Sorry if the question is too basic. As I don't have a background in biology, I'm in need of some assistance to interpret the output. Thanks for any help!