8.9 years ago
VS
I ran a trial miniasm assembly of my plant genome with about 15x PacBio reads. The output GFA file has segment (S) and link (L) lines, but also some line types that I could not find documentation for: 'x' and 'a' (see the example below). Can anyone point me to a resource that explains them?
a utg000002l 0 m151019_100115_42145_c100918202550000001823208705121657_s1_p0/147200/8874_24354:47-15441 + 635
a utg000002l 635 m151015_113223_42145_c100918002550000001823208705121613_s1_p0/40967/2170_19160:95-16938 + 2721
a utg000002l 3356 m151018_210841_42145_c100918202550000001823208705121654_s1_p0/39933/0_19118:207-19111 - 3468
a utg000002l 6824 m151014_072639_42145_c100917702550000001823208705121615_s1_p0/24776/0_17126:39-17100 + 17062
x utg000002l 23886 4 2 2 m151019_100115_42145_c100918202550000001823208705121657_s1_p0/147200/8874_24354:47-15441 + m151014_072639_42145_c100917702550000001823208705121615_s1_p0/24776/0_17126:39-17100 -
Did anyone ever find the answer to this? I too have miniasm output with the same a and x rows.
Yes. The 'a' lines describe the golden path, where an entire read is contained within a unitig, while the 'x' line gives a brief summary of each unitig, which can also be inferred from the 'S' and 'a' lines.
You can check this in the output format description in the file miniasm/miniasm.1 (where miniasm is your installation folder). Here is the GitHub commit that added it: https://github.com/lh3/miniasm/commit/b531ef01c34208930187d98da0a8d2fdd2b8b9b4
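To see how the 'x' summary can be inferred from the 'a' lines, here is a minimal Python sketch. The column meanings it assumes (unitig name, offset on the unitig, read name with its interval, strand, length contributed) are my reading of the example in the question, not something taken from the miniasm documentation, and the function names are made up for illustration. It checks that the 'a' lines tile the unitig end to end and reports the total length.

```python
from collections import defaultdict

# The four 'a' lines from the question (real miniasm output is tab-separated;
# str.split() handles both tabs and spaces).
EXAMPLE = [
    "a utg000002l 0 m151019_100115_42145_c100918202550000001823208705121657_s1_p0/147200/8874_24354:47-15441 + 635",
    "a utg000002l 635 m151015_113223_42145_c100918002550000001823208705121613_s1_p0/40967/2170_19160:95-16938 + 2721",
    "a utg000002l 3356 m151018_210841_42145_c100918202550000001823208705121654_s1_p0/39933/0_19118:207-19111 - 3468",
    "a utg000002l 6824 m151014_072639_42145_c100917702550000001823208705121615_s1_p0/24776/0_17126:39-17100 + 17062",
]


def read_a_lines(lines):
    """Group 'a' lines by unitig name (assumed column layout, see above)."""
    layout = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 6 or fields[0] != "a":
            continue  # skip S, L, x and anything malformed
        _, utg, offset, read, strand, length = fields[:6]
        layout[utg].append((int(offset), read, strand, int(length)))
    return layout


def tiled_length(entries):
    """Return the total length if each entry starts where the previous ends."""
    pos = 0
    for offset, read, _strand, length in sorted(entries):
        if offset != pos:
            raise ValueError(f"gap or overlap before {read}: {offset} != {pos}")
        pos = offset + length
    return pos


layout = read_a_lines(EXAMPLE)
for utg, entries in layout.items():
    print(utg, len(entries), tiled_length(entries))
# prints: utg000002l 4 23886
```

For the example unitig this gives 4 reads and 23886 bp, which matches the third and fourth columns of the 'x' line in the question.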