Hi everyone, I am working with the GAM file format and trying to understand its structure. I convert my GAM files to JSON using vg, but I don't fully understand how the fields work. The vg file format page, the only source I could find, does not describe the fields. I want to extract the read sequence, the alignment position in the reference, and the CIGAR string or equivalent matching details. Which fields are mandatory, so that I can rely on their names in my code? Two example records from my files are:
{"fragment": [{"length": "-567", "name": "chr22"}], "fragment_length_distribution": "1927:443.135:148.44:0:1", "fragment_prev": {"name": "ERR903030.252281075 "}, "fragment_score": 51.0, "identity": 0.98412698412698407, "mapping_quality": 60, "name": "ERR903030.252281075 ", "path": {"mapping": [{"edit": [{"from_length": 17, "to_length": 17}], "position": {"node_id": "89533395", "offset": "15"}, "rank": "1"}, {"edit": [{"from_length": 32, "to_length": 32}], "position": {"node_id": "89533396"}, "rank": "2"}, {"edit": [{"from_length": 25, "to_length": 25}, {"from_length": 1, "sequence": "C", "to_length": 1}, {"from_length": 6, "to_length": 6}], "position": {"node_id": "89533397"}, "rank": "3"}, {"edit": [{"from_length": 32, "to_length": 32}], "position": {"node_id": "89533398"}, "rank": "4"}, {"edit": [{"from_length": 12, "to_length": 12}, {"from_length": 1, "sequence": "C", "to_length": 1}], "position": {"node_id": "89533399"}, "rank": "5"}]}, "quality": "HiAfISIiEB8PJB8PHQ8lIR0ODg4bJB8PDg8YJCYYIiYgHQ4bDg4OGw4ZDiIiISYmDhkZGR8PDxkiDiQPDyQPHyMOIh4YIg4OGQ4OGQ4YDg4iIiYbIiMiDw0iGiIaJiEODhkWDQ0XFR8NIiMiFSMNFQ0ZHxcODRciIgICAgIC", "refpos": [{"name": "chr22", "offset": "17119477"}], "score": 126, "sequence": "TCCCTGAGGTGGTGGCGGAGGTGGTGGAGGGGCGGAGGGCGGAGCACCGTAGCCCCCTCTGGCCCGACTCGGGGCGGCCCGATTGCCCCGGTCCCAGCAGCCCTCCAGGGCCTCCAGGCCCCGGCC", "time_used": 1221.0}
and
{"identity": 0.90000000000000002, "mapping_quality": 60, "name": "SRR24940081.1.1 M06097:87:000000000-L2HVL:1:1101:14851:1604 length=219", "path": {"mapping": [{"edit": [{"from_length": 2, "to_length": 2}, {"from_length": 1, "sequence": "C", "to_length": 1}, {"from_length": 16, "to_length": 16}, {"from_length": 1, "sequence": "A", "to_length": 1}], "position": {"name": "1452", "node_id": "1452", "offset": "9208"}}]}, "query_position": 96, "score": 2, "sequence": "TACTAATAAAATATGATGTA"}
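To make the question concrete, below is a minimal Python sketch of the extraction I have in mind. It rests on assumptions I have not been able to confirm: that a missing field means its default value (for example, no "offset" means 0), that 64-bit integers are written as strings, that an edit with equal from_length and to_length is a match when it has no "sequence" and a mismatch when it does, and that from_length 0 means an insertion and to_length 0 a deletion. The function names (edit_op, summarize, iter_records) are my own and not part of vg.

```python
import json


def edit_op(edit):
    """Classify one edit as a CIGAR-like (op, length) pair.

    Assumption: absent lengths mean 0, and an edit with equal lengths is a
    match ("=") without "sequence" and a mismatch ("X") with it.
    """
    from_len = int(edit.get("from_length", 0))
    to_len = int(edit.get("to_length", 0))
    if from_len == to_len:
        return ("X" if edit.get("sequence") else "=", from_len)
    if from_len == 0:
        return ("I", to_len)  # read bases absent from the graph (or a soft clip)
    if to_len == 0:
        return ("D", from_len)  # graph bases absent from the read
    raise ValueError(f"edit with unequal non-zero lengths is not handled: {edit}")


def summarize(record):
    """Pull the read, its starting graph position, and a CIGAR-like string."""
    mappings = record.get("path", {}).get("mapping", [])
    ops = []
    for mapping in mappings:
        for edit in mapping.get("edit", []):
            op, length = edit_op(edit)
            if length == 0:
                continue
            if ops and ops[-1][0] == op:
                ops[-1][1] += length  # merge runs that continue across nodes
            else:
                ops.append([op, length])
    first = mappings[0].get("position", {}) if mappings else {}
    return {
        "name": record.get("name"),
        "sequence": record.get("sequence"),
        "start_node": first.get("node_id"),
        "start_offset": int(first.get("offset", 0)),  # assumed 0 when absent
        "is_reverse": bool(first.get("is_reverse", False)),
        "cigar": "".join(f"{length}{op}" for op, length in ops),
    }


def iter_records(lines):
    """Parse an iterable of text lines holding one JSON record each."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Every lookup goes through .get with a default because I do not know which fields are guaranteed to be present, which is exactly what I am asking. Note that the position here is a node ID plus offset in the graph, not a linear reference coordinate; the "refpos" field appears in only one of my two records.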