Help with IGV abbreviation
6 months ago
GeneC • 0

Hi,

Can someone explain what the following abbreviations mean: flag 99, mq 60, cigar 88M23D62M, c1 0,5, MD 88 CACC'XI 1, NM 23, AS 121, PS 11297, rd esp the following 1:2114:12111:13792]

Thank you

Details for a variant

igv • 595 views

Those are SAM format fields. Check section 1.4 here: https://samtools.github.io/hts-specs/SAMv1.pdf

1:2114:12111:13792

That is part of the fastq read header. Illumina read headers are described here: https://en.wikipedia.org/wiki/FASTQ_format#Illumina_sequence_identifiers


Thank you for the link and details

6 months ago
Mathew ▴ 160

Hi, here is a breakdown of each part that you asked about:

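flag 99: The flag is a bitwise sum of properties of the read alignment. 99 = 1 + 2 + 32 + 64, which means the read is paired, it is mapped in a proper pair, its mate is on the reverse strand, and it is the first read in the pair. Because the bits for "unmapped" (4) and "read reverse strand" (16) are not set, the read itself is mapped to the forward strand, so the two reads face each other rather than sharing the same orientation.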
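To check a flag value yourself, you can test each bit against the FLAG table in section 1.4 of the SAM specification. This is a minimal sketch: the function name `decode_flag` and the short descriptions are my own wording, not part of any library.

```python
# Bit values from the FLAG table in the SAM specification (section 1.4).
# The description strings are paraphrased for readability.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the descriptions of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(99))
# ['read paired', 'read mapped in proper pair', 'mate reverse strand', 'first in pair']
```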

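mq 60: This is the mapping quality, a Phred-scaled estimate of the probability that the mapping position is wrong (MAPQ = -10 log10 of that probability). Higher is better: 60 corresponds to an error probability of about 1 in a million, and it is the highest value that some aligners (BWA-MEM, for example) report.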
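The SAM specification defines mapping quality as -10 log10 of the probability that the mapping position is wrong, so the value can be converted back to a probability. A small sketch (the function name is hypothetical, and real aligners only approximate this probability):

```python
def mapq_to_error_probability(mapq):
    """Convert a Phred-scaled mapping quality to the implied error probability."""
    return 10 ** (-mapq / 10)


# MAPQ 60 implies roughly a one-in-a-million chance of a wrong position.
print(mapq_to_error_probability(60))
```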

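cigar 88M23D62M: The CIGAR string describes how the read aligns to the reference genome. M means an aligned position, which can be either a match or a mismatch, and D means bases present in the reference but missing from the read. Here 88 read bases are aligned, then 23 reference bases are deleted, then 62 more read bases are aligned, so a 150-base read spans 173 bases of the reference.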
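As a rough illustration of how a CIGAR string translates into lengths, the sketch below sums the operations that consume read bases and those that consume reference bases, following the operation table in the SAM specification. The function name `cigar_lengths` is hypothetical.

```python
import re

# One CIGAR operation: a count followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Per the SAM specification: which operations consume read (query)
# bases and which consume reference bases.
QUERY_OPS = set("MIS=X")
REF_OPS = set("MDN=X")


def cigar_lengths(cigar):
    """Return (read_length, reference_span) implied by a CIGAR string."""
    read_len = 0
    ref_len = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in QUERY_OPS:
            read_len += n
        if op in REF_OPS:
            ref_len += n
    return read_len, ref_len


print(cigar_lengths("88M23D62M"))  # (150, 173)
```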

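c1 0,5: This does not look like a tag defined in the SAM specification. Tags containing lowercase letters are reserved for local use, so the meaning depends on the tool that wrote the file (and the text may also have been garbled when it was copied). Check the documentation of the aligner or pipeline that produced the BAM.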

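MD 88 CACC'XI 1: The MD tag records mismatches and deleted reference bases so that the reference sequence can be recovered from the read. Numbers count bases that match the reference, a single letter is the reference base at a mismatch, and ^ followed by letters is a stretch of deleted reference bases. The value pasted here looks garbled, but the leading 88 agrees with the 88M in the CIGAR, and CACC is most likely the start of the 23 deleted reference bases.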
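To show how an MD string is read, here is a small sketch that splits one into match runs, mismatched reference bases, and deletions. The example string "10A5^AC6" is made up for illustration (it is not the value from the question, which cannot be recovered from the pasted text), and `parse_md` is a hypothetical helper.

```python
import re

# An MD string alternates match counts with either a single mismatched
# reference base or a deletion written as ^ plus the deleted bases.
MD_TOKEN = re.compile(r"(\d+)|\^([A-Z]+)|([A-Z])")


def parse_md(md):
    """Split an MD tag value into (event, value) tuples."""
    events = []
    for matched, deleted, mismatch in MD_TOKEN.findall(md):
        if matched:
            events.append(("match", int(matched)))
        elif deleted:
            events.append(("deletion", deleted))
        else:
            events.append(("mismatch", mismatch))
    return events


# Illustrative string only: 10 matches, reference base A mismatched,
# 5 matches, reference bases AC deleted, 6 matches.
print(parse_md("10A5^AC6"))
```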

NM 23: NM is the edit distance to the reference, i.e. the number of mismatches plus inserted and deleted bases. Here it is 23, which equals the length of the deletion in the CIGAR, so the aligned bases themselves should contain no mismatches.

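AS 121: AS is the alignment score assigned by the aligner. The scoring scheme (match bonus, mismatch and gap penalties) is aligner-specific, so the value is mainly useful for comparing alignments produced by the same aligner; higher is better.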

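PS 11297: PS is not a position score or a mate score. Phasing tools such as WhatsHap (haplotag) use a PS tag on reads for the phase set the read was assigned to, in which case 11297 would be a phase set identifier, but you should confirm this against the documentation of the pipeline that produced the file.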

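1:2114:12111:13792: This is the end of the read name, which comes from the FASTQ header. In the Illumina naming convention the last four colon-separated fields are the flowcell lane, the tile number, and the x and y coordinates of the cluster within the tile, so this read came from lane 1, tile 2114, at x = 12111, y = 13792.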
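If you want to pull those fields apart programmatically, something like the sketch below works. It assumes the fragment really is the last four fields of an Illumina-style read name (lane, tile, x, y); `parse_illumina_tail` is a hypothetical helper, not part of any library.

```python
def parse_illumina_tail(tail):
    """Split the last four fields of an Illumina read name.

    Assumes the order lane:tile:x:y, as in the Illumina sequence
    identifier convention.
    """
    lane, tile, x, y = (int(value) for value in tail.split(":"))
    return {"lane": lane, "tile": tile, "x": x, "y": y}


print(parse_illumina_tail("1:2114:12111:13792"))
# {'lane': 1, 'tile': 2114, 'x': 12111, 'y': 13792}
```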

Please see the SAM specification that GenoMax linked above, and check that documentation first when you have questions about what your output means.


Thank you very much for helping with the details.
