Is there any post, manual, or blog that summarizes the attribute tags (optional fields) in the BAM file output by TopHat? How many attribute tags can a TopHat output BAM file have, and what does each tag mean?
I ran TopHat on HiSeq 2000 paired-end 2x100 bp strand-specific reads with the option --library-type fr-firststrand. The 13 tags in my output are listed below. Are there any other tags? Is my understanding of them correct?
AS:i: alignment score generated by the aligner.
CC:Z: reference name of the next hit; "=" for the same chromosome.
CP:i: leftmost coordinate of the next hit.
HI:i: query hit index, indicating that the alignment record is the i-th one stored in the SAM file.
MD:Z: string for mismatching positions.
NH:i: number of reported alignments that contain the query in the current record.
NM:i: edit distance to the reference, including ambiguous bases but excluding clipping.
XG:i: the number of gap extensions, for both read and reference gaps, in the alignment.
XM:i: the number of mismatches in the alignment.
XN:i: the number of ambiguous bases in the reference covering this alignment.
XO:i: the number of gap opens, for both read and reference gaps, in the alignment.
XS:Z: if either fr-firststrand or fr-secondstrand is specified, every read alignment will have an XS attribute tag as explained below.
YT:Z: a value of UU indicates the read was not part of a pair. A value of CP indicates the read was part of a pair and the pair aligned concordantly. A value of DP indicates the read was part of a pair and the pair aligned discordantly. A value of UP indicates the read was part of a pair but the pair failed to align either concordantly or discordantly.
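To check whether the file contains any tags beyond these 13, one option is to tally every optional field that actually occurs in the alignments. Below is a minimal Python sketch, assuming the records are supplied as SAM text lines (for example from `samtools view`). The function name `count_tags` and the two example records are made up for illustration and are not real TopHat output.

```python
from collections import Counter


def count_tags(sam_lines):
    """Count how often each optional field (TAG:TYPE) occurs in SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # The first 11 columns are mandatory; everything after is TAG:TYPE:VALUE.
        for field in fields[11:]:
            tag, typ, _value = field.split(":", 2)
            counts[f"{tag}:{typ}"] += 1
    return counts


# Two invented records, only to show the input format.
example_records = [
    "r1\t99\tchr1\t100\t50\t4M\t=\t200\t104\tACGT\tIIII\tAS:i:0\tNM:i:0\tNH:i:1\tYT:Z:CP",
    "r2\t0\tchr1\t300\t50\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:-5\tNM:i:1\tNH:i:2\tHI:i:0\tYT:Z:UU",
]

for key, n in sorted(count_tags(example_records).items()):
    print(f"{key}\t{n}")
```

On a real file, the same function could be given `sys.stdin` while piping in `samtools view accepted_hits.bam`, so that every tag present in the TopHat output is listed together with its count.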
The resources I have read:
https://ccb.jhu.edu/software/tophat/manual.shtml : the above description of "XS"
http://samtools.github.io/hts-specs/SAMv1.pdf : the above description of "AS", "CC", "CP", "HI", "MD", "NH", "NM"
http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#bowtie2-build-opt-fields-as : the above description of "XG", "XM", "XN", "XO", "YT"
I don't know whether these tags have the same meaning in Bowtie and TopHat output BAM files.
Thank you very much