Is there a post summarizing the attribute tags (optional fields) of a TopHat output BAM file?
9.5 years ago
zju.whw ▴ 70

Is there any post, manual, or blog that summarizes the attribute tags (optional fields) of a TopHat output BAM file? How many attribute tags can a TopHat output BAM file contain, and what does each tag mean?

I ran TopHat on HiSeq 2000 paired-end 2x100 bp strand-specific reads with the option --library-type fr-firststrand. The output contains the 13 tags listed below. Are there any other tags? Is my understanding of them correct?

  • AS:i: alignment score generated by aligner
  • CC:Z: reference name of the next hit; "=" for the same chromosome
  • CP:i: leftmost coordinate of the next hit
  • HI:i: query hit index, indicating the alignment record is the i-th one stored in SAM
  • MD:Z: string for mismatching positions.
  • NH:i: number of reported alignments that contain the query in the current record.
  • NM:i: edit distance to the reference, including ambiguous bases but excluding clipping
  • XG:i: the number of gap extensions, for both read and reference gaps, in the alignment.
  • XM:i: the number of mismatches in the alignment
  • XN:i: the number of ambiguous bases in the reference covering this alignment
  • XO:i: the number of gap opens, for both read and reference gaps, in the alignment.
  • XS:Z: if either fr-firststrand or fr-secondstrand is specified, every read alignment will have an XS attribute tag as explained below.
  • YT:Z: value of UU indicates the read was not part of a pair. Value of CP indicates the read was part of a pair and the pair aligned concordantly. Value of DP indicates the read was part of a pair and the pair aligned discordantly. Value of UP indicates the read was part of a pair but the pair failed to align either concordantly or discordantly.
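
One way to check whether a file carries tags beyond the ones listed above is to tally every TAG:TYPE pair that actually occurs in the alignment records. The following is only a minimal sketch: it assumes the records are available as SAM text lines (tab-separated, optional fields starting at column 12), the function name `tally_optional_tags` is hypothetical, and the two sample records are made up for illustration rather than taken from real TopHat output.

```python
from collections import Counter


def tally_optional_tags(sam_lines):
    """Count how often each TAG:TYPE pair occurs in SAM-formatted text lines."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # The first 11 columns are mandatory; everything after is TAG:TYPE:VALUE.
        for field in fields[11:]:
            tag, value_type, _value = field.split(":", 2)
            counts[f"{tag}:{value_type}"] += 1
    return counts


# Made-up records for illustration only (SEQ and QUAL are left as '*').
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t99\tchr1\t100\t50\t100M\t=\t300\t300\t*\t*"
    "\tAS:i:0\tNM:i:0\tYT:Z:UU\tXS:A:+\tNH:i:1",
    "read2\t163\tchr1\t500\t3\t100M\t=\t700\t300\t*\t*"
    "\tAS:i:-5\tNM:i:1\tYT:Z:UU\tNH:i:2\tCC:Z:chr2\tCP:i:1200\tHI:i:0",
]

counts = tally_optional_tags(example)
for tag, n in sorted(counts.items()):
    print(tag, n)
```

On a real file, the same function could be fed the text produced by `samtools view accepted_hits.bam`, for example by passing `sys.stdin` as `sam_lines`.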

The resource I have read: https://ccb.jhu.edu/software/tophat/manual.shtml

The above description of "XS": http://samtools.github.io/hts-specs/SAMv1.pdf

The above description of "AS", "CC", "CP", "HI", "MD", "NH", "NM": http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#bowtie2-build-opt-fields-as

The above description of "XG", "XM", "XN", "XO", "YT": I don't know whether these tags have the same meaning in Bowtie and TopHat output BAM files.

bam RNA-Seq Tophat • 5.4k views
9.5 years ago

Yes, the X? tags described on the bowtie2 page have the same meaning in tophat2. In reality, those are being produced by bowtie2, since that's what tophat uses.
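
Because these tags come from bowtie2, the YT:Z values documented there can be applied to TopHat2 records directly. A minimal sketch, assuming a record is available as one line of SAM text; the names `YT_MEANINGS` and `describe_yt` are hypothetical, and the sample record is made up for illustration:

```python
# Meanings of the YT:Z values as described in the question above.
YT_MEANINGS = {
    "UU": "read was not part of a pair",
    "CP": "read was part of a pair and the pair aligned concordantly",
    "DP": "read was part of a pair and the pair aligned discordantly",
    "UP": "read was part of a pair but the pair failed to align "
          "either concordantly or discordantly",
}


def describe_yt(sam_line):
    """Return a description of the YT:Z tag in one SAM record, or None if absent."""
    # Optional fields start at column 12 of a tab-separated SAM record.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith("YT:Z:"):
            code = field[len("YT:Z:"):]
            return YT_MEANINGS.get(code, f"unknown YT value: {code}")
    return None


# Made-up record for illustration only.
record = "read1\t99\tchr1\t100\t50\t100M\t=\t300\t300\t*\t*\tAS:i:0\tYT:Z:CP\tNH:i:1"
print(describe_yt(record))
```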


Thank you very much
