Bowtie output interpretation
5.5 years ago
bioz ▴ 20

Hi, I ran bowtie with this command:

bowtie --best -v 0 -a -n 0 --un unalig.fa database -f control.fa > out.bowtie

I got results like this:

5856 +  osa 0   TCGGACCAGGCTTCAATCCCT   IIIIIIIIIIIIIIIIIIIII   8   
5592 +  zma 0   TCGGACCAGGCTTCAATCCCT   IIIIIIIIIIIIIIIIIIIII   8   
5516 +  zma 0   TCGGACCAGGCTTCAATCCCT   IIIIIIIIIIIIIIIIIIIII   8   
8559 +  sbi 0   TCGGACCAGGCTTCAATCCCT   IIIIIIIIIIIIIIIIIIIII   8   
2016 +  bdi 0   TCGGACCAGGCTTCAATCCCT   IIIIIIIIIIIIIIIIIIIII   8

I need the reads that match the reference 100%. How do I find them? And what does the 7th column indicate?

RNA-Seq bowtie alignment next-gen • 1.1k views

The bowtie manual has this to say:

bowtie outputs one alignment per line. Each line is a collection of 8 fields separated by tabs; from left to right, the fields are:

  1. Name of read that aligned. Note that the [SAM specification] disallows whitespace in the read name. If the read name contains any whitespace characters, Bowtie 2 will truncate the name at the first whitespace character. This is similar to the behavior of other tools.
  2. Reference strand aligned to, + for forward strand, - for reverse
  3. Name of reference sequence where alignment occurs, or numeric ID if no name was provided
  4. 0-based offset into the forward reference strand where leftmost character of the alignment occurs
  5. Read sequence (reverse-complemented if orientation is -). If the read was in colorspace, then the sequence shown in this column is the sequence of decoded nucleotides, not the original colors. See the Colorspace alignment section for details about decoding. To display colors instead, use the --col-cseq option.
  6. ASCII-encoded read qualities (reversed if orientation is -). The encoded quality values are on the Phred scale and the encoding is ASCII-offset by 33 (ASCII char !). If the read was in colorspace, then the qualities shown in this column are the decoded qualities, not the original qualities. See the Colorspace alignment section for details about decoding. To display colors instead, use the --col-cqual option.
  7. If -M was specified and the prescribed ceiling was exceeded for this read, this column contains the value of the ceiling, indicating that at least that many valid alignments were found in addition to the one reported. Otherwise, this column contains the number of other instances where the same sequence aligned against the same reference characters as were aligned against in the reported alignment. This is not the number of other places the read aligns with the same number of mismatches. The number in this column is generally not a good proxy for that number (e.g., the number in this column may be '0' while the number of other alignments with the same number of mismatches might be large).
  8. Comma-separated list of mismatch descriptors. If there are no mismatches in the alignment, this field is empty. A single descriptor has the format offset:reference-base>read-base. The offset is expressed as a 0-based offset from the high-quality (5') end of the read.

Based on this, I would expect that the last column contains the information about mismatches. Would that make sense for your file?
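If it helps, here is a minimal Python sketch of how the field list above could be used to pick out perfect matches. It assumes the default tab-separated bowtie output described in the manual excerpt; the names `BowtieHit`, `parse_mismatches`, `parse_line` and `perfect_matches` are made up for illustration and are not part of bowtie. Two things to keep in mind: with `-v 0` bowtie should only report alignments with zero mismatches in the first place (reads without such an alignment go to the `--un` file), and in your sample the trailing 8 appears to be the seventh column, with the eighth (mismatch) column empty.

```python
from typing import NamedTuple


class BowtieHit(NamedTuple):
    """One line of default bowtie output (hypothetical helper type)."""
    read_name: str
    strand: str
    reference: str
    offset: int           # column 4: 0-based offset on the forward strand
    sequence: str
    qualities: str
    other_instances: int  # column 7, see the manual text for its meaning
    mismatches: list      # column 8, parsed into (offset, ref_base, read_base)


def parse_mismatches(field):
    """Parse 'offset:ref>read' descriptors such as '3:A>G,10:C>T'."""
    result = []
    for descriptor in filter(None, field.split(",")):
        offset, bases = descriptor.split(":")
        ref_base, read_base = bases.split(">")
        result.append((int(offset), ref_base, read_base))
    return result


def parse_line(line):
    """Split one tab-separated output line into a BowtieHit."""
    fields = line.rstrip("\n").split("\t")
    # The mismatch column is empty for perfect matches and can go missing
    # if trailing whitespace was stripped, so pad up to 8 fields.
    fields += [""] * (8 - len(fields))
    return BowtieHit(
        fields[0], fields[1], fields[2], int(fields[3]),
        fields[4], fields[5], int(fields[6]),
        parse_mismatches(fields[7].strip()),
    )


def perfect_matches(lines):
    """Yield only the hits whose mismatch column is empty."""
    for line in lines:
        if line.strip():
            hit = parse_line(line)
            if not hit.mismatches:
                yield hit
```

You could then loop over `perfect_matches(open("out.bowtie"))` and print `hit.read_name` and `hit.reference` for each hit. If the file was copied through something that turned tabs into spaces, the split on tabs would need adjusting.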


I added (code) markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

See also http://bowtie-bio.sourceforge.net/manual.shtml#default-bowtie-output
