Explanation Of Novoalign Out.Sam
11.6 years ago
Raghav ▴ 100

Hello everyone,

Could anyone please help me interpret the output of Novoalign V2.07? I generated it with a command like:

$ novoalign -c 4 -f /storage/home/cdac/raghav/bowtie2-2.1.0/SRR681004.fastq -d TAIR -o SAM > out.sam

I am unable to find the position at which each query sequence aligned to my reference genome.

@HD VN:1.0 SO:unsorted

@PG ID:novoalign PN:novoalign VN:V3.00.03 CL:novoalign -c 4 -f /storage/home/cdac/raghav/bowtie2-2.1.0/SRR681004.fastq -d TAIR -o SAM> out.sam

@SQ SN:1 LN:30427671 AS:TAIR

@SQ SN:2 LN:19698289 AS:TAIR

@SQ SN:3 LN:23459830 AS:TAIR

@SQ SN:4 LN:18585056 AS:TAIR

@SQ SN:5 LN:26975502 AS:TAIR

@SQ SN:mitochondria LN:366924 AS:TAIR

@SQ SN:chloroplast LN:154478 AS:TAIR

SRR681004.1 0 5 608601 70 48M52S * 0 0 ATGTGAGCAAAGATTCTAATCTCATTTGTTCTTTTGTTGATTCATTGATTATGTGAGCAAAGATTCCAATCTATATATTTTGNANNAACAAGATAGCTTT BBBDFEFFHHHHHIIJJJIJIJIJJJJJIIJJJJJIIII@FHHHIIHFHICCCFFFFFHHHHHHIGHIJIFIHIGIFJIJJI!2!!1:CGIJIHGIFGII PG:Z:novoalign AS:i:352 UQ:i:352 NM:i:0 MD:Z:48

SRR681004.2 16 5 24831243 70 49M51S * 0 0 ACGAAGAAACGTTTAGAAGACAGAGACGACGGCTCATGAAGCGTACAAATTTAGTCAAGACTCTGCATTGTCAAGCTAGCTTCTTGTTAAAGACGATTTT HHGIJJJJJIGHFGIJIIIIIHEIIJJJJIIIIGFHIHHHHHFFFFFCCCIHHJHGJIIGIJJJJJJJJIIHJJJJJJJJJJIJJJJHHHHHFFFFFCCC PG:Z:novoalign AS:i:340 UQ:i:340 NM:i:0 MD:Z:49

SRR681004.6 4 * 0 0 * * 0 0 AATATTGTGATTATGATATTTTAGTTTTAAACTAAACATTTAGTTTATGAAAGGGAAAGAAGCCGTAGCTAGTCGTAGAAGTCAGTGAAGGGGAACAAGC @;@DFFFDHHFF?GDGIGGIII@GFHICHCADDHIGGGGHIBAGHHHII>@@@D7DDDFHBFDEDHGGFF@EHGHD9DA4?DD>DGHFCGDGIB?<FEG2 PG:Z:novoalign ZS:Z:R NH:i:3

SRR681004.7 4 * 0 0 * * 0 0 ATTATGAACCTTCTTATAGAGTTACCTGAGTTTTCTTTAGGTTTAGACCGATTATGAACCTTCTTATAGAGTTACCTGAGTNNNNNNNNNNTTTAGACCA 11144222==DDFFGF4<<:?,<3CEAB??++<+1::C?:CCF9F9*0*)BCCFFFFFHHHHHIJIJIIIJJJJJJJJHJH!!!!!!!!!!00?DGHIII PG:Z:novoalign ZS:Z:NM

SRR681004.8 4 * 0 0 * * 0 0 AATCTGGACCTTATTCCGTTCACAGAATCTTCGGTGGCAACATGCTAGCTGTTTGAAGACCCGAGAAGGAGGAGGAGGAGGGNNNNNAGATGGGGGACGT @@CFFFD?HFFBFHGGHGGEEF>HHIIIICCEGHEDGGGH?HIHFIC*BF::?D=DD??BAF??@FHHFIIEEBG11?BGG)!!!!!--5;C@=A=65=> PG:Z:novoalign ZS:Z:NM

SRR681004.3 0 3 18548550 70 49M51S * 0 0 TAGTTATAGTCCANATATATAATTAATAATCTTATGTGCACCAAAATGGTTAGTTATAGTCCAAATATATAATTAAAATTACCTTCAACAGCTCATAGAT =??DDFFFHHHHH!2<CGGIIIIJIIJJJIHHJIJIJJJJHIIJJIIIJFB@BDFFFFHHHHHJIIHIGJHHJJGJJJIJHHHHIIJJJJJJIIJHHIII PG:Z:novoalign AS:i:346 UQ:i:346 NM:i:1 MD:Z:13A35

SRR681004.10 0 3 19960991 70 50M50S * 0 0 ACACTGTGACAGAAGACATAAGAGTCATTGGAGCCAAAAATCTCTATTTTAGGACTCTTTAAGAGTCCTCTAACCTTCTACGNANNCGGGAAGGGAAAAG BCCFFFDFDHHHGIGHIGIJIDCHFHIIIIJHHIIAFHJJJJJGIJJJJJ?@BDDEFFHHFFHIJGHIHHJIIGGHJGHCHH!1!!12?F??DHICHICH PG:Z:novoalign AS:i:358 UQ:i:358 NM:i:0 MD:Z:50

SRR681004.11 16 3 21154934 70 50S50M * 0 0 ACCAAGGGCTTCTANNANGAAGAAGCTAGCTTATTGGGCTCAATGTGAGTTCAGCAATCCCAACGGCAATCGCCTCACCCTTGGTAGTCATAAGAACGAC IIGHFD;HHFFC:1!!1!GIHHGGHHEHBJHDHIGDF<DHFHFFDBBB@?4D@@3C=9IIIHGHB>HEHBIIDGIHFIGF<BGHFHEHDHHHD?DDD?@@ PG:Z:novoalign AS:i:358 UQ:i:358 NM:i:0 MD:Z:50

SRR681004.12 0 5 18183313 70 50M50S * 0 0 AGGATCATCATTGCAACGAACGATATCTCCTCATTGAAGGGTCTCGTTCAGACTTATCGATGTCCCGTCCAAGTATAAAGCTNGNNTCATGAAATCCACC CCCFFFFFHHHHHJJJJJJJJIJJJJJJIJJJJJJJHJJJJFHHIEHIJGCCCFFFFFHHHHHJJJIIJJJJJJJIIIJJJJ!1!!10?FGHIIIJIJJJ PG:Z:novoalign AS:i:358 UQ:i:358 NM:i:0 MD:Z:50

SRR681004.9 4 * 0 0 * * 0 0 ACTCAATCATACACATGACAACAAGTCATATTCGACTCTAAAACACTAACAAGCAAGAAGAAGGTTGGTTAGTGTTTTGGAGNNNNNTATGACTTGATCT CCCFFFFFHHHHHJJJJJIJJJJJJIJJJJJJJJJJJJJIIJJJJJIJJJCCCFFFFFHHHHHJJIJJJJHIHIJIIJIJFH!!!!!00?FHIJJIHGHH PG:Z:novoalign ZS:Z:NM

SRR681004.13 4 * 0 0 * * 0 0 AATTTGAGTATGCATCCCGTCTGTGCGATGAGGCCAGTACCCTTGGTTGTCACCACTAATTAGCTATGTTTGACAAACGGTAAGATCTTCGCCTTCAAAT @BBFFFEFHHHHHJJIJJJJJJJJJJJJJJJJJJJJJJJJJJIJJJJHIIB@CFFFEFHFHHHJIJJJJJJIHHJJJJJJJJJJIHHHIDHHIHIIIIIH PG:Z:novoalign ZS:Z:R NH:i:2

SRR681004.15 4 * 0 0 * * 0 0 CATCTTGCAATCGGAAGCTAGCTTGTGCTAAACTTGATCTTTTGTACTTGAGCTTATCTGTCAGAAGCTAGCTTAAAGACAAAACAACAGGGCATGGACC CCCFFFFFHHHHHJIIJIJIJIJJIHJIJFHIIIJJGIIJJJJIIIJJJIBBCFFFFFHHHHHJJJJJIIIJIIJJJJGIIIJIJJIJJJJJJJJIJJII PG:Z:novoalign ZS:Z:NM

SRR681004.14 0 4 1557383 70 48M52S * 0 0 ATTACACAAATAGTTCAAAAAAGAAAAGACAGCTCGACTTCTGACTGAAAATTACACAAATAGTTCAAAATTTTGTAGAGTAGTCCTCTAGTTTTGAGAA BCCFFFFFHHHHHJJJJJJJJJIJJJJJJJJJIIJJIJIJJJJIJIIIIGCCCFFFFFHHHHHJIJJJJJJIJJHIIJJJHIJJJJHJJIJJJIJIGIIJ PG:Z:novoalign AS:i:340 UQ:i:340 NM:i:0 MD:Z:48

SRR681004.4 16 1 5866969 6 49S51M * 0 0 GGGGGGGTTTATTTCTCTTGATCATGATCTATATTTCTCTTCTTTATTGCAAAGACGAAAAGAATGTTAATGGTTTAATGGTTAGAAATTCGGTCTTTAC IHF:?39HIGIJJIIHIIIJJJJJJJJIJJJJIJJJJHHHHHFFFFFCCCJIHIJIIJJJIIJIHJIJJIIJJJJJJJJJJJJJJJJHGHHHFFFFFCCC PG:Z:novoalign AS:i:334 UQ:i:334 NM:i:0 MD:Z:51

Thank you in advance.

sam • 3.9k views
11.6 years ago
Raghav ▴ 100

Hello,

In SAM output, each alignment line consists of 11 mandatory tab-separated fields followed by optional fields. From left to right, the fields are:

1. Name of the read that aligned.

2. Sum of all applicable flags. Flags relevant to Bowtie are:

  1 The read is one of a pair
  2 The alignment is one end of a proper paired-end alignment
  4 The read has no reported alignments
  8 The read is one of a pair and its mate has no reported alignments
  16 The alignment is to the reverse reference strand
  32 The other mate in the paired-end alignment is aligned to the reverse reference strand
  64 The read is mate 1 in a pair
  128 The read is mate 2 in a pair

Thus, an unpaired read that aligns to the reverse reference strand will have flag 16. A paired-end read that aligns to the reverse strand as mate 1 of a proper pair will have flag 83 (= 64 + 16 + 2 + 1).

3. Name of the reference sequence where the alignment occurs.

4. 1-based offset into the forward reference strand where the leftmost character of the alignment occurs. Together, fields 3 and 4 give the alignment position: for SRR681004.1 in the question, that is reference sequence 5 at position 608601.

5. Mapping quality.

6. CIGAR string representation of the alignment.

7. Name of the reference sequence where the mate's alignment occurs. Set to = if the mate's reference sequence is the same as this alignment's, or * if there is no mate.

8. 1-based offset into the forward reference strand where the leftmost character of the mate's alignment occurs. Offset is 0 if there is no mate.

9. Inferred fragment length. Size is negative if the mate's alignment occurs upstream of this alignment. Size is 0 if the mates did not align concordantly. However, size is non-0 if the mates aligned discordantly to the same chromosome.

10. Read sequence (reverse-complemented if aligned to the reverse strand).

11. ASCII-encoded read qualities (reversed if the read aligned to the reverse strand). The encoded quality values are on the Phred quality scale and the encoding is ASCII-offset by 33 (ASCII char !), similarly to a FASTQ file.

12. Optional fields. Fields are tab-separated. Bowtie 2 outputs zero or more of these optional fields for each alignment, depending on the type of the alignment:

AS:i:<n> Alignment score. Can be negative. Can be greater than 0 in --local mode (but not in --end-to-end mode). Only present if SAM record is for an aligned read.

XS:i:<n> Alignment score for second-best alignment. Can be negative. Can be greater than 0 in --local mode (but not in --end-to-end mode). Only present if the SAM record is for an aligned read and more than one alignment was found for the read.

YS:i:<n> Alignment score for opposite mate in the paired-end alignment. Only present if the SAM record is for a read that aligned as part of a paired-end alignment.

XN:i:<n> The number of ambiguous bases in the reference covering this alignment. Only present if SAM record is for an aligned read.

XM:i:<n> The number of mismatches in the alignment. Only present if SAM record is for an aligned read.

XO:i:<n> The number of gap opens, for both read and reference gaps, in the alignment. Only present if SAM record is for an aligned read.

XG:i:<n> The number of gap extensions, for both read and reference gaps, in the alignment. Only present if SAM record is for an aligned read.

NM:i:<n> The edit distance; that is, the minimal number of one-nucleotide edits (substitutions, insertions and deletions) needed to transform the read string into the reference string. Only present if SAM record is for an aligned read.

YF:Z: String indicating reason why the read was filtered out. See also: Filtering. Only appears for reads that were filtered out.

YT:Z: Value of UU indicates the read was not part of a pair. Value of CP indicates the read was part of a pair and the pair aligned concordantly. Value of DP indicates the read was part of a pair and the pair aligned discordantly. Value of UP indicates the read was part of a pair but the pair failed to align either concordantly or discordantly.

MD:Z: A string representation of the mismatched reference bases in the alignment. See SAM format specification for details. Only present if SAM record is for an aligned read.
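To tie the field descriptions back to the original question, here is a rough Python sketch of how one might pull the alignment position out of a SAM line. It is only an illustration, not part of Novoalign or Bowtie: the names `FLAG_BITS`, `decode_flag`, `reference_span` and `describe` are made up for this example, and it assumes only the first six fields are needed.

```python
import re

# Flag bits from the table above; the label strings are illustrative names only.
FLAG_BITS = {
    1: "paired",
    2: "proper_pair",
    4: "unmapped",
    8: "mate_unmapped",
    16: "reverse_strand",
    32: "mate_reverse_strand",
    64: "mate_1",
    128: "mate_2",
}


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM flag value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def reference_span(pos, cigar):
    """Return (start, end) on the reference, 1-based and inclusive.

    Only M, D, N, = and X operations consume reference bases;
    soft clips (S) and insertions (I) do not.
    """
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    length = sum(int(n) for n, op in ops if op in "MDN=X")
    return pos, pos + length - 1


def describe(record):
    """Summarise where one SAM alignment line maps (hypothetical helper)."""
    # Real SAM files are tab-separated; split() also copes with the
    # space-separated lines as they appear in the post.
    fields = record.split()
    name, flag, rname, pos, mapq, cigar = fields[:6]
    flag, pos = int(flag), int(pos)
    if flag & 4:  # bit 4: the read has no reported alignments
        return {"read": name, "mapped": False}
    start, end = reference_span(pos, cigar)
    return {
        "read": name,
        "mapped": True,
        "reference": rname,
        "start": start,
        "end": end,
        "strand": "-" if flag & 16 else "+",
        "mapq": int(mapq),
    }
```

Called on the first six fields of SRR681004.1 from the question (flag 0, reference 5, position 608601, CIGAR 48M52S), `describe` reports a forward-strand alignment on reference 5 covering positions 608601 to 608648; the 52 soft-clipped bases do not count towards the span. A record with flag 4, such as SRR681004.6, comes back as unmapped.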
