Bowtie Output "Not Enough Fields"
11.9 years ago
GPR ▴ 390

Hello,

I ran Bowtie with the following command line:

bowtie -p 16 -q -n 3 -k 1 -m 1 --best --strata -S BowtieIndex *.fastq >& output.sam &

When running CleanSam.jar to remove overhanging reads, I get the following error:

Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields

I do not get this error when I align my reads with BWA or TopHat. Any suggestions on how to fix this?

Thanks, G.


Have you tried skimming through some of your output.sam file to see if something looks blatantly wrong?
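If the file is too large to skim by eye, a small filter can do the looking for you. This is only a sketch: it assumes the usual SAM layout of tab-separated alignment lines with at least 11 mandatory fields and header lines starting with "@". The function name find_short_sam_lines is made up for this example.

```shell
# List non-header lines of a SAM file that have fewer than 11 tab-separated fields.
# Output format: "<line number>: <field count> fields: <line content>"
find_short_sam_lines() {
    awk -F'\t' '!/^@/ && NF < 11 { print NR ": " NF " fields: " $0 }' "$1"
}

# Usage (file name taken from the question):
# find_short_sam_lines output.sam | head
```

Any line it prints is a candidate for the "Not enough fields" error, and the line content usually shows where the stray text came from.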


I did actually, nothing that caught my eye. This is a snapshot:

HWI-ST974:67:C0545ACXX:2:1101:14321:25195 0 chr2 242170185 255 76M * 0 0 CCATGATTTTGCGAATGGCTTTGCCGCGGGCACCAATGATGCGGGCGTGAACGCGGTGGTCCAGCGGGACGTCCTC CCCFFFFFHHHHHIJJIGIIJFIHIIIIGIIJIEHIFHHHHHFFDBB;<B?C?BDD8B@-:@CDCB5;@;9;5?@@ XA:i:0 MD:Z:76 NM:i:0


When I try to convert the same SAM file to BAM with

samtools view -bT genome.fa input.sam > output.bam

I get the following error message:

reference 'HWI-ST974:67:C0545ACXX:2:1101:11263:2144' is recognized as '*'.
Parse error at line 4698: unmatched CIGAR operation


The "something is recognized as '*'" error in samtools occurs when the reference genome you supply (in your case, genome.fa) does not include all of the reference names that the SAM file (input.sam) lists.
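To see which names are involved, you can compare the reference column of the SAM file against the sequence names in the FASTA. A rough sketch follows, assuming the reference name sits in the third tab-separated column of each alignment line and that FASTA names run from ">" to the first whitespace. The function name missing_references is made up for this example.

```shell
# Print reference names used in a SAM file (column 3) that are absent from a FASTA file.
# $1 = SAM file, $2 = FASTA file. Header lines and unmapped reads ('*') are skipped.
missing_references() {
    sam="$1"
    fasta="$2"
    # Names in the FASTA: text after '>' up to the first whitespace.
    fasta_names=$(grep '^>' "$fasta" | sed 's/^>//' | awk '{ print $1 }' | sort -u)
    # Distinct reference names in the SAM, checked one by one against the FASTA names.
    awk -F'\t' '!/^@/ && NF >= 3 && $3 != "*" { print $3 }' "$sam" | sort -u |
        while IFS= read -r name; do
            printf '%s\n' "$fasta_names" | grep -qxF -- "$name" || printf '%s\n' "$name"
        done
}

# Usage (file names taken from the thread):
# missing_references input.sam genome.fa
```

If it prints read-style names rather than chromosome names, the third column does not hold what samtools expects there.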

It looks like you mistakenly aligned the reads to another set of reads, as the reference-name "HWI-ST974:67:C0545ACXX:2:1101:11263:2144" is a standard name for a read and not a chromosome or anything else.

Looking at your original command, you missed inserting the reference. Here's the fixed command:

bowtie -p 16 -q -n 3 -k 1 -m 1 --best --strata -S genome.fa -1 first_reads.fastq -2 second_reads.fastq > output.sam &

If I'm not mistaken, --best, --strata and -k 1 don't really make much sense together: -k 1 reports just one alignment, while --strata tries to report all alignments that fall into the best stratum. What are you trying to achieve?
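One more thing worth ruling out, offered as a guess rather than a confirmed diagnosis: the original command redirects with ">&", which in bash and csh sends stderr to the same file as stdout. Bowtie prints its run summary on stderr, so those lines may end up inside output.sam, where a strict parser would see them as records with too few fields. The sketch below shows the effect with a made-up stand-in function (fake_aligner is not bowtie, it only mimics the two streams).

```shell
# Hypothetical stand-in for an aligner: one SAM-like record on stdout,
# one summary line on stderr.
fake_aligner() {
    printf 'r1\t0\tchr2\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf '# reads processed: 1\n' >&2
}

# Line count of the output when both streams go to one file
# ('> file 2>&1' has the same effect as '>& file' in bash).
count_combined() {
    fake_aligner > "$1" 2>&1
    wc -l < "$1" | tr -d ' '
}

# Line count of the output when stderr is sent to a separate log file.
count_separate() {
    fake_aligner > "$1" 2> "$2"
    wc -l < "$1" | tr -d ' '
}
```

With the real aligner, the equivalent would be to end the command with "> output.sam 2> bowtie.log &", where bowtie.log is just an illustrative file name.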


Incidentally, this usage does make sense. The key is -m 1 (which supersedes -k 1): reads with more than one match are not reported. --best and --strata control the matching criteria: with --strata, an alternate match counts only if it is in the same alignment stratum.

The relevant snippet from the bowtie documentation is: 'Intuitively, the -m option, when combined with the --best and --strata options, guarantees a principled, though weaker form of "uniqueness." A stronger form of uniqueness is enforced when -m is specified but --best and --strata are not.'
