Run Crossbow1.1.1 On E.Coli Small Example
13.7 years ago
Mcmahanl ▴ 300

Hi crossbow1.1.1 users,

I hope you can help me run crossbow on the e.coli (small) example described on the crossbow web page.

I downloaded and installed crossbow 1.1.1 and sratoolkit.2.0rc5-mac64 on my Mac desktop running OS X 10.6.6. I followed the directions for running crossbow on the e.coli (small) example on a single computer via the command line at http://bowtie-bio.sourceforge.net/crossbow/manual.shtml#single-computer, but I was unable to get it to run.

------------------------------------------------------------------------------------------
Run crossbow on e.coli example using small.manifest file:
------------------------------------------------------------------------------------------
LOCATION WHERE I RUN CROSSBOW:
mcmahan$ pwd
/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/example/e_coli

mcmahan$ ls
full.manifest    small.manifest

RUN CROSSBOW WITH FOLLOWING SPECIFIED OPTIONS (I followed the web crossbow manual):
mcmahan$ $CROSSBOW_HOME/cb_local \
--input=small.manifest \
--preprocess \
--reference=$CROSSBOW_REFS/e_coli \
--output=output_small \
--all-haploids \
--cpus=1 \
--bowtie=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/bin/mac64/bowtie \
--soapsnp=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/bin/mac64/soapsnp

I am not sure whether the options I used to run crossbow are correct. Any suggestions on what I should try so that I can successfully run the crossbow e.coli example on a single computer via the command line? I got an error message when I ran crossbow on the e.coli (small) example on a single computer via the command line on my Mac desktop running OS X 10.6.6.

Thanks, Linda

next-gen sequencing • 3.4k views

Linda, what problems are you having? Are you getting an error message or are the results not as you expect? Thanks for all the detail you provided; we'll also need some more information on the problem to be able to help. Thanks.


It is recommended to use novoalign/stampy/bwa for mapping and GATK/samtools for SNP calling.


lh3, thanks. I have been looking into bwa and samtools. - mcmahanl


Thanks to the help from Shantanu K. Karve at SEQanswers, I was able to run crossbow1.1.1 on the e.coli sample. For details, see Answer 2 below.

13.7 years ago
Darked89 4.7k

Crossbow fails at the same stage on my Ubuntu 10.04 laptop, but I am also running low on disk space and am getting different sizes of extracted files:

-- Map counters --
Short read preprocessor Read data fetched   930624152
Short read preprocessor Read data fetched (un-SRAed)    272420864
Short read preprocessor Read data fetched (uncompressed)    272420864
Short read preprocessor Read data pushed to local filesystem (compressed)   65549172
Short read preprocessor Read files pushed to local filesystem   205712761
Short read preprocessor Unpaired URLs   1
Short read preprocessor Unpaired reads  1723518
Short read preprocessor Warnings    0
Short read preprocessor Read data fetched   930624152
Short read preprocessor Read data fetched (un-SRAed)    272420864
Short read preprocessor Read data fetched (uncompressed)    272420864
Short read preprocessor Read data pushed to local filesystem (compressed)   65549172
Short read preprocessor Read files pushed to local filesystem   205712761
Short read preprocessor Unpaired URLs   1
Short read preprocessor Unpaired reads  1723518
Short read preprocessor Warnings    0
Removing /tmp/crossbow-c1ICtsG_kE/crossbow/intermediate/8464/preproc.map.pre (to keep, specify --keep-all)

My guess is that it may simply run out of space in /tmp while unpacking and fail on the next step. Will check it on a larger computer tomorrow.
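A quick way to test the disk-space guess before rerunning is to ask df how much room is left under /tmp. This is only a minimal sketch: the function names `avail_kb` and `has_space` are made up for illustration, and the threshold you pass in is your own estimate, not a number from the crossbow manual.

```shell
#!/bin/sh
# Print the free space, in KB, on the filesystem that holds directory $1.
# -P forces the POSIX one-line-per-filesystem format, -k reports 1024-byte blocks.
# Assumes the filesystem name contains no spaces (column 4 is "Available").
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 KB are free under directory $1.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}
```

For example, `has_space /tmp 2000000 || echo "less than about 2 GB free in /tmp"` would warn before a run; the 2 GB figure is an arbitrary placeholder.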

EDIT 1 Managed to reproduce your exact error messages on Fedora 12 with plenty of space (after run):

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdc1              19G  8.0G  9.5G  46% /

It does create gzipped fastq files in

/tmp/crossbow-XnS4mxpxiu/crossbow/intermediate/12351/preproc/

I ran the first command that failed by hand (line breaks for clarity):

perl /users/dk/soft/crossbow_1.1.1/Align.pl \
--bowtie /users/dk/soft/crossbow_1.1.1/bin/linux64/bowtie \
--discard-reads=0 \
--index-local=/users/dk/soft/crossbow_1.1.1/CROSSBOW_REFS/e_coli/index/index \
--partlen=1000000 \
--qual=phred33 \
--counters /users/dk/soft/crossbow_1.1.1/example/e_coli/output_small_counters/counters.txt \
--truncate=0 \
-- \
--partition 1000000 \
--mm -t \
--hadoopout \
--startverbose -M 1

This also failed. One thing that looks like a bug is the orphan -- option. You also had it in your error output. To be continued...

EDIT 2

Looks like CrossbowIface.pm is responsible for producing this garbled command line.

The -- simply makes other options invisible to Align.pl, but the whole command seems to be a broken mixture of kosher Align.pl options and some borked names. Search for truncateDiscard in CrossbowIface.pm.
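To illustrate what a bare -- does to an argument list, here is a small sketch of the usual convention: everything before the first -- is treated as options, and everything after it is left alone. It is not a reimplementation of Align.pl's parser, and the function name `split_at_dashdash` is made up for this example.

```shell
#!/bin/sh
# Walk an argument list and label each argument according to whether it
# comes before or after the first bare "--".
split_at_dashdash() {
    seen=0
    for arg in "$@"; do
        # The first bare "--" flips the switch and is itself consumed.
        if [ "$seen" -eq 0 ] && [ "$arg" = "--" ]; then
            seen=1
            continue
        fi
        if [ "$seen" -eq 0 ]; then
            echo "parsed: $arg"
        else
            echo "rest: $arg"
        fi
    done
}
```

Calling `split_at_dashdash --truncate=0 -- --partition 1000000` labels --truncate=0 as parsed and both --partition and 1000000 as rest, which is the sense in which the options after -- are invisible to the script's own option handling.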

13.7 years ago
Mcmahanl ▴ 300

(1) The following is the email that Shantanu K. Karve at SEQanswers sent me on how to run crossbow1.1.1:

Hi Linda,

Let me first apologize for the delay; I've been under the weather. I'm just running it in local mode myself, and your problem is the one I faced when running it on a Hadoop cluster as well, so I can help you. Basically, Stage 1 (the preprocess step) leaves a spurious file in the preproc directory. The Align step doesn't recognise the contents of that file, so it quits. The solution is to delete that spurious file and get the process to pick up from that point again.

In more detail, the cue is this error message:

Bad number of read tokens ; expected 3 or 5:

You'll see that in your case, the Align step is trying to process intermediate/10720/preproc/map-10742

That is the spurious file.

So I run with the --keep-all option so that all intermediates are kept. After it fails, I delete that spurious file. That leaves the good SRA*.out.gz files, 18 of them, in the preproc directory.

Then, because of the --keep-all option, the run-specific perl scripts are kept in the /tmp/crossbowXXXX/invoke.scripts directory, so they are available for me to edit.

I just edit the cb.* one that's pertinent to a local run (not the ones for Hadoop on AMZN). The edit consists of taking out the code that runs the first step (since that has already been run), so that the script runs things from the Align step, Step 2. So when I've edited the cb_* script, this is how it starts:

#!/bin/sh
phase=1
phase=`expr $phase + 1`
perl /home/karve/crossbow-1.1.1/MapWrap.pl \
    --stage $phase \
    --num-stages 4 \
    --name Align \
    --input /home/karve/crossbow-1.1.1/tmp1/preproc \
    --output /home/karve/crossbow-1.1.1/tmp1/align \
    --counters /home/karve/crossbow-1.1.1/output_small_counters/counters.txt \
    --messages cb.local.$$.out \
    --keep-all \
    --force \
    --mappers 2 -- \

etc etc.. Hope that's clear.

-Shantanu

So I just let the run exit, then edit the script that runs all the steps, cb_*****.sh.

(2) I also tried the following, and it works:

(2.1) Run crossbow with these options:

$CROSSBOW_HOME/cb_local \
--input=small.manifest \
--preprocess \
--reference=$CROSSBOW_REFS/e_coli \
--output=output_small \
--all-haploids \
--cpus=1 \
--bowtie=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/bin/mac64/bowtie \
--soapsnp=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/bin/mac64/soapsnp \
--preprocess-output=preprocess_output \
--keep-all

(2.2) After it fails, I delete the spurious file map_XXXXX in the preprocess_output directory. That leaves the good SRA*.out.gz files (18 of them) in the preprocess_output directory. Then I rerun crossbow on those 18 good SRA*.out.gz files with the following options:

$CROSSBOW_HOME/cb_local \
--input=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/example/e_coli/preprocess_output \
--reference=$CROSSBOW_REFS/e_coli \
--output=output_small \
--all-haploids \
--cpus=1 \
--bowtie=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/bin/mac64/bowtie \
--soapsnp=/Users/mcmahan/mcmahan/downloadApplications/crossbow/crossbow-1.1.1/bin/mac64/soapsnp
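
The cleanup between (2.1) and (2.2) can be sketched as a small shell helper. It rests on an assumption: that the spurious file is the only one in the directory whose name starts with map- or map_ (the email mentions map-10742, and I saw map_XXXXX). The function names `list_spurious` and `clean_preproc` are made up for this example, so check what it lists before letting it delete anything.

```shell
#!/bin/sh
# List candidate spurious files in preprocess output directory $1.
# Assumption: leftovers are named map-<digits> or map_<something>.
list_spurious() {
    find "$1" -maxdepth 1 -type f -name 'map[-_]*'
}

# Delete the candidates and report how many *.out.gz read files remain.
clean_preproc() {
    list_spurious "$1" | while IFS= read -r f; do
        echo "removing $f"
        rm -- "$f"
    done
    # tr strips the padding that some wc implementations add.
    remaining=$(find "$1" -maxdepth 1 -type f -name '*.out.gz' | wc -l | tr -d ' ')
    echo "read files remaining: $remaining"
}
```

Running `list_spurious preprocess_output` first shows what would be removed; `clean_preproc preprocess_output` then removes it, and for the small example the remaining count should match the 18 files mentioned above.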