Having trouble using wildcard
5.7 years ago
nattzy94 ▴ 60

I am trying to use the bbsplit function for a number of files. I have done:

for i in {17..34}; do
 bash bbmap/bbsplit.sh \
   in1=./temp_expt/Sample_MBM1${i}/MBM1${i}_R1_001.fastq \
   in2=./temp_expt/Sample_MBM1${i}/MBM1${i}_R2_001.fastq \
   ref=./MG1655.fasta,./MGH78578.fasta,./GAPDH.fasta \
   basename=out_%.fq outu1=clean1.fq outu2=clean2.fq ambig2=toss
done

However, I keep running into a cannot find file error like this:

Can't find file ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq
shell loop command line • 1.9k views

What happens when you type

  ls ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq

? I reckon those files don't exist.


I get './temp_expt/Sample_MBM118/MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq'

NB: I realize the previous command is missing a '1' and should be MBM1${i} but the problem still persists.


Well, this is confusing. Your code should throw a different error.

I copied and pasted the code from the OP:

for i in {17..34}; do bash bbmap/bbsplit.sh in1=./temp_expt/Sample_MBM1${i}/MBM${i}_R1_001.fastq in2=./temp_expt/Sample_MBM1${i}/MBM1${i}_R2_001.fastq ref=./MG1655.fasta,./MGH78578.fasta,./GAPDH.fasta basename=out_%.fq outu1=clean1.fq outu2=clean2.fq ambig2=toss; done

The R1 path (in1) uses MBM${i}_R1_001.fastq, while the R2 path (in2) uses MBM1${i}_R2_001.fastq.

If your variable is 17, R1 is MBM17_R1_001.fastq and R2 is MBM117_R2_001.fastq.

Is this a typo?

According to one of the OP's replies, the file ./temp_expt/Sample_MBM118/MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq exists. This means the R1 pattern for this file should be MBM1${i}_R1_001.fastq, not MBM${i}_R1_001.fastq. Either the variable or the file name needs to be changed.


Yes, the original command had a typo but I've corrected it and the file still cannot be found. I've edited the initial post to reflect this.


I think the OP's code still has a problem (assuming that it has been updated):

in1=./temp_expt/Sample_MBM1${i}/MBM1${i}_R1_001.fastq would look only for MBM118_R1_001.fastq in the Sample_MBM118 folder, not for MBM18*_R1_001.fastq.

The error should be something like "Can't find file ./temp_expt/Sample_MBM118/MBM118*_R1_001.fastq", not "Can't find file ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq", as MBM1 is fixed.


You can check whether the files exist with the following code. Please adjust the path to match your setup:

for i in {17..34}; do if [ -e ./MBM1${i}_R1_001.fastq ]; then echo "exists";fi;done

@ nattzy94
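Since the real file names carry a tag sequence after the sample ID, an existence check probably needs a wildcard too. Here is a rough sketch, assuming the directory layout from the question; `count_matches` is a made-up helper name, and it relies on bash's `compgen -G` builtin to list the paths that match a pattern.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print how many existing paths match a glob pattern.
count_matches() {
  local pattern=$1 n=0 path
  # compgen -G prints one matching path per line (nothing if no match)
  while IFS= read -r path; do
    n=$((n + 1))
  done < <(compgen -G "$pattern")
  echo "$n"
}

# Report the number of R1 files found for each sample (paths assumed from the question).
for i in {17..34}; do
  n=$(count_matches "./temp_expt/Sample_MBM1${i}/MBM1${i}-*_R1_001.fastq")
  echo "MBM1${i}: ${n} R1 file(s)"
done
```

A count of 0 points to a pattern that does not match, and a count above 1 means the wildcard is ambiguous for that sample.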


FYI, I've restructured the command from a one-liner just to make it a little easier for people to debug since it was quite long.

The error "Can't find file ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq" suggests that the shell is not expanding the wildcard and that the program is looking for a literal *.

Do you have any more information or other error messages to go on? At the moment the code and the error don't seem to correspond.
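For anyone who wants to see how a literal * can end up in the file name, here is a small sketch using a scratch directory (the file is an empty stand-in named like the OP's). It assumes bash's default settings, i.e. `nullglob` and `failglob` are off. It also shows that gluing `in1=` onto a pattern makes the whole word the pattern, so it only matches if a directory literally named `in1=` (or similar) exists.

```shell
#!/usr/bin/env bash

# Scratch directory holding one empty file named like the OP's reads.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/Sample_MBM118"
touch "$demo_dir/Sample_MBM118/MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq"

# Pattern that matches nothing: bash passes it through unchanged, '*' included.
unmatched=$(printf '%s\n' "$demo_dir"/Sample_MBM118/MBM18*_R1_001.fastq)

# Pattern that matches: bash replaces it with the real file name.
matched=$(printf '%s\n' "$demo_dir"/Sample_MBM118/MBM118*_R1_001.fastq)

# Same matching pattern prefixed with "in1=": the prefix is part of the pattern,
# so nothing matches and the word is again passed through literally.
prefixed=$(printf '%s\n' in1="$demo_dir"/Sample_MBM118/MBM118*_R1_001.fastq)

echo "unmatched: $unmatched"
echo "matched:   $matched"
echo "prefixed:  $prefixed"
```

If that is what is happening here, the program receives the pattern text itself rather than a file name, which would explain an error message containing *.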

5.7 years ago
Michael 55k

Your search pattern doesn't match the file names.

in1=./temp_expt/Sample_MBM1${i}/MBM1${i}_R1_001.fastq \
in2=./temp_expt/Sample_MBM1${i}/MBM1${i}_R2_001.fastq \

while your file names look like

MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq

If you are sure there is only one pair of files per sample ID, you could simply use a wildcard:

in1=./temp_expt/Sample_MBM1${i}/MBM1${i}-*_R1_001.fastq \
in2=./temp_expt/Sample_MBM1${i}/MBM1${i}-*_R2_001.fastq \

But if there is a chance that there are multiple pairs, e.g. with different tag sequences, then you should do it a bit differently.
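One way to do it "a bit differently" is to resolve each wildcard to a path first and refuse to continue unless exactly one file matches. The sketch below is only an illustration: `resolve_one` is a hypothetical helper, not part of BBTools, and it relies on bash's `compgen -G` builtin. Resolving the path before building the `in1=` argument also avoids relying on the shell to expand a pattern that starts with `in1=`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the single path matching a glob pattern.
# Fails (non-zero exit status) if there are zero matches or more than one.
resolve_one() {
  local pattern=$1 count=0 found="" path
  # compgen -G prints one matching path per line (nothing if no match)
  while IFS= read -r path; do
    found=$path
    count=$((count + 1))
  done < <(compgen -G "$pattern")
  if [ "$count" -ne 1 ]; then
    echo "expected 1 match for $pattern, found $count" >&2
    return 1
  fi
  printf '%s\n' "$found"
}

# Usage with the loop from the question (left commented out; paths and
# remaining arguments are copied from the OP's command):
# for i in {17..34}; do
#   dir=./temp_expt/Sample_MBM1${i}
#   r1=$(resolve_one "$dir/MBM1${i}-*_R1_001.fastq") || continue
#   r2=$(resolve_one "$dir/MBM1${i}-*_R2_001.fastq") || continue
#   bash bbmap/bbsplit.sh in1="$r1" in2="$r2" \
#     ref=./MG1655.fasta,./MGH78578.fasta,./GAPDH.fasta \
#     basename=out_%.fq outu1=clean1.fq outu2=clean2.fq ambig2=toss
# done
```

Samples with missing or ambiguous files are skipped with a message on stderr, so a bad pattern shows up immediately instead of being passed on as a literal *.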
