I'm trying to run pick_de_novo_otus.py on 50 samples (.fa files) that I have trimmed and aligned (not using QIIME). It seemed straightforward, but I can't get the input format right. I thought the input could be comma-delimited, but it says the file doesn't exist: 'home/Data/KK.01.fa,KK.02.fa,KK.03B.fa,KK.04.fa,KK.05.fa' (command below)
I also tried using -i *.fa instead, but that gave the same result. What is the proper way to supply input files when running a large number of samples through OTU picking?
Thanks. I'm unable to run split_libraries_fastq.py, so I have instead trimmed my sequences and merged the paired reads, so that I now have an individual .fa file for each sample rather than a single .fna file. I ended up writing out a pick_de_novo_otus.py command for each one in a shell script and running it with bash, and it seems to be working.
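For anyone doing the same per-sample workaround, a bash loop avoids writing out each command by hand. This is only a sketch: the input directory, the `otus_<sample>/` output naming, and the `DRY_RUN` toggle are my own assumptions, not anything required by QIIME.

```shell
#!/usr/bin/env bash
# Sketch: run pick_de_novo_otus.py once per per-sample fasta file.
# Set DRY_RUN=1 to print the commands instead of executing them.
pick_all() {
    local dir="$1"
    for fa in "$dir"/*.fa; do
        [ -e "$fa" ] || continue            # skip cleanly if no .fa files match
        local sample
        sample=$(basename "$fa" .fa)        # e.g. KK.01
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "pick_de_novo_otus.py -i $fa -o otus_${sample}/"
        else
            pick_de_novo_otus.py -i "$fa" -o "otus_${sample}/"
        fi
    done
}

# Example (dry run just prints the commands that would be executed):
DRY_RUN=1 pick_all home/Data
```

Note that this produces a separate OTU table per sample; combining everything into one labelled fasta first (see below in the thread) is usually preferable if you want a single OTU table across samples.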
I expect you may have solved this by now. Nevertheless, here is a solution for anyone who runs into the same problem.
QIIME 1 requires its input in a specific, QIIME-compatible format. The "454 tutorial for de novo OTU picking" includes a demultiplexing step that does three things: demultiplexing, quality filtering, and combining the sequences from all samples into that QIIME-compatible format.
Since you did your preprocessing outside QIIME, what you need is an alternative to that step.
The answer is the "add_qiime_labels.py" script. It takes a folder containing all your input files and a mapping file stating which file belongs to which sample (the same mapping file used elsewhere in QIIME, with one extra column giving the name of the file). When run, it combines all the sequences into QIIME-compatible format and produces a single fasta file, which you can then use for OTU picking.
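To make the extra-column part concrete, here is a sketch of what such a mapping file might look like, with an illustrative invocation. The sample IDs, barcodes, primers, and paths below are placeholders; the `-c InputFileName` column name and the `combined_seqs.fna` output name are what I recall from the QIIME 1 script documentation, so double-check them against `add_qiime_labels.py -h` on your install.

```shell
#!/usr/bin/env bash
# Sketch of a mapping file for add_qiime_labels.py.
# Columns are tab-separated; the InputFileName column maps each
# sample to its per-sample fasta file (values here are placeholders).
cat > mapping.txt <<'EOF'
#SampleID	BarcodeSequence	LinkerPrimerSequence	InputFileName	Description
KK.01	AAAAAAAA	CCCCCCCC	KK.01.fa	sample_1
KK.02	AAAAAAAT	CCCCCCCC	KK.02.fa	sample_2
EOF

# Combine all per-sample fasta files into one QIIME-labelled fasta,
# then pick OTUs once on the combined file (paths are assumptions):
# add_qiime_labels.py -m mapping.txt -i home/Data/ -c InputFileName -o combined/
# pick_de_novo_otus.py -i combined/combined_seqs.fna -o otus/
```

The advantage over a per-sample loop is that you get a single OTU table covering all 50 samples, which is what most downstream QIIME analyses expect.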