6.4 years ago
fjs5035
I'm attempting to use filter_fasta.py in MacQIIME to remove sequences from a FASTQ file based on a .txt file of read IDs.
filter_fasta.py -f my_reads.fastq -o filtered_reads.fq -s read_ids.txt -n
The input file has 366000 reads. The output file also has 366000 reads, so nothing is being removed. I confirmed with grep that the read IDs are actually present in the FASTQ file. My read ID .txt file has only one ID per line. Any ideas what could be wrong?
Hi fjs5035
I just added a hyperlink to the script you are referring to.
Additionally, could you add the output of the following commands? It will help you get answers more quickly.
Output a few lines from your ID file:
head -n 10 read_ids.txt
Output a few read headers:
grep "^@" my_reads.fastq | head -n 10
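One caveat with the grep above: FASTQ quality lines can also begin with "@", so a few of the matched lines may not be headers. A minimal sketch that selects headers by position instead, assuming strict 4-line records (no wrapped sequences); demo_reads.fastq and its read names are made up for illustration:

```shell
# Demo input: two 4-line FASTQ records (hypothetical reads). The second
# quality line starts with "@" and would also be matched by grep "^@".
printf '@read_1\nACGT\n+\nIIII\n@read_2\nACGT\n+\n@III\n' > demo_reads.fastq

# Print only line 1 of every 4-line record, i.e. the read headers.
awk 'NR % 4 == 1' demo_reads.fastq | head -n 10
```

On a real file, replace demo_reads.fastq with my_reads.fastq.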
There is also a very useful tool in BBMap for this. If your issue persists, try filterbyname.sh from the BBMap suite of tools, available at link.
Just a hunch: does your read_ids.txt contain the "@" at the start of the FASTQ identifiers?
It does. I take it from your comment that it shouldn't.
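If the leading "@" turns out to be the problem, one way to test the hunch is to strip it from the ID list and rerun the filter. This is only a sketch: the demo file names and read names are hypothetical, and it assumes the "@" is the only difference between the listed IDs and what the script expects.

```shell
# Demo input: an ID list where some lines carry the FASTQ "@" prefix
# (hypothetical read names).
printf '@read_1\n@read_2\nread_3\n' > demo_read_ids.txt

# Remove a single leading "@" from each line; lines without one are unchanged.
sed 's/^@//' demo_read_ids.txt > demo_read_ids_no_at.txt

# Inspect the cleaned list.
cat demo_read_ids_no_at.txt
```

On the real data, the same sed command applied to read_ids.txt would produce a cleaned list that can be passed to the original command in place of read_ids.txt via -s.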
Look at this: Remove Reads from fastq file based on read IDs