filter_fasta.py not removing sequences from fastq based on read IDs

6.4 years ago

fjs5035 • 0

I'm attempting to use filter_fasta.py in macqiime to remove sequences from a fastq based on a .txt file of read IDs.

filter_fasta.py -f my_reads.fastq -o filtered_reads.fq -s read_ids.txt -n

The input file has 366000 reads. The output file is 366000 reads. Nothing is being removed. I ensured the read IDs are actually represented in the fastq with grep. My read ID .txt file has only one ID per line. Any ideas what could be wrong?

sequencing sequence fastq qiime • 2.0k views

ADD COMMENT • link updated 6.4 years ago by lakhujanivijay 5.9k • written 6.4 years ago by fjs5035 • 0

Hi fjs5035

I just added a hyperlink to the script you are referring to.

Additionally, could you add the output of the following commands? It will help you get quicker answers.

Output a few lines from your ID file:

head -n 10 read_ids.txt

Output a few read headers:

grep "^@" my_reads.fastq | head -n 10

ADD REPLY • link 6.4 years ago by lakhujanivijay 5.9k
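
If eyeballing the two outputs is not conclusive, the comparison can be roughly automated. The snippet below is only a sketch: `header_id`, `fastq_ids` and `match_report` are hypothetical helper names, and it assumes strict four-line FASTQ records. Reading headers by record position rather than with `grep "^@"` avoids counting quality lines that happen to start with "@". It reports how many listed IDs match a read ID exactly, and how many match once a leading "@" is dropped.

```python
def header_id(header_line):
    """Read ID from a FASTQ header line: drop the leading '@' and
    anything after the first whitespace (the description)."""
    return header_line.rstrip("\n")[1:].split()[0]


def fastq_ids(fastq_lines):
    """Yield read IDs, assuming strict four-line FASTQ records
    (header, sequence, '+', quality)."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0 and line.strip():
            yield header_id(line)


def match_report(id_lines, fastq_lines):
    """Count listed IDs that match a read ID as written, and after
    removing a single leading '@' from the listed ID."""
    wanted = [line.strip() for line in id_lines if line.strip()]
    present = set(fastq_ids(fastq_lines))
    exact = sum(1 for w in wanted if w in present)
    stripped = sum(
        1 for w in wanted
        if (w[1:] if w.startswith("@") else w) in present
    )
    return {"ids": len(wanted), "exact": exact, "after_stripping_at": stripped}
```

With real files this could be called as `match_report(open("read_ids.txt"), open("my_reads.fastq"))`. If `exact` is 0 while `after_stripping_at` equals `ids`, the leading "@" in the ID file is the likely mismatch.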

There is also a very useful tool in BBMap for this. If your issue persists, try filterbyname.sh from the BBMap suite of tools, available at link.

ADD REPLY • link 6.4 years ago by Jeffin Rockey ★ 1.3k

Just a hunch, does your read_ids.txt contain the "@" at the start of the fastq identifiers?

ADD REPLY • link 6.4 years ago by cschu181 ★ 2.8k

It does. I take it from your comment that that shouldn't happen.

ADD REPLY • link 6.4 years ago by fjs5035 • 0
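
Following the hunch above, one way to test it is to drop the leading "@" from the ID list and filter again. The snippet below is a minimal stand-alone sketch, not QIIME's own code: `load_ids` and `filter_fastq` are hypothetical helpers, it assumes four-line FASTQ records, and it assumes IDs should be compared without the "@" and without any description after the first whitespace. With `negate=True` it discards the listed reads, which is the removal the original command was aiming for.

```python
def load_ids(id_lines):
    """Build a set of read IDs, one per line, dropping a leading '@'
    and anything after the first whitespace."""
    ids = set()
    for line in id_lines:
        token = line.strip()
        if not token:
            continue  # skip blank lines
        token = token.split()[0]
        if token.startswith("@"):
            token = token[1:]
        ids.add(token)
    return ids


def filter_fastq(fastq_lines, ids, negate=True):
    """Return the lines of the records to keep.

    negate=True  -> discard reads whose ID is listed
    negate=False -> keep only reads whose ID is listed
    Assumes strict four-line records; a trailing partial record is dropped.
    """
    kept = []
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            read_id = record[0][1:].split()[0]
            listed = read_id in ids
            if listed != negate:
                kept.extend(record)
            record = []
    return kept
```

A possible use with the file names from the question: `open("filtered_reads.fq", "w").writelines(filter_fastq(open("my_reads.fastq"), load_ids(open("read_ids.txt"))))`. If the output then has fewer reads than the input, the "@" prefix was the culprit, and removing it from read_ids.txt before rerunning filter_fasta.py is worth trying.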

Look at this: Remove Reads from fastq file based on read IDs

ADD REPLY • link 6.4 years ago by lakhujanivijay 5.9k
