Grep search question
2.2 years ago
Omurice ▴ 10

Hi,

I have 2 sequences of interest, A and B.

Is it possible to use grep to find reads (in a FASTQ file):

  • that contain sequence "A" but not "B" (and vice versa)
  • that contain both "A" and "B"

If so, could someone provide or point me to the commands to do so?

Thank you so much!

fastq Sequence grep sequencing fasta • 665 views

Keep in mind that grep only finds exact matches. If you need to allow anything other than a perfect match (for example, mismatches), you may want to use an NGS-specific tool such as seqkit grep, or bbduk.sh in filter mode.

2.2 years ago
Asaf 10k

You can do it with:

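# Reads containing A (GNU grep; --no-group-separator drops the "--" lines that -A/-B insert)
# Assumes the pattern only ever matches sequence lines, not headers or quality strings
grep -B 1 -A 2 --no-group-separator "$A" file.fastq > hasA.fastq
# Of those, reads that also contain B
grep -B 1 -A 2 --no-group-separator "$B" hasA.fastq > hasAandB.fastq
# Reads with A but not B: comm needs sorted input and compares single lines,
# so instead join each 4-line record onto one tab-separated line, keep records
# whose sequence field lacks B, then split back into 4 lines
paste - - - - < hasA.fastq | awk -F'\t' -v b="$B" 'index($2, b) == 0' | tr '\t' '\n' > hasAnotB.fastq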

Replace $A and $B with the sequences, or set the shell variables A and B to the sequences you are searching for.
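
If all three sets are wanted at once (A only, B only, both), the same idea can be sketched as a single pass over the file. This is only a rough sketch under a few assumptions: every record is exactly 4 lines, no line contains a tab, and matching is exact and case-sensitive on the sequence line only. The function name classify_reads and the output names only_A.fastq, only_B.fastq and both.fastq are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: split FASTQ reads into three files by exact matches to two sequences.
# Assumes 4-line records and no tab characters inside any line.
# classify_reads and the output file names are hypothetical.
classify_reads() {
    local fastq=$1 a=$2 b=$3 prefix=$4

    # Create (or empty) the outputs so they exist even when a set has no reads
    : > "${prefix}only_A.fastq"
    : > "${prefix}only_B.fastq"
    : > "${prefix}both.fastq"

    # paste joins each 4-line record into one tab-separated line:
    # $1 = header, $2 = sequence, $3 = "+" line, $4 = quality
    paste - - - - < "$fastq" |
    awk -F'\t' -v a="$a" -v b="$b" -v p="$prefix" '
        {
            has_a = index($2, a) > 0    # exact substring test on the sequence
            has_b = index($2, b) > 0
            if (has_a && has_b) out = p "both.fastq"
            else if (has_a)     out = p "only_A.fastq"
            else if (has_b)     out = p "only_B.fastq"
            else                next    # read has neither sequence
            # Write the record back out as 4 lines
            printf "%s\n%s\n%s\n%s\n", $1, $2, $3, $4 > out
        }'
}
```

For example, classify_reads file.fastq "$A" "$B" out_ would write out_only_A.fastq, out_only_B.fastq and out_both.fastq. Because the test is done on the sequence field only, a header or quality string that happens to contain the pattern is not counted as a match.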
