grep DNA after a pattern
5.3 years ago
amitpande74 ▴ 20

Hi,

I have DNA sequences like these:

I want to keep the part of each sequence that comes after the pattern "ACTTAAGTGTATGTAAACTTCCGACTTCAACTG" and begins with "TA". I tried

grep -v "ACTTAAGTGTATGTAAACTTCCGACTTCAACTG" file.txt

but it does not work.

Kindly help.

DNAseq Grep • 1.8k views

Please read the following post to learn how to add images to a post properly:

A: How to add images to a Biostars post

Also, do you want to get rid of "ACTTAAGTGTATGTAAACTTCCGACTTCAACTG" ?


Not sure what you mean. If you want sequences that start with the pattern followed by TA then look for the whole thing, i.e. ACTTAAGTGTATGTAAACTTCCGACTTCAACTGTA


@lakhujanivijay yes

5.3 years ago
Benn 8.3k

Your code uses grep -v, which inverts the match: it prints only the lines that do not contain the string ACTTAAGTGTATGTAAACTTCCGACTTCAACTG. If every line in your file contains that string, you get no output at all. If you want to select only the lines where ACTTAAGTGTATGTAAACTTCCGACTTCAACTG is followed by TA, while dropping the ACTTAAGTGTATGTAAACTTCCGACTTCAACTG string itself, you can use grep in combination with sed.

grep "ACTTAAGTGTATGTAAACTTCCGACTTCAACTGTA" file.txt | sed "s/ACTTAAGTGTATGTAAACTTCCGACTTCAACTGTA/TA/g"
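
If you also want to drop whatever comes before the pattern on each line, a single sed call might be enough. This is only a sketch: the function name extract_after_pattern is made up for illustration, and it assumes one sequence per line. Because .* is greedy, a line containing the pattern followed by TA more than once is cut at the last occurrence.

```shell
# Hypothetical helper (the name is not from the thread).
# Assumes one sequence per line in the input file.
pattern="ACTTAAGTGTATGTAAACTTCCGACTTCAACTG"

extract_after_pattern() {
    # -n suppresses sed's default printing; the trailing p prints only
    # lines where the substitution succeeded, so no separate grep is needed.
    # Everything up to and including the pattern is removed when the
    # pattern is followed by TA; the TA itself is kept.
    sed -n "s/^.*${pattern}TA/TA/p" "$1"
}
```

You would call it as extract_after_pattern file.txt. Lines where the pattern is missing, or is not followed by TA, are skipped.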