I have a text file with ~60,000 sequences (1 sequence per line). I am trying to extract all the sequences that begin with an A (the first nucleotide position in the sequence) and that also have a T at the 10th position. For example:
Let's say these are 3 sequences in the 60,000 sequence file:
AAGGGCAGCTAATCGCCAGTG
CGGGATCTATAAGGTTGGT
AAGGGCAGCGAATCGCCAGTGAGGCT
If the search were run on these 3 sequences, my desired output would be only the first one.
I have tried some approaches with grep, but none of them has worked. Any help or suggestion on this matter will be greatly appreciated.
Thanks and regards!
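One way to express the condition is the anchored regular expression ^A.{8}T: an A at position 1, any 8 characters, then a T at position 10. The same pattern should also work with grep's extended syntax (grep -E '^A.{8}T' file). Below is a minimal Python sketch of the idea, not a definitive implementation; the function name matching_sequences and the file name sequences.txt are made up for illustration.

```python
import re

# Anchored pattern: 'A' at position 1, any 8 characters, then 'T' at position 10.
PATTERN = re.compile(r"^A.{8}T")


def matching_sequences(lines):
    """Return the sequences that start with A and have T at position 10.

    `lines` is any iterable of strings (a list, or an open file handle).
    The function name is hypothetical, chosen only for this example.
    """
    stripped = (line.strip() for line in lines)
    return [seq for seq in stripped if PATTERN.match(seq)]


# The three sequences from the question.
example = [
    "AAGGGCAGCTAATCGCCAGTG",
    "CGGGATCTATAAGGTTGGT",
    "AAGGGCAGCGAATCGCCAGTGAGGCT",
]

print(matching_sequences(example))  # → ['AAGGGCAGCTAATCGCCAGTG']

# For the real file (file name assumed):
# with open("sequences.txt") as fh:
#     for seq in matching_sequences(fh):
#         print(seq)
```

Sequences shorter than 10 characters can never match, because the pattern requires a character at position 10.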
Wonderful! Thanks! That worked out great. I just needed to remove the -m1 at the end, as I wanted to search through the whole file.