Question

Selecting the first 100 nt from sequences

0

4.3 years ago

far.zi ▴ 10

Hi,

I have a FASTA file containing about 500 intron loci. I don't know how to extract just the first 100 bases of each sequence with an awk command line. I used the following command to pick the last 100 nt of the sequences, and I thought it might help: sed -Ee 's/^.*(.{100})$/\1/' file.fasta

Thanks, Farid

rna-seq • 1.3k views

ADD COMMENT • link updated 4.3 years ago by swbarnes2 14k • written 4.3 years ago by far.zi ▴ 10

1

Did you try sed -E '/>/! s/^(.{100}).*/\1/' file.fasta? Or you can use seqkit (seqkit subseq -r 1:100 file.fasta). With awk, assuming each sequence sits on a single line: awk -v OFS="\n" '{getline seq} {print $0, substr(seq,1,100)}' file.fasta
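The awk one-liner above reads exactly one line after each header, so it only works when sequences are not wrapped across several lines. If the file might be wrapped, a minimal sketch along these lines joins the sequence lines of each record before trimming. The function name first_n_bases and the output file name first100.fasta are hypothetical, chosen here for illustration; file.fasta is the name used in the question.

```shell
# Hypothetical helper (name chosen for illustration): print every FASTA record
# with its sequence trimmed to the first n bases. Handles wrapped sequences.
first_n_bases() {
    # $1 = number of bases to keep, $2 = FASTA file
    awk -v n="$1" '
        /^>/ {
            # a new header: flush the previous record, if there was one
            if (header != "") print header "\n" substr(seq, 1, n)
            header = $0
            seq = ""
            next
        }
        { seq = seq $0 }   # join wrapped sequence lines into one string
        END { if (header != "") print header "\n" substr(seq, 1, n) }
    ' "$2"
}
```

It could be called as first_n_bases 100 file.fasta > first100.fasta. Sequences shorter than 100 nt come out unchanged, because substr simply returns what is there.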

ADD REPLY • link 4.3 years ago by cpad0112 21k

score 2 · Answer 1 · 2020-08-20

2

4.3 years ago

KH ▴ 100

I found this post that should be of use to you:

https://ro-che.info/articles/2016-08-23-fasta-first-n-sequences

ADD COMMENT • link 4.3 years ago by KH ▴ 100

0

Thanks for your help. But the file it produced is empty :(

ADD REPLY • link 4.3 years ago by far.zi ▴ 10

0

That link had nothing at all to do with your problem. Are you really unwilling to even look at what people are giving you?

ADD REPLY • link 4.3 years ago by swbarnes2 14k

0

I don't understand why you are mad at me. I had this question and asked whether people could help me. So far, from my side, I see that only 2 people have replied. I don't understand what you mean by "even look at what people are giving you".

ADD REPLY • link 4.3 years ago by far.zi ▴ 10

1

swbarnes2 is not angry with you; they are pointing out that you are not applying the answer you already have.

ADD REPLY • link 4.3 years ago by cpad0112 21k

score 0 · Answer 2 · 2020-08-20

0

4.3 years ago

swbarnes2 14k

I thought it might help: sed -Ee 's/^.*(.{100})$/\1/' file.fasta

Anyone who understands the command above can figure this out for themselves.

So start there. Find a sed tutorial and learn how that command works. Then you can make your own. It's likely faster than waiting for generous strangers to spoonfeed you what you want to know.
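As a starting point for working through that command, here is a small sketch on a toy 10-base sequence, keeping 3 characters instead of 100 so the effect is easy to see. The variable names are made up for this example.

```shell
# Toy sequence of 10 bases (hypothetical data, not from the original file).
seq_line="AAACCCGGGT"

# Last 3: the greedy .* swallows everything except the final 3 characters,
# which the group (.{3}) captures just before the end-of-line anchor $.
last3=$(printf '%s\n' "$seq_line" | sed -E 's/^.*(.{3})$/\1/')

# First 3: move the capture group to the start-of-line anchor ^ and let .*
# consume the rest of the line instead.
first3=$(printf '%s\n' "$seq_line" | sed -E 's/^(.{3}).*$/\1/')

echo "last: $last3  first: $first3"
```

In both forms, a line shorter than the requested length does not match the pattern and passes through unchanged. Without an address such as />/! in front of the substitution, header lines are processed too, so a header longer than the cutoff would also be truncated.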