How to create a GFF3 file of gap regions in an assembly?
6.1 years ago
m.eitel • 0

Hi.

I would like to generate a GFF3 file of gap regions (runs of 'N') in an assembly. Is there a fast way or script to do that?

Thanks, Michael

Assembly • 1.5k views
6.1 years ago

Ultrafast solution

Toy FASTA file

$ cat fasta.fa 
>1
TGTACGTNNATT
>2
TTTAANNTTTNN
>3
NNTT
TTNN

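Solution using SeqKit (--gtf writes GTF rather than GFF3, and -P restricts the search to the positive strand; if nothing is reported on a recent SeqKit release, adding -r so that N+ is treated as a regular expression may be needed)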

seqkit locate -p N+ fasta.fa --gtf -P

output

1   SeqKit  location    8   9   0   +   .   gene_id "N+"; 
2   SeqKit  location    6   7   0   +   .   gene_id "N+"; 
2   SeqKit  location    11  12  0   +   .   gene_id "N+"; 
3   SeqKit  location    1   2   0   +   .   gene_id "N+"; 
3   SeqKit  location    7   8   0   +   .   gene_id "N+";
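
The SeqKit output above is GTF, not GFF3. If a GFF3 file is strictly required, one option is to scan the FASTA directly. The following is only a minimal sketch in Python, not part of SeqKit: the function names, the source label "gap_finder" and the "length" attribute are illustrative assumptions, and the whole assembly sequence for each record is held in memory.

```python
import re


def read_fasta(lines):
    """Yield (sequence_id, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gap_gff3_lines(lines):
    """Return GFF3 lines with one 'gap' feature per run of N/n."""
    out = ["##gff-version 3"]
    for name, seq in read_fasta(lines):
        for match in re.finditer(r"[Nn]+", seq):
            # Convert the 0-based, half-open match to 1-based, inclusive.
            start, end = match.start() + 1, match.end()
            out.append("\t".join([
                name,
                "gap_finder",            # hypothetical source label
                "gap",                   # feature type
                str(start),
                str(end),
                ".",                     # score: not applicable
                ".",                     # strand: gaps are unstranded
                ".",                     # phase: not applicable
                f"Name=gap;length={end - start + 1}",  # 'length' is an assumed custom attribute
            ]))
    return out


def write_gaps(fasta_path, gff3_path):
    """Read a FASTA file and write its gap regions to a GFF3 file."""
    with open(fasta_path) as fasta, open(gff3_path, "w") as gff:
        gff.write("\n".join(gap_gff3_lines(fasta)) + "\n")
```

Calling write_gaps("fasta.fa", "gaps.gff3") on the toy file should report the same five intervals as the SeqKit output above, with tab-separated GFF3 columns and a "##gff-version 3" header line.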

Dear Vijay. Thanks for the fast reply! I will give it a try. Michael
