ssrs filtering from gff3 file
6.4 years ago
moohit21 • 0

Hello everyone. I have a GFF3 file for a genome and a .misa file produced by misa.pl. Is there a script or tool that can extract the SSRs located in the genic regions of the genome? In other words, how can I filter the SSRs against the features in the GFF3 file using their coordinates?

Thank you for the help in advance.

RNA-Seq genome ssrs gff • 1.3k views
6.4 years ago

You can use bedtools intersect

Here are possible steps:

  1. Create a bed file from .misa file

    cut -f1,6,7 your_file.misa > misa.bed

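    • This keeps only columns 1, 6 and 7 of the file.
    • col1 is the chr (sequence) name
    • col6 is the start coordinate of the SSR
    • col7 is the stop coordinate of the SSR
    • If your .misa file starts with a header line, remove that line first so the BED file contains only coordinates. BED starts are 0-based, so if the MISA coordinates are 1-based, subtract 1 from the start when exact boundaries matter.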
  2. Create a GFF file containing only the gene features from your genome GFF file

    awk -F "\t" '$3=="gene"{print}' your_genome.gff3 > genic.gff

    • awk splits each line on tabs (-F "\t"), keeps the lines whose column 3 (the feature type) is "gene", and prints the whole line. Replace your_genome.gff3 with the name of your own GFF3 file.
  3. Now, use bedtools intersect

    bedtools intersect -a misa.bed -b genic.gff
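The three steps above can be tried end to end on toy data. This is only a sketch under a few assumptions: the .misa table is tab-separated, starts with a header line, and uses 1-based coordinates. The file names (toy.misa, toy.bed, toy_genic.gff, toy_genic_ssrs.bed) are made up for illustration. The awk conversion skips the header and shifts the start to the 0-based BED convention, and the -u option of bedtools intersect reports each SSR once if it overlaps at least one gene.

```shell
# Toy MISA table: tab-separated, with a header line (assumed layout).
printf 'ID\tSSR nr.\tSSR type\tSSR\tsize\tstart\tend\n' > toy.misa
printf 'chr1\t1\tp2\t(AT)6\t12\t101\t112\n' >> toy.misa
printf 'chr1\t2\tp3\t(GCA)5\t15\t5001\t5015\n' >> toy.misa

# Step 1: skip the header (NR > 1) and write chr, start - 1, end as BED.
awk -F '\t' 'BEGIN { OFS = "\t" } NR > 1 { print $1, $6 - 1, $7 }' toy.misa > toy.bed

# Step 2 stand-in: a toy GFF with one gene that covers only the first SSR.
printf 'chr1\ttoy\tgene\t50\t300\t.\t+\t.\tID=gene1\n' > toy_genic.gff

# Step 3: keep SSRs overlapping a gene; runs only if bedtools is installed.
if command -v bedtools >/dev/null 2>&1; then
    bedtools intersect -u -a toy.bed -b toy_genic.gff > toy_genic_ssrs.bed
fi
```

With real data, swap the toy files for your own .misa file and the genic.gff produced in step 2.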
