Extract gff of a particular chromosome
8.4 years ago
mhasa006 ▴ 70

I have a GFF file that contains annotations for chromosomes 1 through 14, but I need the GFF information for individual chromosomes. For example, I am running an experiment on chromosome 11 and trying to visualize my results in IGV. When I load the GFF file in IGV, it shows gene information for all chromosomes. How can I get a GFF containing only chromosome 11?


Why are you bothered by IGV showing you all chromosomes? You just need to double-click on the chromosome you are interested in to select and zoom to just that chromosome (or use the drop-down menu).


Hello, I was trying to grep chromosome X from a GFF file. I get an output file, but it is empty. I am a beginner at this. Please help me.


OK, you are not supposed to ask new questions in other people's threads, but before you revive even more old ones, please post the command you used and the output of:

cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c


Thanks a lot. Sorry, I was not aware of that. I am still not getting a file with only chromosome X. I used:

grep chrX myfile.gff > chrx.gff


@ahmedferoz20 I deleted your comment because you added it as an answer instead of using Add Reply.

What is the output of the following command?

cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c


Yep, OK, I am doing it now.

1. I used grep chrX myfile.gff > chrx.gff to extract chrX.

2. The output of the command you gave me is:

## species https://www.ncbi.nlm.nih.gov/Taxonomoy/Browser/www.tax.cgi?id=70

I hope I followed your suggestion correctly.


What? A link to NCBI is certainly not the output. The output of head -n 20 your.gff will also do.


@ATpoint, I admire your patience.


We were all inexperienced at some point. I assume the problem is that the chromosomes are labelled 1, 2, 3 ... X rather than chr1, chr2, chr3 ... chrX. That is why I asked for the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c, which lists the unique chromosome names.
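To make the naming issue concrete, here is a minimal sketch on a made-up GFF. The file toy.gff and its contents are hypothetical, invented only for illustration. It lists the sequence names in column 1 and shows why a search for chrX finds nothing when the file calls the chromosome X.

```shell
# Hypothetical toy GFF whose sequence names lack the "chr" prefix.
tmpdir=$(mktemp -d)
printf '##gff-version 3\n' > "$tmpdir/toy.gff"
printf 'X\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n' >> "$tmpdir/toy.gff"
printf '1\tsrc\tgene\t300\t400\t.\t-\t.\tID=g2\n' >> "$tmpdir/toy.gff"

# List the unique sequence names in column 1, skipping header lines.
grep -v '^#' "$tmpdir/toy.gff" | cut -f1 | sort | uniq -c

# Searching for "chrX" matches nothing, because the name in the file is "X".
# (grep -c exits non-zero on zero matches, hence the "|| true".)
chrx_hits=$(grep -c 'chrX' "$tmpdir/toy.gff" || true)

# Comparing column 1 against the name actually used in the file works.
x_hits=$(awk -F'\t' '$1 == "X" {n++} END {print n+0}' "$tmpdir/toy.gff")

echo "chrX matches: $chrx_hits, X matches: $x_hits"
```

If the first command prints names without the chr prefix, use those names in the extraction command.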


Thanks a lot. I got it. However, my next challenge is to create a FASTA file from that chromosome X file. It has to contain the 1000 bp upstream of each gene.
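One possible approach, offered as a sketch rather than a tested pipeline: compute strand-aware upstream intervals in BED format with awk, then extract the sequences with a tool such as bedtools getfasta. The toy file chrX.gff, the gene coordinates, and genome.fa are all hypothetical. The sketch assumes that gene features carry the type "gene" in column 3 and that the sequence names in the GFF match those in the FASTA.

```shell
# Hypothetical toy input: three genes on a sequence named "X".
tmpdir=$(mktemp -d)
printf 'X\tsrc\tgene\t5000\t6000\t.\t+\t.\tID=g1\n'  > "$tmpdir/chrX.gff"
printf 'X\tsrc\tgene\t8000\t9000\t.\t-\t.\tID=g2\n' >> "$tmpdir/chrX.gff"
printf 'X\tsrc\tgene\t500\t900\t.\t+\t.\tID=g3\n'   >> "$tmpdir/chrX.gff"

# GFF is 1-based and inclusive; BED is 0-based and half-open.
# Plus strand: the upstream region ends just before the gene start.
# Minus strand: the upstream region begins just after the gene end.
awk -F'\t' -v OFS='\t' -v up=1000 '
  /^#/ || $3 != "gene"   { next }
  $7 != "+" && $7 != "-" { next }
  $7 == "+" { s = $4 - 1 - up; e = $4 - 1 }
  $7 == "-" { s = $5;          e = $5 + up }
  { if (s < 0) s = 0; if (e > s) print $1, s, e, $9, ".", $7 }
' "$tmpdir/chrX.gff" > "$tmpdir/upstream.bed"

cat "$tmpdir/upstream.bed"

# With bedtools installed and a matching genome FASTA (genome.fa is an
# assumed file name), the sequences could then be extracted with:
# bedtools getfasta -fi genome.fa -bed upstream.bed -s -name -fo upstream.fa
```

The start coordinate is clamped at 0 for genes near the beginning of the sequence. The end is not clamped to the chromosome length, which would need a table of chromosome sizes.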

8.4 years ago
GenoMax 147k

Assuming your chromosomes are named chrNN, the following should extract chr11:

grep chr11 your_file.gff > chr11.gff

If chr11 is in the first column and you want only those lines, then do:

grep ^chr11 your_file.gff > chr11.gff

Small correction: when you grep for 'chr1', you sometimes also get 'chr11', 'chr12', etc. Adding '-w' solves the problem:

grep -w chr11 your_file.gff > chr11.gff
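The difference is easy to check on a small example. The sketch below uses a hypothetical three-line file (toy.gff, invented for illustration) and counts matches for a plain search, a whole-word search, and a search anchored to the first column.

```shell
# Hypothetical toy GFF with three similarly named sequences.
tmpdir=$(mktemp -d)
printf 'chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=a\n'   > "$tmpdir/toy.gff"
printf 'chr11\tsrc\tgene\t1\t50\t.\t+\t.\tID=b\n' >> "$tmpdir/toy.gff"
printf 'chr12\tsrc\tgene\t1\t50\t.\t+\t.\tID=c\n' >> "$tmpdir/toy.gff"

# Plain substring search: "chr1" also matches chr11 and chr12.
plain=$(grep -c 'chr1' "$tmpdir/toy.gff")

# Whole-word search: only lines where chr1 stands alone as a word.
whole=$(grep -c -w 'chr1' "$tmpdir/toy.gff")

# Anchored search: chr1 at the start of the line, followed by a tab.
tab=$(printf '\t')
anchored=$(grep -c "^chr1${tab}" "$tmpdir/toy.gff")

echo "plain: $plain, -w: $whole, anchored: $anchored"
```

Note that -w still matches the word anywhere on the line, so a feature on another chromosome that mentions chr1 in its attributes column would also be kept. The anchored form avoids that.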


With awk for exact matches, preserving the header:

awk '$1 ~ /^#/ {print $0;next} {if ($1 == "chr11") print}' your_file.gff
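As a variation on the same idea, the chromosome name can be passed in as a variable and the result written to a file. This is a minimal sketch on a hypothetical toy file; the names toy.gff and chrom are made up for illustration.

```shell
# Hypothetical toy GFF with a header line and two sequences.
tmpdir=$(mktemp -d)
printf '##gff-version 3\n' > "$tmpdir/toy.gff"
printf 'chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=a\n'  >> "$tmpdir/toy.gff"
printf 'chr11\tsrc\tgene\t1\t50\t.\t+\t.\tID=b\n' >> "$tmpdir/toy.gff"

# Keep header lines plus records whose first column equals $chrom exactly.
chrom=chr11
awk -F'\t' -v chrom="$chrom" '/^#/ || $1 == chrom' \
    "$tmpdir/toy.gff" > "$tmpdir/$chrom.gff"

cat "$tmpdir/$chrom.gff"
```

Because the comparison is an exact string match on column 1, chr1 and chr11 cannot be confused, and the same command can be reused in a loop over chromosome names.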