How to search a .narrowPeak file for specific genes via a batch file
23 months ago
StacyG • 0

I'm doing a ChIP-seq analysis for the first time and have some basic questions. I successfully ran macs2 callpeak and have a .narrowPeak file that I can load into IGV. I also have an .xls file with the names of the specific genes we are interested in. I can type each gene name into IGV and check whether my TF binds there, but the list is over 5000 genes long, so doing this manually isn't an option. Does anyone know how I can do this in batch, either with IGV or in R? I need output that lists each gene with a 0/1 column indicating whether a peak from my .narrowPeak file falls somewhere in that gene.

Thanks in advance for any help, Stacy

binding transcription factor peaks • 913 views

Look at the GenomicRanges Bioconductor package. It has plenty of functions for overlap and nearest/closest operations. The Excel file can be loaded with openxlsx (which reads .xlsx only; a legacy .xls file would need something like readxl), and narrowPeak is simply a tab-delimited text file, so read.delim with header = FALSE will do. The GenomicRanges manual covers how to create GRanges objects for overlap analysis.
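
To make the overlap idea concrete, here is a minimal sketch of the 0/1 table in plain Python rather than R. It is not GenomicRanges, only an illustration of the same interval logic. The function names, the `flank` parameter, and the demo data are all hypothetical, and it assumes each gene already has a chromosome, start, and end in the same 0-based, half-open convention that narrowPeak (BED6+4) uses. Whether "bound" should mean a peak in the gene body, in the promoter, or within some window is a choice you have to make yourself.

```python
import csv
import io
from collections import defaultdict


def read_narrowpeak(handle):
    """Collect peak intervals per chromosome from narrowPeak (BED6+4) text.

    Only the first three columns (chrom, start, end) are used.
    """
    peaks = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and optional UCSC-style header lines.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        peaks[row[0]].append((int(row[1]), int(row[2])))
    return peaks


def gene_is_bound(peaks, chrom, start, end, flank=0):
    """Return 1 if any peak overlaps [start - flank, end + flank), else 0.

    `flank` is a hypothetical knob for also counting peaks near the gene.
    This is a simple linear scan over the peaks on one chromosome.
    """
    lo, hi = max(0, start - flank), end + flank
    return int(any(p_start < hi and lo < p_end
                   for p_start, p_end in peaks.get(chrom, ())))


def binding_table(peaks, genes, flank=0):
    """genes: iterable of (name, chrom, start, end). Returns [(name, 0 or 1)]."""
    return [(name, gene_is_bound(peaks, chrom, start, end, flank))
            for name, chrom, start, end in genes]


if __name__ == "__main__":
    # Made-up demo data, not real peaks or genes.
    demo_peaks = io.StringIO(
        "chr1\t100\t200\tpeak_1\t50\t.\t5.0\t10.0\t8.0\t40\n"
        "chr2\t500\t650\tpeak_2\t80\t.\t7.5\t12.0\t9.5\t60\n"
    )
    demo_genes = [
        ("geneA", "chr1", 150, 400),
        ("geneB", "chr1", 300, 400),
        ("geneC", "chr3", 10, 90),
    ]
    for name, bound in binding_table(read_narrowpeak(demo_peaks), demo_genes):
        print(name, bound, sep="\t")
```

With a real file you would pass `open("your_peaks.narrowPeak")` (a placeholder path) instead of the `StringIO` demo. For a 5000-gene list the linear scan should be fast enough; GenomicRanges does the equivalent with indexed overlap queries.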


Thanks...I'll take a look. The other thing I need to figure out is how to get the chromosome start/end positions. My .xls file has chromosome names but not positions...any ideas?
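
One possible way to get gene positions, sketched in Python: look the gene names up in a GTF annotation file. This assumes several things that are not stated in the thread: that the annotation is for the same genome build the reads were aligned to, that it contains rows with feature type `gene` carrying a `gene_name` attribute (as Ensembl and GENCODE GTF files do), and that the names in the spreadsheet match those `gene_name` values. The function name is hypothetical.

```python
import io
import re

# GTF attributes look like: gene_id "ENSG..."; gene_name "ABC1"; ...
_GENE_NAME = re.compile(r'gene_name "([^"]+)"')


def gene_coordinates(gtf_handle, wanted):
    """Map gene name -> (chrom, start, end) for the requested gene names.

    GTF coordinates are 1-based and inclusive; they are converted here to
    0-based, half-open coordinates to match BED/narrowPeak files.
    """
    wanted = set(wanted)
    coords = {}
    for line in gtf_handle:
        if line.startswith("#"):
            continue  # header/comment lines
        fields = line.rstrip("\n").split("\t")
        # A GTF row has 9 columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = _GENE_NAME.search(fields[8])
        if match and match.group(1) in wanted:
            coords[match.group(1)] = (fields[0], int(fields[3]) - 1, int(fields[4]))
    return coords


if __name__ == "__main__":
    # Made-up two-gene annotation, for illustration only.
    demo_gtf = io.StringIO(
        'chr1\tdemo\tgene\t101\t500\t.\t+\t.\tgene_id "G1"; gene_name "ABC1";\n'
        'chr2\tdemo\tgene\t1\t50\t.\t-\t.\tgene_id "G2"; gene_name "XYZ2";\n'
    )
    print(gene_coordinates(demo_gtf, ["ABC1", "NOT_IN_FILE"]))
```

Names missing from the returned dictionary were not found in the annotation, which usually points to alias or spelling differences. Also check that chromosome names use the same style in both files ("chr1" versus "1") before testing for overlaps.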


Please show an example of your data.
