Hello,
I am using plotHeatMap from deeptools on ChIP-seq data to create heatmaps and separate them based on k-means clustering.
This also gives me bed files with the peaks for each cluster, and the head of one of these files looks like this:
#chrom start end name score strand thickStart thickEnd itemRGB blockCount blockSizes blockStart deepTools_group
14 27039255 27102509 ENSMUST00000225146 . + 27039255 27102509 0 1 63254 27039253 cluster_1
14 27039078 27089249 ENSMUST00000223942 . + 27039078 27089249 0 1 50171 27039076 cluster_1
When I try to look up which genes these names correspond to in a GTF file, they do not match anything, since the regions are away from the transcription start site (they can be up to 1 kb away). Is there an easy way to add the nearest gene to each row of this bed file?
I have tried the following with no success:
- closest-features from bedops
- bedtools closest
- bedtools intersect my bed file and a GTF file
Thank you!
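For reference, the nearest-gene lookup asked about above can be sketched in plain Python. This is only a minimal sketch under stated assumptions, not what closest-features or bedtools closest do internally: it measures distance from the midpoint of each region to a single TSS coordinate per gene, it assumes tab-separated input, and the gene names and TSS positions (GeneA, GeneB) are hypothetical placeholders rather than real mm10 annotations.

```python
from bisect import bisect_left


def build_tss_index(gene_rows):
    """Group (chrom, tss, gene_name) tuples by chromosome, sorted by TSS."""
    by_chrom = {}
    for chrom, tss, name in gene_rows:
        by_chrom.setdefault(chrom, []).append((tss, name))
    for sites in by_chrom.values():
        sites.sort()
    return by_chrom


def nearest_gene(by_chrom, chrom, start, end):
    """Return (gene_name, distance) for the TSS closest to the region midpoint.

    Assumption: distance is midpoint-to-TSS, which is a simplification.
    Returns (None, None) if the chromosome has no genes, for example when
    one file uses "14" and the other uses "chr14".
    """
    sites = by_chrom.get(chrom)
    if not sites:
        return None, None
    mid = (start + end) // 2
    i = bisect_left(sites, (mid, ""))
    # Only the TSS just before and just after the midpoint can be nearest.
    candidates = sites[max(i - 1, 0): i + 1]
    tss, name = min(candidates, key=lambda site: abs(site[0] - mid))
    return name, abs(tss - mid)


def annotate_bed_line(by_chrom, line):
    """Append the nearest gene name to one tab-separated BED line."""
    if line.startswith("#"):
        return line.rstrip("\n") + "\tnearest_gene"
    fields = line.rstrip("\n").split("\t")
    name, _ = nearest_gene(by_chrom, fields[0], int(fields[1]), int(fields[2]))
    return "\t".join(fields + [name if name is not None else "NA"])


# Hypothetical gene table; real TSS positions would come from the GTF.
genes = [("14", 27039000, "GeneA"), ("14", 27200000, "GeneB")]
index = build_tss_index(genes)
bed_line = "14\t27039255\t27102509\tENSMUST00000225146\t.\t+"
print(annotate_bed_line(index, bed_line))
```

Note that the header line from deeptools starts with "#" and each row carries an extra deepTools_group column; the sketch passes both through unchanged.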
If you provide your closest-features and any other set operation commands, I'd be happy to help, if I can. Also, can you indicate if you are getting your GTF file of gene annotations from Ensembl or UCSC?
Thanks so much for offering! I am currently using the mm10 Ensembl GTF, and did the following steps:
Convert gtf to bed with
Then, I sorted my bed files (after receiving an error that they weren't sorted) with the normal unix sort command.
Finally, I used closest-features with the two, but the output file had lots of formatting errors.
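One thing worth checking in the sorting step above is the sort order itself. As far as I know, BEDOPS tools expect input ordered the way sort-bed orders it: chromosome name compared as plain text, then start and end compared as numbers. A default unix sort without numeric keys may not produce that order. Below is a minimal Python sketch of that ordering, assuming tab-separated records and keeping any "#" header lines at the top; the function names are hypothetical and this is an illustration, not the BEDOPS implementation.

```python
def bed_sort_key(line):
    """Sort key: chromosome as text, then start and end as integers."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return (chrom, int(start), int(end))


def sort_bed_lines(lines):
    """Return header lines first, then records in chrom/start/end order."""
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    return header + sorted(records, key=bed_sort_key)


# The two example rows from the question, truncated to four columns.
lines = [
    "#chrom\tstart\tend\tname",
    "14\t27039255\t27102509\tENSMUST00000225146",
    "14\t27039078\t27089249\tENSMUST00000223942",
]
for ln in sort_bed_lines(lines):
    print(ln)
```

With this key, chromosome "10" sorts before "2" because names are compared as text, while start coordinates are compared numerically.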