I have downloaded ChIP-Seq data and managed to get to a point where I have a long list of chromosome positions and some expression data. My question is: how do I map these chromosome locations to HUGO gene symbols?
An example of my data is:
Peak GSM365925_ER_minus_ligand_align.bed GSM365926_ER_E2_align.bed
chr20:257411-257873| 7 49
chr20:363265-363667| 0 98
chr20:373762-374404| 3 170
chr20:549324-550256| 1 23
I would prefer a solution in Python, though I could stretch to one in Perl. A solution with a GUI would also be welcome, as I'm still a relatively inexperienced programmer. Thanks
Hi Alex - I am assuming that I need to upload my data using the 'Add Custom Tracks' thing? ... Also, the files I'm analyzing were downloaded and are already .BED files. For each condition I have one 'peaks' file, which I'm assuming is a positive control for analysis, and an 'align' file. Are you suggesting I should sort them into another .BED file? Thanks.
No, I would not upload data, but download RefSeq gene names to a local BED file, and then do the mapping between peaks and RefSeq gene names with BEDOPS and standard UNIX command line tools.
If your data are already in BED format, you can just do something like:
$ sort-bed peaks.bed > sorted_peaks.bed
to ensure that the data are sorted. BEDOPS tools were written to take advantage of sorted BED data.
I know you said that you are not a programmer, but learning the command line is important enough that I figured I would suggest that approach with some example commands that give you the answer you want. I hope this answer gives you a taste of what you can do.
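If the peak table is still in the `chr20:257411-257873| 7 49` form shown in the question rather than in BED, it would need converting before `sort-bed` can read it. Below is a minimal Python sketch of that conversion. The function names are hypothetical, and it assumes the coordinates in the peak labels can be copied into the BED start and end columns unchanged (whether they are 0-based or 1-based is worth checking against the source of the files). The read counts are left out of the BED line; they can be joined back later through the peak label kept in column 4.

```python
import re

# Matches labels such as "chr20:257411-257873|" (the trailing "|" is optional).
PEAK_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)\|?$")


def peak_to_bed(line):
    """Turn one peak-table row into a 4-column, tab-separated BED line.

    Column 4 keeps the original peak label (without the trailing "|")
    so that the counts can be looked up again later.
    """
    label = line.split()[0]
    match = PEAK_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised peak label: {label!r}")
    chrom, start, end = match.groups()
    return "\t".join([chrom, start, end, label.rstrip("|")])


def table_to_bed(lines):
    """Convert all data rows, skipping blank lines and the 'Peak ...' header."""
    return [
        peak_to_bed(line)
        for line in lines
        if line.strip() and not line.startswith("Peak")
    ]
```

Writing the returned lines to a file such as `peaks.bed`, one per line, should give input suitable for the `sort-bed` command above.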
Hi Alex! I want to do something similar, but I only have the start site position in my sorted BED file, e.g.
It has about 20 lines. Unfortunately, my system doesn't have a recent enough version of gcc for the BEDOPS installation. Is there another way of doing this at the command line without using BEDOPS? (Yes, I could install more up-to-date versions of gcc and g++, but that's currently out of my hands.)
Look forward to your response!
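For a file of about 20 positions, one option that avoids compiling anything is a plain Python lookup. The sketch below is not what BEDOPS does internally; it is a simple linear scan, and it assumes a local gene annotation BED file with the gene name in column 4 and 0-based, half-open intervals. The function names and the gene names used in any test data are hypothetical.

```python
from collections import defaultdict


def load_genes(bed_lines):
    """Group gene intervals by chromosome from tab-separated BED lines."""
    genes = defaultdict(list)
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header/comment lines
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        genes[chrom].append((int(start), int(end), name))
    return genes


def distance(pos, start, end):
    """Distance in bases from pos to the interval [start, end); 0 if inside."""
    if pos < start:
        return start - pos
    if pos >= end:
        return pos - end + 1
    return 0


def nearest_gene(genes, chrom, pos):
    """Return (gene_name, distance) for the closest gene, or None if the
    chromosome has no annotated genes. Ties go to the first gene listed."""
    candidates = genes.get(chrom)
    if not candidates:
        return None
    start, end, name = min(candidates, key=lambda g: distance(pos, g[0], g[1]))
    return name, distance(pos, start, end)
```

The scan compares every position with every gene on the same chromosome, which is fine for a few dozen positions but would be slow for genome-wide peak sets, where a sorted-interval tool is the better fit.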