8.3 years ago
tejaswi.iyyanki
I have a list of HOMER results. For example:
CTCF(Zf)/CD4+-CTCF-ChIP-Seq(Barski_et_al.)/Homer
I usually extract the leading letters with a pattern, which gives CTCF here.
But there are several others where I cannot readily extract the gene ID. For example:
GATA(Zf),IR3/iTreg-Gata3-ChIP-Seq(GSE20898)/Homer
Is there a way to convert all of these HOMER motif names into gene IDs so that I can easily overlay FPKM values or differential expression results onto the HOMER results?
I am interested in selecting motifs whose gene is expressed.
Thanks!
Since there is no consistent pattern for extracting only the gene ID, the quickest approach I can think of is to use Python or Perl to match (by regex) every gene name from a list of gene names against the HOMER output, then extract the matched portion along with whatever other information you need.
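A minimal Python sketch of that idea, under some assumptions: the function name `match_genes` and the tiny `gene_symbols` list are hypothetical, and in practice the list would come from your annotation or expression table. Rather than building one large alternation regex, it uses a regex to split each motif name into alphanumeric tokens and looks each token up, case-insensitively, in the gene list.

```python
import re

def match_genes(motif_name, gene_symbols):
    """Return the gene symbols (from gene_symbols) that appear as whole
    alphanumeric tokens in a HOMER motif name, ignoring case."""
    # Map upper-cased symbol -> symbol as written in the gene list.
    lookup = {g.upper(): g for g in gene_symbols}
    # Split on anything that is not a letter or digit: ( ) / - , + _ .
    tokens = re.findall(r"[A-Za-z0-9]+", motif_name)
    hits = []
    for tok in tokens:
        gene = lookup.get(tok.upper())
        if gene is not None and gene not in hits:
            hits.append(gene)
    return hits

# Hypothetical gene list; replace with the symbols from your own data.
gene_symbols = ["CTCF", "GATA3"]

motifs = [
    "CTCF(Zf)/CD4+-CTCF-ChIP-Seq(Barski_et_al.)/Homer",
    "GATA(Zf),IR3/iTreg-Gata3-ChIP-Seq(GSE20898)/Homer",
]

# Motif name -> list of matched gene symbols (may be empty).
motif_to_genes = {m: match_genes(m, gene_symbols) for m in motifs}
for motif, genes in motif_to_genes.items():
    print(motif, genes, sep="\t")
```

One caveat with a full gene list: cell-type labels in the motif name can also be gene symbols (CD4 in the first example), so motifs with more than one hit are worth checking by hand. Motifs with no hit, such as family-level names, will still need manual curation.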