I have two columns of data containing merged peak chr start, start, end, gene name data that looks like the following:
Essentially what I did was copy and pasted the values I obtained using a script that finds peaks that are similar between different chip-seq experiments. Column 1 contains peaks in protein 1, 2, and 3 and Column 2 contains peaks in protein 1, 2, 3, and 4.
What I am trying to do now is find all the duplicated values in Columns A that are also found in Column B ... not just by row but if there is a similar value in B2 and it is found in A4506 then I want it to find that value also and then spit all the duplicated values into Column C.
While this dataset isn't essentially "huge" ... only about 4.5k rows in either column, it's too much to go through by hand and I can't find a good answer on any internet topic. I've tried varieties of VLOOKUP
, IFCOUNT
and other functions but they only search the row 1 in column A to the row 1 in column B or simply do not work on the Mac version of Excel (for instance I found a add-in that could supposedly do this but the add-in was not able to be used by Mac, I got an error indicating it was not properly formatted for mac).
Can anyone help or point me to a correct source?
Hi, do you want any kind of solutions (with scripts) or only Excel solution?
Any solution will work. I am a wet lab research assistant with only minor knowledge of programming languages so if you could just try to give as much as a detailed overview as you can be bothered of how your script works and what data would go where that would be appreciated.
EDIT: I totally missed the excel answer you found. I will attempt this, but it seems to based on a windows os from screenshots. Still I should be able to reproduce it. I will edit with any results.
EDIT: Excel link did not work.