I have two Excel files that share quite a lot of gene names. I would like to find the common gene names and write them, along with the corresponding columns and their values, to a new CSV file.
The code below produces output containing a single row, which should not be the case, since the files share many gene names. Can someone please tell me what I'm doing wrong here?
Thanks!
First excel file,
Second excel file,
Output I got (the formatting is correct, but why am I getting only one row?),
Snippet from the code,
Are you certain there are absolute matches between the two? The example shows df2 with many gene names capitalized. These won't match if they are not also capitalized in df1. Can you sort each and show a top set with names in common? Could there be a trailing space, or other character in one of the gene name columns? (unlikely, but I've seen this happen).
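To illustrate how case and stray whitespace can shrink the overlap, here is a minimal sketch in plain Python. The gene lists are made up for illustration and are not taken from the actual files; `normalize` is a hypothetical helper.

```python
# Hypothetical gene-name columns; the real values would come from the two Excel files.
genes_1 = ["brca1", "tp53 ", "Egfr", "myc"]
genes_2 = ["BRCA1", "TP53", "EGFR", "myc", "KRAS"]


def normalize(names):
    """Strip surrounding whitespace and lowercase each name."""
    return {str(name).strip().lower() for name in names}


# Exact matching: only names identical in case and spacing are shared.
raw_common = set(genes_1) & set(genes_2)

# Matching after normalization: case and trailing spaces no longer matter.
clean_common = normalize(genes_1) & normalize(genes_2)

print(sorted(raw_common))
print(sorted(clean_common))
```

With this made-up data, exact matching finds a single shared name, while the normalized comparison finds four, which is the kind of gap that would explain a one-row result.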
Thanks! I converted both files to lowercase and re-ran the same code. This time it returned a list of common genes, but the subsequent columns are now missing. I am getting output like this,
According to your code, you asked for a set of common values, and that's what it returned. Are you trying to merge the two data frames with these values? (I gave an answer for this below). Or are you trying to subset one data frame by the values in common? What code are you running for the image above? What were you trying to achieve?
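For reference, the two options could look roughly like this in pandas. This is only a sketch: the column names (`gene`, `expr_a`, `expr_b`), the sample values, and the output file name are assumptions, since the real sheets are not shown as text.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets; column names are assumptions.
df1 = pd.DataFrame({"gene": ["brca1", "tp53", "egfr"], "expr_a": [1.2, 3.4, 5.6]})
df2 = pd.DataFrame({"gene": ["tp53", "egfr", "kras"], "expr_b": [0.1, 0.2, 0.3]})

# Option 1: an inner merge keeps only genes present in both frames,
# together with the value columns from each frame.
merged = df1.merge(df2, on="gene", how="inner")

# Option 2: subset df1 to the rows whose gene also appears in df2.
# The set alone holds only the names, which is why the other columns vanish
# if you write the set itself to a file.
common = set(df1["gene"]) & set(df2["gene"])
subset = df1[df1["gene"].isin(common)]

print(merged)
print(subset)

# merged.to_csv("common_genes.csv", index=False)  # hypothetical output path
```

A set intersection only ever returns the names themselves, so to keep the value columns you need either the merge or the `isin` filter on the original data frame.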