Hi, I have FPKM values for 1000 genes and would like to classify them as high, medium, or low expressed based on those values. Can someone kindly tell me how to do this using R?
It depends on how you want to define your classifications. If this is all relative within your single dataset (i.e. you just want to know which of these genes are lowly expressed compared to the rest of the genes in your dataset), then you don't need to complicate things with R. You can simply open the data in Excel, sort by FPKM, and divide the genes into thirds, with the bottom third being your "low", the middle third your "medium" and the top third your "high". You can find a decent explanation of how to do this in Excel on this Quora post.
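Since the question asked about R, the same thirds-based split can be sketched there as well. This is only a minimal illustration, not a definitive method: the data frame `fpkm`, its column names, and the values in it are made up for the example, and the cut points are simply the 1/3 and 2/3 quantiles of the FPKM values.

```r
# Hypothetical example data: gene IDs and FPKM values are invented for illustration
fpkm <- data.frame(
  gene = paste0("gene", 1:9),
  fpkm = c(0.5, 120, 3.2, 45, 0, 8.7, 300, 15, 1.1)
)

# Tertile cut points: minimum, 1/3 quantile, 2/3 quantile, maximum
breaks <- quantile(fpkm$fpkm, probs = c(0, 1/3, 2/3, 1))

# Assign each gene to a bin; include.lowest = TRUE keeps the minimum value in "low"
fpkm$class <- cut(fpkm$fpkm,
                  breaks = breaks,
                  labels = c("low", "medium", "high"),
                  include.lowest = TRUE)

# Count genes per class
table(fpkm$class)
```

One caveat: if many genes share the same FPKM (for example a large number of zeros), two quantiles can coincide and `cut()` will stop with a "'breaks' are not unique" error. In that case you would need to pick the thresholds yourself.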
Just a quick note on this: if you do use Excel, make sure to double-check that your gene names (if present) remain intact. From dealing with other labs' data in Excel, I have found that gene names like "March10" or "Sept4" can be auto-formatted by Excel as dates.
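If the table has already been through Excel, a rough check like the one below can flag names that look as if they were converted. It is a sketch under an assumption: that converted names appear in a day-month form such as "10-Mar", which may differ with locale or cell formatting. The `genes` vector is hypothetical.

```r
# Hypothetical gene names as they might look after a round trip through Excel
genes <- c("TP53", "10-Mar", "4-Sep", "BRCA1", "DEC1")

# Build a pattern for "<day>-<month abbreviation>" using R's built-in month.abb
# (assumed format; adjust if your file shows dates differently)
pattern <- paste0("^[0-9]{1,2}-(", paste(month.abb, collapse = "|"), ")$")

# TRUE for names that look like dates rather than gene symbols
date_like <- grepl(pattern, genes)

# Names that need to be checked against the original source
genes[date_like]
```

A match only means the name needs checking against the original file; the check cannot recover the original symbol on its own.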