Hi, I have a list of genes along with their variant densities, and I can choose which variants to include based on their predicted impact. I want to perform GSEA using these densities. Is my plan below reasonable? Are there any side effects of using density that I should watch out for? I'm asking because I couldn't find any similar examples. Manuel Landesfeind suggested a workaround over here, but I'm looking for something better.
This is my current plan, following the GSEA paper:
1) Separate patient data based on 1 phenotype of interest.
2) Calculate the average variant density per person for each gene among the people with this phenotype. Do the same for the people who lack the phenotype and who lack other major diseases as well, since those aren't the focus.
3) Find the density ratio between the two.
4) Use this ratio to produce the ranked gene list L = {g1,...,gN}. I'm a little wary about this step. How should I rank the genes? Is ranking by the ratio a good method, or should I do something like GWAS? I'm trying to avoid using GWAS here.
5) Find the variant density for each patient. I will then continue in some other program, but I don't know which one is best to use. Ingenuity IPA doesn't have a generic value option, only expression value options, and it converts values to logs automatically, which I don't think is appropriate for comparing density ratios. I can use PANTHER to get gene sets of interest, but I don't know how to test them.
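
To make steps 2 to 4 concrete, here is a minimal Python sketch of what I have in mind, followed by one possible way to test a gene set against the ranked list. It is only an illustration under several assumptions. The function names (`rank_genes_by_density_ratio`, `enrichment_score`, `permutation_p_value`) and the toy densities are made up. The `pseudocount` is my own addition to avoid dividing by zero. The significance test shuffles gene labels, which is a simpler substitute for the phenotype-label permutation described in the GSEA paper. The running-sum statistic follows the general shape of the paper's weighted enrichment score, but this is not the GSEA software itself.

```python
import random
from statistics import mean


def rank_genes_by_density_ratio(case_density, control_density, pseudocount=1e-6):
    """Steps 2-4: average density per gene in each group, take the ratio, rank.

    Both arguments map gene -> list of per-person variant densities.
    The pseudocount is an assumption added to avoid division by zero.
    Returns a list of (gene, ratio) pairs, highest ratio first.
    """
    ratios = {}
    for gene in case_density.keys() & control_density.keys():
        case_mean = mean(case_density[gene])
        control_mean = mean(control_density[gene])
        ratios[gene] = (case_mean + pseudocount) / (control_mean + pseudocount)
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


def enrichment_score(ranked, gene_set, p=1.0):
    """Running-sum statistic: walk down the ranked list, step up at genes in
    the set (weighted by |score|**p) and down otherwise, and return the
    largest deviation from zero. p=0 gives an unweighted statistic."""
    weights = [abs(score) ** p if gene in gene_set else 0.0 for gene, score in ranked]
    total_hit_weight = sum(weights)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if total_hit_weight == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for (gene, _), weight in zip(ranked, weights):
        if gene in gene_set:
            running += weight / total_hit_weight
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked, gene_set, n_perm=1000, p=1.0, seed=0):
    """Empirical p-value from random gene sets of the same size
    (gene-label permutation, not phenotype permutation)."""
    rng = random.Random(seed)
    genes = [gene for gene, _ in ranked]
    members = set(gene_set) & set(genes)
    observed = enrichment_score(ranked, members, p)
    extreme = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(genes, len(members)))
        if abs(enrichment_score(ranked, random_set, p)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data (invented): per-person densities for four hypothetical genes.
case = {"A": [4.0, 6.0], "B": [2.0, 2.0], "C": [1.0, 1.0], "D": [0.5, 0.5]}
control = {"A": [1.0, 1.0], "B": [1.0, 1.0], "C": [1.0, 1.0], "D": [1.0, 1.0]}

ranked = rank_genes_by_density_ratio(case, control)
score, p_value = permutation_p_value(ranked, {"A", "B"}, n_perm=200, p=0)
print(ranked)
print(score, p_value)
```

One thing I noticed while sketching this: a ratio is centred on 1 rather than 0, so weighting the steps by |ratio|**p treats enriched and depleted genes asymmetrically. Setting p=0 sidesteps that, but it also throws away the magnitude of the ratio, which is part of why I'm unsure about step 4.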