In the Fisher test, use alternative = 'less', i.e. test whether the enrichment of probes is lower in healthy than in cancer tissue. I have manipulated your data (row 16) to show how this works. Now the probe at row 16 is strongly enriched in cancer (171/179) vs. healthy (5/179).
> ht[16,]
          healthy cancer   x
(AGGGGG)n       5    171 179
Let's check this with a Fisher test. A p-value < 0.05 indicates that the enrichment is significantly lower in healthy than in cancer:
> ht$pvalues <- apply(ht,1,function(x) fisher.test(rbind(x[1:2],x[3]-x[1:2]), alternative = "less")$p.value)
> ht[16,]
          healthy cancer   x      pvalues
(AGGGGG)n       5    171 179 1.373839e-84
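To make the 2x2 table behind that call explicit, here is a minimal sketch for this one row only. The variable names (healthy, cancer, total, tab, res) are just for illustration; the counts are the ones from row 16, with column x taken as the total for both groups.

```r
# Counts for the example probe (row 16); 179 is the value in column x.
healthy <- 5
cancer  <- 171
total   <- 179

# 2x2 table: first row = probe present, second row = probe absent (total - present).
tab <- rbind(present = c(healthy = healthy, cancer = cancer),
             absent  = c(total - healthy, total - cancer))
print(tab)

# One-sided test: are the odds of presence lower in healthy than in cancer?
res <- fisher.test(tab, alternative = "less")
res$p.value
```

This is the same matrix that rbind(x[1:2], x[3]-x[1:2]) builds inside the apply() call, just written out for a single probe.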
You can also do the reverse: if your first column is cancer and the second is healthy, use alternative = 'greater'. It is essentially the same test, but you are asking whether the enrichment is significantly greater in cancer than in healthy. Below, I have reversed the order of the columns in the function and used alternative = 'greater':
> ht$pvalues <- apply(ht,1,function(x) fisher.test(rbind(x[c(2,1)],x[3]-x[c(2,1)]), alternative = "greater")$p.value)
> ht[16,]
          healthy cancer   x      pvalues
(AGGGGG)n       5    171 179 1.373839e-84
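If you want to convince yourself that the two calls are the same test, here is a small sketch on the row 16 counts alone. The names tab_hc, tab_ch, p_less and p_greater are made up for this example.

```r
# Row 16 counts as a 2x2 table; columns are healthy, cancer.
tab_hc <- rbind(c(5, 171),    # probe present
                c(174, 8))    # probe absent (179 - present)

# Same table with the columns swapped; columns are cancer, healthy.
tab_ch <- tab_hc[, c(2, 1)]

p_less    <- fisher.test(tab_hc, alternative = "less")$p.value
p_greater <- fisher.test(tab_ch, alternative = "greater")$p.value

# Swapping the columns and flipping the alternative gives the same one-sided p-value.
all.equal(p_less, p_greater)
```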
The p-value is for the cancer vs. healthy comparison, assuming the total probes are represented by RepeatMasker. There's no way to get a p-value for "just Huh7", because that's not a coherent concept.
Is there a way to get the names of the probes that are significant in cancer?
They're the row names.
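As a rough sketch of pulling them out, assuming the same layout as ht above: only the (AGGGGG)n row below comes from this thread, the two toy_repeat rows are invented so that the example runs on its own, and sig_names is a made-up variable name.

```r
# Toy data frame. Only "(AGGGGG)n" is from the thread; the toy_repeat rows are made up.
ht <- data.frame(healthy = c(5, 90, 60),
                 cancer  = c(171, 88, 62),
                 x       = c(179, 179, 179),
                 row.names = c("(AGGGGG)n", "toy_repeat_A", "toy_repeat_B"))

# Same per-row one-sided Fisher test as in the answer.
ht$pvalues <- apply(ht, 1, function(x)
  fisher.test(rbind(x[1:2], x[3] - x[1:2]), alternative = "less")$p.value)

# Row names of the probes with p < 0.05.
sig_names <- rownames(ht)[ht$pvalues < 0.05]
sig_names
```

With many probes you would probably want to think about multiple testing before applying a fixed 0.05 cutoff, but the row-name lookup stays the same.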
It'd be helpful if you gave an example of exactly what you're doing, including the code. A line or two of the data would help too.
I have got the names of the significant ones, and I would like the ones that are in cancer and in healthy, to see if there is a family that is potentially significant in leading to cancer.