7.5 years ago
niu.shengyong
I have a problem with the volcano plot in cummeRbund. As the attached image shows, many points with infinite values pile up along the borders and at the bottom of the plot. Is there any way to avoid this? (Attached image: volcano plot)
I've checked a few pages on this, but adding the "pseudocount=0.0001" argument to the csVolcano function doesn't help. Do you have any advice? Thanks!
Sincerely, Simon
Can you show the code where you add the pseudocount? It is usually +1.0, and it must be added before the log transformation, not after.
Also, print the matrix that is created and extract the infinite values. Then trace those values back to the original input values from before normalising. The inputs may be "NA" or something else unexpected.
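To illustrate why the order matters, here is a minimal sketch in plain Python (not cummeRbund code). The function name `log2_fold_change` and its `pseudocount` parameter are made up for this example; the point is only that the constant is added to both expression values before the ratio and the log are taken, so a zero FPKM no longer produces an infinite fold change.

```python
import math

def log2_fold_change(value_1, value_2, pseudocount=1.0):
    """log2(value_2 / value_1), with the pseudocount added to both values first.

    Hypothetical helper for illustration; it is not part of cummeRbund.
    """
    return math.log2((value_2 + pseudocount) / (value_1 + pseudocount))

# A zero in either sample would otherwise give log2(0) or a division by zero.
up = log2_fold_change(0.0, 8.0)     # log2(9 / 1)
down = log2_fold_change(3.0, 0.0)   # log2(1 / 4)
both_zero = log2_fold_change(0.0, 0.0)

print(up, down, both_zero)
```

Adding the constant after the log has been taken would not help, because the infinite value already exists at that point.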
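One way to do that tracing outside of R is sketched below with the Python standard library. It assumes a tab-separated table with the usual Cuffdiff column names (`gene_id`, `value_1`, `value_2`, `log2(fold_change)`); check these against your own gene_exp.diff header. The function name `find_nonfinite_rows` and the toy rows are invented for the example.

```python
import csv
import io
import math

def find_nonfinite_rows(handle, column="log2(fold_change)"):
    """Return the rows whose `column` is infinite, NaN, or not a number at all."""
    flagged = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            value = float(row[column])      # also parses "inf", "-inf", "nan"
        except (TypeError, ValueError):     # e.g. "NA" or a missing field
            flagged.append(row)
            continue
        if not math.isfinite(value):
            flagged.append(row)
    return flagged

# Toy input with made-up values; replace with open("gene_exp.diff", newline="").
toy = io.StringIO(
    "gene_id\tvalue_1\tvalue_2\tlog2(fold_change)\tq_value\n"
    "g1\t10\t40\t2\t0.001\n"
    "g2\t0\t5\tinf\t0.01\n"
    "g3\t7\t0\t-inf\t0.02\n"
    "g4\t0\t0\tNA\t1\n"
)

flagged = find_nonfinite_rows(toy)
for row in flagged:
    # Show the original expression values next to the problematic fold change.
    print(row["gene_id"], row["value_1"], row["value_2"], row["log2(fold_change)"])
```

In the toy data every flagged row has a zero in `value_1` or `value_2`, which is the pattern to look for in the real file.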
You can create the plot on your own. Just pull out the adj.Pval and the log2FC, plot their distributions first, and look at the value limits. Then it should not be a problem. Otherwise, you can always shift the values by a constant for display purposes. Use something like the code below. You have NA values for p.adjusted, so that's the problem. If this is only for visualization, you probably don't need those rows anyway. Get rid of them and plot.
First: extract the ID, adj.Pval and log2FC from cuff_data into a dataframe called "df"
Second: remove all lines where log2FC=NA
Third: follow the code vchris posted here.
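The first two steps could look roughly like this if done on gene_exp.diff directly, using only the Python standard library. This is a sketch under assumptions: the column names `gene_id`, `log2(fold_change)` and `q_value` are the usual Cuffdiff ones, `q_value` is taken as the adjusted p-value, and the helper names `parse_finite` and `build_df` are invented here. Rows with infinite values are dropped together with the NA rows, since both break the plot.

```python
import csv
import io
import math

def parse_finite(text):
    """Return text as a finite float, or None for NA, inf, nan, or missing."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return value if math.isfinite(value) else None

def build_df(handle):
    """Keep only ID, adjusted p-value and log2FC for rows where both are finite."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        log2fc = parse_finite(row.get("log2(fold_change)"))
        adj_pval = parse_finite(row.get("q_value"))
        if log2fc is None or adj_pval is None:
            continue  # step two: drop NA (and infinite) rows
        rows.append({"gene_id": row["gene_id"], "log2FC": log2fc, "adj_pval": adj_pval})
    return rows

# Toy input with made-up values; replace with open("gene_exp.diff", newline="").
toy = io.StringIO(
    "gene_id\tlog2(fold_change)\tp_value\tq_value\n"
    "g1\t2.0\t0.0001\t0.001\n"
    "g2\tinf\t0.002\t0.01\n"
    "g3\tNA\t1\tNA\n"
    "g4\t-1.5\t0.03\t0.2\n"
)

df = build_df(toy)
for record in df:
    print(record)
```

The resulting list of records can be written back out as a tab-separated file and loaded into R for plotting.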
Should I use the file "gene_exp.diff" and extract the columns "gene_id", "gene", "sample1", "sample2", "log2(fold_change)", and "p_value"? Thanks!
Yes, gene_exp.diff provides the gene-level differential FPKM, produced after Cufflinks tests for differences in the summed FPKM of the transcripts sharing each gene_id.
Take a look here for the structure of the gene_exp.diff file.
Reliably extracting what you need requires knowledge of a programming language (perl, python) or terminal commands (cut -f). If you do not have that ability, you can use a spreadsheet application like Excel, but prepare to get EXCELLED!!
Oh, that's some handy code there.
I could remove the infinite points successfully, but now all of the points end up grey. How can I solve this? Here is my code:
Here is the plot: volcano plot
Sorry, I'm pretty new to this area. How can I extract the ID, adj.Pval and log2FC from cuff_data into a dataframe called "df", and also remove all lines where log2FC=NA? I appreciate it!
Simon
Here is my code:
How could I add this before the normalization and log functions?
Thanks!
Please also include the code where "pseudocount=0.0001" is added.
Thanks!