Filtering out genes in an RNA-seq experiment
9.4 years ago
ashkan ▴ 160

Hi Guys

I have a set of RNA-seq data. So far I have calculated the raw read counts for each gene in each sample, so I have a matrix in which the columns are samples and the rows are genes. Now I want to filter out some of the genes to reduce the false positive rate. Could you please let me know how to do the filtering?

Actually, I have tried counts per million (CPM), calculated for every gene in every sample, but I don't know how to determine the best cutoff value. For example, can I say that a gene must be removed if its read count is 2 or less in at least 10 samples?

Thanks,
Behzad

RNA-Seq • 5.2k views

Filtering is generally performed on the adjusted p-values and fold-changes. Have you used edgeR/DESeq2/etc. to calculate that yet?


@Devon: I have not done the DE analysis yet. Before that, I want to remove genes that are not expressed. As you know, even genes that are not expressed have a few read counts.

So I want to filter out these genes.


Just do independent filtering after the fact (if you use DESeq2, this is automatic).
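To make the idea of independent filtering concrete, here is a rough sketch in plain Python rather than R. It drops genes with a low mean count across all samples (a statistic that ignores the condition labels) and only then applies the Benjamini-Hochberg adjustment to the remaining p-values. The names `bh_adjust` and `independent_filter`, the fixed cutoff `min_mean=5.0`, and the toy numbers are all assumptions for illustration. DESeq2 picks its threshold automatically, so this is not what it does internally.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down so a running minimum keeps
    # the adjusted values monotone.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def independent_filter(mean_counts, pvalues, min_mean):
    """Drop genes with mean count below min_mean, then adjust the rest."""
    kept = [g for g in pvalues if mean_counts[g] >= min_mean]
    adjusted = bh_adjust([pvalues[g] for g in kept])
    return dict(zip(kept, adjusted))


# Toy inputs, invented for illustration only.
mean_counts = {"g1": 250.0, "g2": 0.5, "g3": 120.0, "g4": 1.2, "g5": 80.0}
pvalues = {"g1": 0.001, "g2": 0.8, "g3": 0.02, "g4": 0.6, "g5": 0.04}

filtered = independent_filter(mean_counts, pvalues, min_mean=5.0)
unfiltered = dict(zip(pvalues, bh_adjust(list(pvalues.values()))))

for gene in filtered:
    print(gene, round(unfiltered[gene], 4), round(filtered[gene], 4))
```

Because fewer tests enter the correction, the surviving genes get smaller adjusted p-values than they would without the filter. That is the reason to filter on something like the mean count instead of on the test result itself.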

9.4 years ago
alolex ▴ 960

You can use the R function varFilter() from the genefilter package to remove genes that show little variation across samples. This removes most non-expressed genes from your list (with the default settings it drops the least variable half, which is why it usually cuts mine by half). If you are using a package like DESeq2, I think it does comparable filtering for you, so there is no need to run varFilter() beforehand. Also, DESeq2 can shrink the estimated fold changes of genes with low read counts, since low counts tend to produce inflated fold-change estimates, so you shouldn't have to worry about low counts when using DESeq2.
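If you do want to pre-filter by expression level yourself, as in the question, a rule of the form "keep a gene only if its CPM is above a cutoff in at least k samples" can be sketched like this. It is plain Python, not R, and the function names, the cutoffs (`min_cpm=1.0`, `min_samples=2`) and the toy counts are placeholders I made up, not recommended values. It also assumes every sample has a non-zero total count.

```python
def cpm(counts):
    """Convert raw counts (gene -> list of per-sample counts) to counts per million."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total reads assigned to genes in each sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_by_cpm(counts, min_cpm, min_samples):
    """Keep genes whose CPM exceeds min_cpm in at least min_samples samples."""
    cpms = cpm(counts)
    return {
        gene: counts[gene]
        for gene, row in cpms.items()
        if sum(value > min_cpm for value in row) >= min_samples
    }


# Toy count matrix, invented for illustration (4 samples, 1,000,000 reads each).
counts = {
    "geneA": [500, 480, 510, 495],
    "geneB": [0, 1, 0, 2],
    "geneC": [300000, 310000, 290000, 305000],
    "geneD": [699500, 689519, 709490, 694503],
}

kept = filter_by_cpm(counts, min_cpm=1.0, min_samples=2)
print(sorted(kept))
```

A common way to choose `min_samples` is to set it to the size of your smallest experimental group, so that a gene expressed in only one condition is not thrown away.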
