How to match gene names between two datasets with different annotation versions?
6.1 years ago
Wayne Lee ▴ 10

Hello everyone!

If I have two gene expression datasets that were annotated with different gene annotation versions, how can I match gene names between the two datasets?

Thanks, Wayne

RNA-Seq gene annotation • 1.7k views

Please post some example data from both datasets and an example of the output you need.


For instance, there are two gene expression datasets in which each row is a gene and each column is a sample, and I want to merge the two datasets by row. Because they use different gene annotation versions, some rows do not match, i.e. some gene names in dataset 1 are not in dataset 2, and vice versa. How should I deal with these mismatched genes? Should I just remove them, or is there some processing that reduces the information loss?

Thanks


Please be more specific. What are the genome versions? What are the annotation versions? Where did you obtain this count data?


Wayne Lee: Another thing to consider is whether the two datasets came from different experimental techniques or origins. If they did, you should not merge them as is.

6.1 years ago
h.mon 35k

It is more complex than just matching gene names between versions. Even for the same underlying genome version, genes can be retired, there can be new genes, and some gene versions may have different genomic coordinates. Ideally, all datasets should be quantified using the same genome and annotation versions.
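If re-quantifying is not possible, a first step is at least to see how large the mismatch is before deciding what to drop. The snippet below is only a rough sketch of that check, not a recommended pipeline: it assumes each dataset is already loaded as a mapping from gene name to a list of per-sample values, and the function names, gene names, and numbers are all made up for illustration. It does not handle renamed or retired genes, and it does not address whether the two datasets are comparable in the first place.

```python
def split_gene_sets(expr_a, expr_b):
    """Report which gene names are shared and which occur in only one dataset."""
    genes_a, genes_b = set(expr_a), set(expr_b)
    return {
        "shared": sorted(genes_a & genes_b),
        "only_a": sorted(genes_a - genes_b),
        "only_b": sorted(genes_b - genes_a),
    }


def merge_shared(expr_a, expr_b):
    """Keep only genes present in both datasets and concatenate their samples."""
    shared = split_gene_sets(expr_a, expr_b)["shared"]
    return {gene: expr_a[gene] + expr_b[gene] for gene in shared}


# Hypothetical toy data: gene name -> expression values, one per sample.
dataset_1 = {"TP53": [10, 12], "BRCA1": [5, 7], "OLDGENE": [1, 0]}
dataset_2 = {"TP53": [9], "BRCA1": [6], "NEWGENE": [3]}

report = split_gene_sets(dataset_1, dataset_2)
merged = merge_shared(dataset_1, dataset_2)

print("shared:", report["shared"])
print("only in dataset 1:", report["only_a"])
print("only in dataset 2:", report["only_b"])
```

Looking at the "only in" lists tells you whether the mismatch is a handful of genes or a large fraction of the table. Keeping only the shared names is the simple option, but a name appearing in both tables does not guarantee it refers to the same gene model in both annotation versions.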
