Losing genes when merging several RNAseq datasets


3.2 years ago

BlueSky ▴ 10

I'm trying to merge several RNAseq datasets. The datasets are already preprocessed (uniformly processed) and filtered, so they contain different numbers of probe IDs. When I use the merge() function, I get only the probe IDs they have in common, so I lose information from the datasets that contain more probe IDs. How do I deal with that?

RNAseq merging • 1.3k views

ADD COMMENT • link 3.2 years ago by BlueSky ▴ 10


In the merge() function, add the argument all = TRUE.
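As a minimal sketch of the difference, assuming the datasets are R data frames that share a gene identifier column (the column name `gene_id`, the sample columns, and all values below are made up for illustration):

```r
# Two toy expression tables with partly overlapping gene IDs (made-up values).
ds1 <- data.frame(gene_id = c("geneA", "geneB", "geneC"),
                  sample1 = c(10, 20, 30))
ds2 <- data.frame(gene_id = c("geneB", "geneC", "geneD"),
                  sample2 = c(5, 15, 25))

# Default merge(): inner join, keeps only the gene IDs present in both tables.
common <- merge(ds1, ds2, by = "gene_id")

# all = TRUE: full outer join, keeps every gene ID and fills the gaps with NA.
full <- merge(ds1, ds2, by = "gene_id", all = TRUE)

# For more than two datasets, the same call can be folded over a list.
ds3 <- data.frame(gene_id = c("geneA", "geneD", "geneE"),
                  sample3 = c(1, 2, 3))
merged_all <- Reduce(function(x, y) merge(x, y, by = "gene_id", all = TRUE),
                     list(ds1, ds2, ds3))
```

Here `common` keeps only the shared genes, while `full` and `merged_all` keep every gene and mark the values a dataset does not provide as NA.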

ADD REPLY • link 3.2 years ago by andres.firrincieli 3.8k


Thanks for answering, but what do I then do with the cells that will be filled with NA? Do I just set them to 0, or would that be wrong?

ADD REPLY • link 3.2 years ago by BlueSky ▴ 10


The answer depends on the questions you will be asking of the data set. Setting NA to 0 makes the merged data set misleading for any question where 0 might be a meaningful answer, whereas keeping NA might prevent you from asking questions about certain genes. The caveats are yours to compose, as long as you can carry them along, communicate them, and not mislead yourself or others by them.
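To make the trade-off concrete, here is a small illustrative sketch in R with made-up values for one gene. It assumes the last two samples come from a dataset where the gene had been filtered out, so they are NA after the merge:

```r
# Toy values for one gene across four samples; the last two are NA because
# the gene was absent from the dataset those samples came from.
expr <- c(8.1, 7.9, NA, NA)

# Keeping NA: summaries must skip the missing samples explicitly.
mean_na <- mean(expr, na.rm = TRUE)  # uses only the two observed samples
var_na <- var(expr, na.rm = TRUE)

# Replacing NA with 0: the gene now looks unexpressed in two samples.
expr_zero <- expr
expr_zero[is.na(expr_zero)] <- 0
mean_zero <- mean(expr_zero)
var_zero <- var(expr_zero)

# With these toy numbers the zero-filled gene has a lower mean and a much
# larger variance, a difference created by the imputation, not by biology.
```

In this toy case the zero-filled gene would rank as highly variable across datasets even though nothing was measured for it in two of the samples.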

ADD REPLY • link 3.2 years ago by seidel 11k


I want to do PCA and hierarchical clustering (preferably with the top DEGs; I suppose the 0s will cause problems in the heatmap?). Would it be better to use only the common genes and drop the ones that are not shared?
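For reference, the common-genes option could be sketched in R roughly as follows. The matrices, gene names, and sample names are hypothetical toy data, and the sketch assumes genes in rows and samples in columns:

```r
set.seed(1)  # reproducible toy data

# Two toy matrices (genes x samples) with partly overlapping gene IDs.
m1 <- matrix(rnorm(5 * 3), nrow = 5,
             dimnames = list(paste0("gene", 1:5), paste0("d1_s", 1:3)))
m2 <- matrix(rnorm(4 * 3), nrow = 4,
             dimnames = list(paste0("gene", 3:6), paste0("d2_s", 1:3)))

# Restrict both matrices to the genes they share, then bind the samples.
common_genes <- intersect(rownames(m1), rownames(m2))
expr <- cbind(m1[common_genes, ], m2[common_genes, ])

# prcomp() expects samples in rows, hence t(expr); the matrix is complete
# because only shared genes were kept.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Hierarchical clustering of the samples on Euclidean distance.
hc <- hclust(dist(t(expr)), method = "average")
```

Because only shared genes are kept, the combined matrix contains no NA, so no values have to be imputed before the PCA or the clustering.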

ADD REPLY • link 3.2 years ago by BlueSky ▴ 10


Hello there,

can you please tell us a little more about the circumstances? E.g. show an example of your problem and clarify which software you use (R, Python, Java, etc.).

ADD REPLY • link 3.2 years ago by Olli • 0
