Hello,
I have created two non-redundant gene catalogs from different ecosystems: one marine and one terrestrial. These catalogs were generated using de novo assembly, gene prediction, and clustering. I then calculated the gene abundances for the corresponding samples based on their respective catalogs.
Now I want to compare the two ecosystems using the samples' gene abundances. However, each sample's abundances are currently calculated relative to its own catalog. Both sets are scaled to 1 million, but I think this is a problem because the catalogs differ in size, isn't it?
Do you have any suggestions on how to approach this comparison?
Thank you very much.
You could make a new assembly of the combined marine + terrestrial sequence reads, so that every sample is quantified against the same catalog. Alternatively, you could cross-map the reads between the marine and terrestrial assemblies to find the genes that are present in both.
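If you go the cross-mapping route, one possible follow-up is to restrict each sample to the shared genes and rescale to 1 million again, so that both ecosystems use the same denominator. Below is a minimal sketch of that idea. It assumes you already have raw counts per gene and a common identifier for the genes found in both catalogs; the gene names, counts, and the function names `per_million` and `shared_gene_abundance` are hypothetical and only there for illustration.

```python
def per_million(counts):
    """Scale raw counts so that they sum to 1,000,000."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no counts to normalize")
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}


def shared_gene_abundance(counts, shared_genes):
    """Keep only the genes in shared_genes, then rescale to per-million."""
    # Genes missing from a sample are treated as zero counts.
    subset = {gene: counts.get(gene, 0) for gene in shared_genes}
    return per_million(subset)


# Hypothetical toy data: raw counts for one sample per ecosystem.
marine = {"geneA": 30, "geneB": 10, "marine_only": 60}
soil = {"geneA": 5, "geneB": 15, "soil_only": 180}

# Assumed output of the cross-mapping step: genes present in both catalogs.
shared = {"geneA", "geneB"}

marine_cpm = shared_gene_abundance(marine, shared)
soil_cpm = shared_gene_abundance(soil, shared)

for gene in sorted(shared):
    print(gene, marine_cpm[gene], soil_cpm[gene])
```

Keep in mind that the rescaled values are then relative to the shared gene set only, so they say nothing about the genes unique to either ecosystem.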