I have a dataset with overlapping samples. We did metagenomic sequencing for one sample (I think it may actually be a co-assembly of multiple samples, but I'm not sure), and we also have metatranscriptomics for various related samples. I assembled the metagenomic reads with metaSPAdes and called ORFs on the assembly with Prodigal. I then mapped my metatranscriptomic reads to the metagenomic assembly, but a good amount of them didn't map. I'm now running rnaSPAdes on the metatranscriptomic reads that didn't map to the metagenomic assembly.
My question is the following:
Can (or should) I use TransDecoder (my preferred de novo transcript ORF caller) to call ORFs on my de novo transcripts from rnaSPAdes and use them in the same "dataset" as my Prodigal ORFs from the metagenomic assembly?
Or should I just use Prodigal on my rnaSPAdes assembly?
The final product if I do the former would be:
Prodigal ORF calls from the metagenomics assembly with reads mapped from metatranscriptomics
TransDecoder ORF calls from metatranscriptomics assembly with reads mapped from metatranscriptomics
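To make the combined "dataset" concrete, here is a minimal sketch of how the two ORF sets could be merged while keeping track of which caller produced each ORF. Everything in it is an assumption for illustration: the function names `read_fasta` and `merge_orf_sets`, the `source|id` naming scheme, and the example IDs are hypothetical, and it only assumes both callers write ordinary FASTA where the ORF ID is the first whitespace-delimited token of the header.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (id, sequence) pairs from FASTA-formatted lines.

    The id is the first whitespace-delimited token of the header, so any
    extra caller-specific annotation after the id is dropped.
    """
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].split()[0]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_orf_sets(sources: Dict[str, Iterable[str]]) -> Dict[str, Tuple[str, str]]:
    """Merge ORF sets from several callers into one dictionary.

    Each key is "<source>|<orf_id>" (a hypothetical naming scheme) and each
    value is (source, sequence), so the caller of every ORF stays visible
    in downstream tables.
    """
    merged: Dict[str, Tuple[str, str]] = {}
    for source, lines in sources.items():
        for orf_id, seq in read_fasta(lines):
            new_id = f"{source}|{orf_id}"
            if new_id in merged:
                raise ValueError(f"duplicate ORF id: {new_id}")
            merged[new_id] = (source, seq)
    return merged


if __name__ == "__main__":
    # Toy records standing in for the real Prodigal and TransDecoder output.
    prodigal_lines = [">contig1_1 # 1 # 9", "ATGAAA", "TAA"]
    transdecoder_lines = [">tx1.p1 type:complete", "ATGCCCTGA"]
    merged = merge_orf_sets(
        {"prodigal": prodigal_lines, "transdecoder": transdecoder_lines}
    )
    for orf_id, (source, seq) in merged.items():
        print(orf_id, source, len(seq))
```

With a source label carried on every ORF, any downstream count table could be split or compared by caller, which is one way to check whether the two sets behave differently.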
My worry is that the different technologies might introduce different biases when I merge them into the same dataset, although the same data won't be going into both.