How to add a refseq() slot to a MAG-based phyloseq object?
17 months ago

Hello, I was asked to build a phyloseq object from some metagenome-assembled genomes (MAGs) that I assembled. So far I have created a tax_table() slot from the GTDB-Tk classification of the MAGs and an otu_table() slot from CoverM output, and I have added a metadata table for the collected samples.

Now the goal is to build the refseq() slot so that all the MAG FASTA files are stored in the phyloseq object. I know that the phyloseq utilities were built for ASVs/OTUs, but I want to know whether it is possible to add MAG FASTA files to the phyloseq object.

So, is there a way to add the MAG files to the phyloseq refseq() slot? Also, given that each MAG FASTA consists of fragmented contigs, should I first concatenate these contigs, using something like \n as a separator, before adding them to refseq() (if that is possible at all)?

So far I have read all the MAG FASTA files into a list of DNAStringSet objects, one per file:

mags_seqs <- lapply(Sys.glob("*.fa"), Biostrings::readDNAStringSet)
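
For illustration, here is a minimal sketch of one way the contigs could be collapsed into a single named sequence per MAG so that the result is one DNAStringSet. It assumes one FASTA file per MAG whose file name (minus ".fa") matches the taxa names in otu_table() and tax_table(). The helper join_contigs() and the 100-N spacer are hypothetical choices, not a phyloseq convention; a literal \n is not a valid DNA letter, so N is used as the separator instead. The merge_phyloseq() call is left commented out because `ps` stands in for the existing phyloseq object.

```r
library(Biostrings)
library(phyloseq)

# Assumption: one FASTA per MAG; file name without ".fa" equals the taxon name
fasta_files <- Sys.glob("*.fa")
mag_ids <- sub("\\.fa$", "", basename(fasta_files))

# Hypothetical helper: join all contigs of one MAG into a single string,
# separated by a run of Ns so contig boundaries stay visible
join_contigs <- function(contigs, spacer = strrep("N", 100)) {
  paste(as.character(contigs), collapse = spacer)
}

# One concatenated sequence per MAG, named by MAG ID
mag_strings <- vapply(
  fasta_files,
  function(f) join_contigs(readDNAStringSet(f)),
  character(1)
)
names(mag_strings) <- mag_ids
mag_refseq <- DNAStringSet(mag_strings)

# `ps` is a placeholder for the existing phyloseq object; taxa whose names
# do not match names(mag_refseq) would be dropped by the merge
# ps <- merge_phyloseq(ps, mag_refseq)
```

Whether downstream phyloseq functions behave sensibly with megabase-length reference sequences is untested here, so treat this as a starting point only.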

Best,

Valentín.

phyloseq fasta R MAG • 904 views

So, is there a way to add the MAG files to the phyloseq refseq() slot? Also, given that each MAG FASTA consists of fragmented contigs, should I first concatenate these contigs, using something like \n as a separator, before adding them to refseq() (if that is possible at all)?

This is just a suggestion:

The output of the GTDB-Tk analysis should include a folder called align, containing a file named gtdbtk.bac120.user_msa.fasta.gz.
Why don't you store the concatenated marker genes of each MAG from that MSA instead of the entire genome sequence?
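
A rough sketch of what that could look like, under a few assumptions: the path below follows the folder and file name mentioned above and may differ in your run, the bac120 alignment is protein and is therefore read with readAAStringSet(), and the alignment names match the taxa names of the phyloseq object. The helper match_msa_to_taxa() is hypothetical, and `ps` is a placeholder for the existing object. As far as I know the refseq() slot accepts any XStringSet, but I have not verified this with amino acid sequences, so the merge is left commented out.

```r
library(Biostrings)
library(phyloseq)

# Assumption: file location as described above; adjust to your GTDB-Tk output
msa_file <- "align/gtdbtk.bac120.user_msa.fasta.gz"

# Hypothetical helper: keep only alignment rows whose names are in `taxa`
match_msa_to_taxa <- function(msa, taxa) {
  msa[names(msa) %in% taxa]
}

if (file.exists(msa_file)) {
  # The marker gene alignment is amino acid, not DNA
  msa <- readAAStringSet(msa_file)
  # `ps` is a placeholder for the existing phyloseq object
  # ps <- merge_phyloseq(ps, match_msa_to_taxa(msa, taxa_names(ps)))
}
```

Note that the user MSA only covers genomes for which GTDB-Tk found enough marker genes, so some MAGs may be missing from it.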

