4 months ago
l.gallucci
Hi all,
I'm trying to build a pipeline for microbial transcriptome data (Illumina 2x150 bp).
So far my work has been based on standard QC and trimming with Trimmomatic.
Then I moved on to SortMeRNA to separate rRNA from non-rRNA reads. I'm a bit confused about how to proceed from here to get taxonomy and functional genes. I ran phyloFlash (on the trimmed reads, before SortMeRNA), but that step is also not really clear to me. I get an assembly with SPAdes and a taxonomy, but I'm a bit lost on how to extract the taxonomy so I can plot it externally, outside the phyloFlash HTML report.
Hello,
I would suggest aligning your trimmed reads to a non-coding RNA database such as Rfam, keeping only the unmapped reads, assembling those, and then running gene prediction on the assembly.
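As a rough illustration of the "keep the unmapped reads" step, here is a minimal Python sketch that filters SAM records so only pairs with both mates unmapped remain. It assumes your aligner wrote standard SAM text, where the FLAG is the second column and bits 0x4 and 0x8 mark "read unmapped" and "mate unmapped". The function names are made up for this example; in practice a dedicated tool for SAM/BAM filtering would normally do this job.

```python
from typing import Iterable, Iterator

# SAM FLAG bits defined by the SAM specification
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def both_mates_unmapped(flag: int) -> bool:
    """True when the read and its mate are both flagged as unmapped."""
    return bool(flag & READ_UNMAPPED) and bool(flag & MATE_UNMAPPED)


def unmapped_pairs(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield SAM alignment lines whose read and mate are both unmapped.

    Header lines (starting with '@') and lines with fewer than the
    11 mandatory SAM fields are skipped.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue
        record = line.rstrip("\n")
        fields = record.split("\t")
        if len(fields) < 11:
            continue
        if both_mates_unmapped(int(fields[1])):
            yield record
```

Requiring both mates to be unmapped is the conservative choice here: a pair is dropped as soon as either mate hits the non-coding database. Whether that is the right threshold for your data is a judgment call.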
For taxonomic classification, you can classify the predicted gene sequences against the NT database using Kraken2 or Centrifuge. For functional annotation, you can then search them against UniProt protein sequences or the NR database.
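To get the taxonomy into a table you can plot outside any HTML report, something like the following sketch may help. It assumes you have a standard six-column, tab-separated Kraken2 report (percentage, clade reads, direct reads, rank code, NCBI taxid, indented name); reports written with extra minimizer columns would need different column indices. The function names are hypothetical and only for illustration.

```python
import csv
from typing import Iterable, List, TextIO, Tuple


def parse_kraken_report(lines: Iterable[str]) -> List[dict]:
    """Parse a standard six-column Kraken2 report into a list of dicts."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        pct, clade, direct, rank, taxid, name = fields[:6]
        rows.append({
            "percent": float(pct),
            "clade_reads": int(clade),
            "direct_reads": int(direct),
            "rank": rank.strip(),
            "taxid": taxid.strip(),
            "name": name.strip(),  # drop the indentation that encodes tree depth
        })
    return rows


def rank_table(rows: List[dict], rank: str = "G") -> List[Tuple[str, int]]:
    """Return (name, clade_reads) for one rank code, most abundant first."""
    selected = [(r["name"], r["clade_reads"]) for r in rows if r["rank"] == rank]
    return sorted(selected, key=lambda item: item[1], reverse=True)


def write_rank_csv(rows: List[dict], handle: TextIO, rank: str = "G") -> None:
    """Write the table for one rank as CSV to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["name", "clade_reads"])
    writer.writerows(rank_table(rows, rank))
```

For example, opening the report file, passing it to `parse_kraken_report`, and calling `write_rank_csv(rows, out_handle, rank="G")` would give a genus-level table that R or any plotting library can read directly.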
Hope it helps!