Hi everyone,
I'm a wet lab biologist transitioning to bioinformatics. I've just started working in a lab and one of my first tasks is to handle some unused whole-genome data from a few years ago, generated with 10x Genomics technology. The goal is to perform de novo assembly and eventually annotation.
From my research, the main option for assembly seems to be the 10x Genomics Supernova software. I have a couple of questions:
- Are there other programs or open-source tools that can work easily with 10x Genomics data for de novo assembly?
- The lab's PCs have at most 128 GB of RAM, but the Supernova documentation recommends at least 256 GB. I'm considering running the assembly on an AWS EC2 instance and then performing the annotation on my local workstation. Does this approach make sense, or is there a better way to handle it? Any advice or insights would be greatly appreciated. This is my first time doing de novo assembly, working with 10x data, or using AWS.
Thanks for the help!
10x stopped supporting the "linked read" technology 4 years ago, so you are going to have to work with Supernova and hope that it still runs on a current operating system (libraries etc. can be an issue). You can try your luck with your lab computer first: depending on the genome you are working with, 128 GB may be enough; I think the 256 GB recommendation is for human-sized genomes. I don't recall whether this data can be used on its own to generate a complete de novo genome assembly. It was mainly meant to improve local regions/phasing of genomes. Hope your data is good quality.
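One practical knob for keeping memory in check: the Supernova docs recommend capping input with `--maxreads` so raw coverage lands around the supported range (roughly 38-56x), and the value is just target coverage × genome size ÷ read length. A minimal sketch of that arithmetic, assuming 150 bp reads and an illustrative 2.5 Gb genome (both numbers are assumptions, not from this thread):

```python
def supernova_maxreads(genome_size_bp, target_coverage=56, read_length_bp=150):
    """Rough --maxreads value for 'supernova run': the read count needed
    so that reads * read_length ~= target_coverage * genome_size.
    Defaults (56x, 150 bp) are assumptions based on the Supernova docs."""
    return int(target_coverage * genome_size_bp / read_length_bp)

# Hypothetical 2.5 Gb genome at 56x raw coverage
print(supernova_maxreads(2_500_000_000))  # 933333333
```

If the resulting read count already exceeds what the dataset contains, coverage (not RAM) may be the limiting factor.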
Thank you for your answer!
I'm currently in the process of estimating the genome size, but I expect it to be in the 2-3 GB range. It's good to know that Supernova is the (only) way to go for de novo assembly with 10x Genomics data.
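For the genome-size estimate, the usual k-mer route (e.g. Jellyfish plus GenomeScope) boils down to: count k-mers, locate the main coverage peak in the histogram, and divide the total k-mer count by that peak depth. A minimal sketch of just that arithmetic on a made-up histogram (all numbers here are illustrative, not real data):

```python
def genome_size_from_histogram(histo, min_depth=5):
    """Estimate genome size from a k-mer histogram of (depth, count)
    pairs: total k-mers observed / depth of the main coverage peak.
    Depths below min_depth are skipped as likely sequencing errors."""
    usable = [(d, c) for d, c in histo if d >= min_depth]
    peak_depth = max(usable, key=lambda dc: dc[1])[0]   # modal coverage
    total_kmers = sum(d * c for d, c in usable)         # k-mers observed
    return total_kmers // peak_depth

# Toy histogram: error k-mers at low depth, main peak around 30x.
histo = [(1, 9_000_000), (2, 1_000_000),
         (29, 400_000), (30, 900_000), (31, 450_000)]
print(genome_size_from_histogram(histo))  # 1751666
```

GenomeScope does this (and more, fitting heterozygosity and repeat content) from the same histogram, so it's worth running on the real data rather than relying on this back-of-the-envelope version.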
I did find a paper where they used 10x Genomics data and Supernova for de novo assembly, followed by MAKER and other tools for annotation, so at least I know it's possible.
Thanks again!