Assembly Question
9 weeks ago
parb2182 ▴ 10

Hello,

First of all thank you for your time and for reading this.

I received some Nanopore data and was tasked with assembling it. First, I used Porechop to trim the reads, keeping them in FASTQ format. Then I used minimap2 and samtools to keep only the good-quality reads, since I was told the sample might contain contamination. I aligned the reads to a reference file that I was given, which is basically the parent of the virus I'm trying to assemble.

I did this by setting the -q (minimum MAPQ) value to 7 because I wanted 80% confidence, and I read that the value is calculated as SCORE = 1 - 10^(-MAPQ/10), which gives roughly 0.8 for a MAPQ of 7. (Please also let me know if this is too low, thank you.)
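For anyone who wants to check that arithmetic, here is a minimal Python sketch of the formula quoted above. The function names are made up for illustration and are not part of samtools or minimap2.

```python
def mapq_to_prob_correct(mapq: float) -> float:
    """Probability that a mapping is correct: SCORE = 1 - 10^(-MAPQ/10)."""
    return 1.0 - 10.0 ** (-mapq / 10.0)


def min_mapq_for(prob_correct: float) -> int:
    """Smallest integer MAPQ whose SCORE is at least prob_correct."""
    if not 0.0 <= prob_correct < 1.0:
        raise ValueError("prob_correct must be in [0, 1)")
    mapq = 0
    # Step through integer MAPQ values until the target probability is reached.
    while mapq_to_prob_correct(mapq) < prob_correct:
        mapq += 1
    return mapq


if __name__ == "__main__":
    print(min_mapq_for(0.8))                     # threshold for an 80% target
    print(round(mapq_to_prob_correct(7), 4))     # SCORE at MAPQ 7
```

With a target of 0.8 this gives a threshold of 7, which matches the value used here.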

After that I used "samtools fasta input.fastq > output.fasta" to convert the filtered viral reads to FASTA.

Then I used Flye to assemble them. I got an assembly.fasta file that contains multiple ">contigs n" entries. I was told that the parent virus genome is around 190 kbp, but the largest contig is only about 60 kbp and the others are smaller.
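To put numbers on that comparison, a small Python sketch like the one below could list the contig lengths and compare them with the expected genome size. It assumes a plain FASTA file; the function names and the toy sequences are hypothetical, and the 190,000 bp figure is just the approximate size mentioned above.

```python
from typing import Dict


def contig_lengths(fasta_text: str) -> Dict[str, int]:
    """Map each FASTA record name to its sequence length in bases."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            parts = line[1:].split()
            name = parts[0] if parts else "unnamed_%d" % (len(lengths) + 1)
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize_assembly(lengths: Dict[str, int], expected_size: int) -> Dict[str, float]:
    """Basic totals, plus the longest contig as a fraction of the expected size."""
    longest = max(lengths.values(), default=0)
    return {
        "n_contigs": len(lengths),
        "total_bp": sum(lengths.values()),
        "longest_bp": longest,
        "longest_fraction": longest / expected_size,
    }


if __name__ == "__main__":
    # Toy input; for real data, read the text from assembly.fasta instead.
    toy = ">contig_1\nACGTACGT\n>contig_2\nACGT\n"
    print(summarize_assembly(contig_lengths(toy), expected_size=190_000))
```

If the total is close to 190 kbp, the genome may just be fragmented; if it is far below, reads are probably missing after filtering.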

I searched for this beforehand but couldn't find anything that tells me what to do next.

I don't know whether I have to assemble this assembly again or somehow try to map the contigs together.

Any help is appreciated.

Question Flye Contig Assembly • 429 views

I don't think anyone will be able to help you with this amount of information. Just from experience, Flye generally doesn't perform well with reads shorter than 5 kb, so you could try filtering down to just the longest reads, enough for 30-40X coverage.

Depending on your contamination possibility (which should be obvious now after your alignment and filtering), you could down-sample before or after this step.
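As a rough illustration of the length-based selection suggested above, here is a minimal Python sketch that works on a list of read lengths. The function name and defaults are hypothetical; the 5 kb cutoff and 40X target come from the suggestion above, and the genome size would be the roughly 190 kbp mentioned in the question.

```python
from typing import List, Tuple


def select_longest_reads(
    read_lengths: List[int],
    genome_size: int,
    target_coverage: float = 40.0,
    min_length: int = 5000,
) -> Tuple[List[int], float]:
    """Pick the longest reads (>= min_length) until target coverage is reached.

    Returns the indices of the kept reads and the coverage they provide.
    """
    target_bases = genome_size * target_coverage
    # Indices of reads passing the length cutoff, longest first.
    candidates = sorted(
        (i for i, n in enumerate(read_lengths) if n >= min_length),
        key=lambda i: read_lengths[i],
        reverse=True,
    )
    kept: List[int] = []
    total = 0
    for i in candidates:
        if total >= target_bases:
            break
        kept.append(i)
        total += read_lengths[i]
    return kept, total / genome_size


if __name__ == "__main__":
    # Toy lengths; real values would come from the filtered FASTQ.
    lengths = [6000, 4000, 10000, 8000, 5000]
    print(select_longest_reads(lengths, genome_size=1000, target_coverage=15))
```

If the returned coverage falls well short of the target, there may not be enough long reads left after filtering to assemble the full genome.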

Good luck.
