10.5 years ago
Abdullah
Hi,
I used MITObim to assemble a mitochondrial genome from Illumina HiSeq paired-end reads.
I first merged the reads into a single FASTQ file with PEAR, then followed the procedure described in the MITObim manual, and everything went well.
The assembled mitogenome looks really good when I align it against a reference genome, but I noticed many "x" and "n" characters in it.
Does anyone have an idea? How can I deal with these unknown base pairs?
Most downstream tools know how to handle these unknowns, so it should not be a problem to just leave them there.
I do not understand what you mean. How can I find out what they are, and what should I use?
You asked "how can I deal with these unknown base pairs". The bottom line is that you do not have to deal with them; most downstream tools can handle them. It is common to have many Ns in assemblies.
Can you give me an example of a downstream tool that can tell me what these Ns and Xs are?
N is the IUPAC code for an unknown base (any of A, C, G, or T), and the Xs are probably masked bases, but you need to check the documentation of your program to confirm this.
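If you want to see where the Ns and Xs sit before deciding whether they matter, a small script can list them. This is only a rough sketch using plain Python: the function names (`parse_fasta`, `ambiguous_runs`, `ambiguous_counts`) are made up for illustration and are not part of MITObim or PEAR, and it assumes your assembly is a FASTA file that fits in memory.

```python
import re
from collections import Counter


def parse_fasta(text):
    """Return {record_name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def ambiguous_runs(seq, symbols="NX"):
    """Return (start, end, symbol) for each run of the given symbols.

    Coordinates are 1-based and inclusive; matching is case-insensitive.
    """
    pattern = re.compile(
        "|".join(re.escape(s) + "+" for s in symbols), re.IGNORECASE
    )
    return [
        (m.start() + 1, m.end(), m.group()[0].upper())
        for m in pattern.finditer(seq)
    ]


def ambiguous_counts(seq, symbols="NX"):
    """Count how often each symbol occurs, ignoring case."""
    counts = Counter(seq.upper())
    return {s: counts.get(s, 0) for s in symbols}


if __name__ == "__main__":
    # Toy record for illustration; replace with the contents of your file,
    # e.g. open("assembly.fasta").read() (hypothetical file name).
    example = ">mito\nACGTNNNACGTxxACGN\n"
    for rec_name, rec_seq in parse_fasta(example).items():
        print(rec_name, ambiguous_counts(rec_seq), ambiguous_runs(rec_seq))
```

Comparing the reported positions with your alignment to the reference should show whether the unknowns cluster in one region or are scattered along the assembly.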