Dealing with unknown bases after assembly

11.2 years ago

Abdullah ▴ 100

Hi,

I used MITObim to assemble a mitochondrial genome from Illumina HiSeq paired-end reads.

I first merged the reads into a single FASTQ file using PEAR, then followed the procedure described in the MITObim manual, and everything went well.

The assembled mitogenome looks really good when I align it to a reference genome, but I noticed many "x" and "n" characters in it.

Does anyone have an idea how I can deal with these unknown bases?

Assembly • 3.2k views

ADD COMMENT • link updated 3.8 years ago by Ram 45k • written 11.2 years ago by Abdullah ▴ 100

Most downstream tools know how to handle these unknowns, so it should not be a problem to just leave them in place.

ADD REPLY • link 11.2 years ago by DoubleDecker ▴ 200

I did not get what you mean. How should I work out what these bases are, and what should I use to do that?

ADD REPLY • link updated 3.8 years ago by Ram 45k • written 11.2 years ago by Abdullah ▴ 100

You asked "how can I deal with these unknown base pairs". The bottom line is that you do not have to deal with them, because most downstream tools can. It is common to have many Ns in assemblies.

ADD REPLY • link 11.2 years ago by DoubleDecker ▴ 200

Can you give me an example of downstream tools that can tell me what these Ns and Xs are?

ADD REPLY • link updated 3.8 years ago by Ram 45k • written 11.2 years ago by Abdullah ▴ 100

N is the IUPAC code for an unknown base (it can stand for any of A, C, G, or T), and the Xs are probably masked bases, but you need to check the documentation of your program to confirm this.
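To see how many of these characters there are and where they sit, something like the following minimal Python sketch could be used. It is only an illustration: the helper name `summarize_ambiguous` and the example sequence are made up here, and it assumes the contig has already been read into a plain string. It does not tell you what the unknown bases are, only where they are.

```python
import re
from collections import Counter


def summarize_ambiguous(seq, symbols="NX"):
    """Count ambiguous symbols in a sequence and locate their runs.

    Hypothetical helper for illustration. Returns a (counts, runs) tuple:
    counts maps each symbol found to its number of occurrences, and runs
    is a list of (start, end, symbol) with 0-based, half-open coordinates.
    Matching is case-insensitive, so "n" and "N" are treated alike.
    """
    upper = seq.upper()
    counts = Counter(ch for ch in upper if ch in symbols)
    # One alternative per symbol, so a run never mixes N and X.
    pattern = "|".join(re.escape(s) + "+" for s in symbols)
    runs = [(m.start(), m.end(), m.group()[0])
            for m in re.finditer(pattern, upper)]
    return dict(counts), runs


# Made-up example sequence, not real assembly output.
contig = "ACGTnnnACGTXXACGN"
counts, runs = summarize_ambiguous(contig)
print(counts)  # {'N': 4, 'X': 2}
print(runs)    # [(4, 7, 'N'), (11, 13, 'X'), (16, 17, 'N')]
```

Long runs near the ends of the contig versus short runs in the middle can then be checked against the alignment to the reference.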

ADD REPLY • link updated 3.8 years ago by Ram 45k • written 11.2 years ago by DoubleDecker ▴ 200
