Do ambiguous nucleotides affect analysis?
4.0 years ago
pfee418 ▴ 10

Hi guys, if ambiguous nucleotides appear in your sequences, do they affect sequence analysis in general? Researchers usually remove ambiguous nucleotides because they introduce noise and affect the analysis. Is that correct?

Or do ambiguous nucleotides carry any information that we shouldn't ignore?

Thank you very much for all the comments in advance :)

ambiguousnucleotides • 1.3k views

Please define sequence analysis.

By sequence analysis I mean coronavirus genome sequence analysis. I ran a multiple sequence alignment (MSA) of the human coronaviruses and saw a lot of ambiguous nucleotides in the alignment. The MSA is one step in a comparative genome analysis that compares human CoV genomes and examines the effects of the differences (indels, substitutions, and conserved regions) identified within the alignment. I'm not sure whether the ambiguous nucleotides should be ignored, because many of them are 'N', meaning the base is unknown, and a small number are Y, R, W, S, K, and M, for which the specific nucleotide is hard to predict.

My plan is to ignore them for now. Once I have identified the differences within the alignment and begun to study the effects of a given genome region, I will go back and check whether ambiguous nucleotides are present in that region. Thank you :)
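
The "look back later" step could be sketched roughly as follows. This is only a minimal illustration in plain Python, assuming the alignment is already loaded as a dict of equal-length strings; the names `alignment`, `ambiguity_report`, and `ambiguous_in_region` are hypothetical and not taken from any library, and the sequences are toy data.

```python
from collections import Counter

# IUPAC ambiguity codes and the bases each one can stand for.
IUPAC_AMBIGUOUS = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def ambiguity_report(alignment):
    """Count ambiguity codes per sequence; gaps and plain bases are skipped."""
    return {
        name: Counter(base for base in seq.upper() if base in IUPAC_AMBIGUOUS)
        for name, seq in alignment.items()
    }

def ambiguous_in_region(alignment, start, end):
    """List (name, 0-based alignment column, code) for columns in [start, end)."""
    hits = []
    for name, seq in alignment.items():
        for col in range(start, min(end, len(seq))):
            code = seq[col].upper()
            if code in IUPAC_AMBIGUOUS:
                hits.append((name, col, code))
    return hits

# Toy alignment for illustration only, not real coronavirus data.
alignment = {
    "seq1": "ATGNNCGTAY",
    "seq2": "ATG-ACGTAC",
    "seq3": "ATGRACGKAC",
}

print(ambiguity_report(alignment))
print(ambiguous_in_region(alignment, 3, 8))
```

A per-sequence tally like this makes it easy to spot genomes dominated by N runs before deciding whether a difference seen in a region of interest is real or just an unresolved base.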

4.0 years ago
Joe 21k

There is no absolute answer to this. Some tools will model ambiguous residues in sophisticated ways, and other tools will simply ignore them entirely. You will need to check how the specific tool you plan to use handles them, and what that implies for whatever scientific question you're asking.
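
To make the point concrete, here is a toy sketch of how the same pair of aligned sequences can give different difference counts depending on how ambiguity codes are treated. The three modes (`strict`, `skip`, `compatible`) and the function `count_differences` are made up for illustration and do not describe the behavior of any particular tool; the code assumes a DNA alphabet with standard IUPAC codes.

```python
# Each symbol maps to the set of bases it may represent (IUPAC nucleotide codes).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def count_differences(seq_a, seq_b, mode):
    """Count differing columns between two aligned sequences.

    strict:     any unequal pair of symbols counts as a difference
    skip:       columns containing an ambiguity code are ignored
    compatible: a difference is counted only if the two symbols
                share no possible base
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gaps are left out of this toy comparison
        if mode == "strict":
            diffs += a != b
        elif mode == "skip":
            if a in "ACGT" and b in "ACGT":
                diffs += a != b
        elif mode == "compatible":
            diffs += not (set(IUPAC[a]) & set(IUPAC[b]))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return diffs

# Toy sequences: N vs C in column 3, Y vs A in column 7.
seq_a = "ATGNACGYAC"
seq_b = "ATGCACGAAC"

for mode in ("strict", "skip", "compatible"):
    print(mode, count_differences(seq_a, seq_b, mode))
```

Here the strict mode reports 2 differences, skipping reports 0, and the compatibility check reports 1 (Y can be C or T, never A), which is why the choice of tool and its documented handling of ambiguity codes matters for the conclusion.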
