Entering edit mode
7.5 years ago
lakshmi.bioinformatics
▴
30
I have around 30 k sequences and need to find the codon frequencies how to deal with the presence of Ns in the sequence
Should I need to mask the regions or to replace with specific nucleotides or try any other methods to overcome this issue? Will this manipulate my results?
I'd recommend masking these regions.
How to mask using Python program
Why use a Python program? Is this an assignment question?
These are NGS reads?
These are gene sequences retrieved from Ensembl database
Is it fasta format then? If this is an assignment, can you use biopython?
Yes the sequence is in fasta format and how to mask with biopython also I need to draw a CGR how can it be done with biopython