6.3 years ago
pbigbig
Hi,
I have assembled a genome, using Illumina paired-end reads for the assembly and mate-pair reads for scaffolding. In the resulting FASTA files, I notice patterns like this:
...NNNNNNNNNNNNNNNNGTGTGTAGGATCTCACNNNNNNNNNNNNNNNNNNNNNN...
I would like to hard-mask those small "island" sequences that sit between gaps, up to a defined maximum length (e.g. mask an island if it is < 200 bp). Could you please give some suggestions?
Thank you very much in advance!
When you say masking, do you mean removing it?
Hi, I mean hard-masking it, which would turn any A, C, G, or T in those islands into N.
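One possible way to do this is a short Python script, given here as a minimal sketch rather than a tested pipeline. It assumes an "island" is any run of non-N characters with an N directly on both sides, so sequence at the very start or end of a scaffold is left untouched. The names `mask_islands`, `read_fasta`, `mask_fasta` and the `max_len`/`width` parameters are made up for this example, and the 200 bp default just follows the threshold in the question.

```python
import re

# A run of non-N characters with an N immediately before and after it.
# Assumption: runs touching the start or end of a sequence are not islands.
ISLAND = re.compile(r"(?<=[Nn])[^Nn]+(?=[Nn])")


def mask_islands(seq, max_len=200):
    """Replace every island shorter than max_len with the same number of Ns."""
    def repl(match):
        island = match.group()
        return "N" * len(island) if len(island) < max_len else island
    return ISLAND.sub(repl, seq)


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def mask_fasta(in_path, out_path, max_len=200, width=60):
    """Write a copy of in_path with small islands hard-masked."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for header, seq in read_fasta(fin):
            masked = mask_islands(seq, max_len)
            fout.write(f">{header}\n")
            # Wrap sequence lines at a fixed width (60 is an arbitrary choice).
            for i in range(0, len(masked), width):
                fout.write(masked[i:i + width] + "\n")
```

For example, `mask_fasta("scaffolds.fasta", "scaffolds.masked.fasta", max_len=200)` (hypothetical file names) would write a masked copy and leave the input file unchanged. The sequence length is preserved, so scaffold coordinates stay the same.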