I have a FASTA file into which I would like to introduce some N point mutations for further analyses, given a file with their positions in the corresponding sequences.
The FASTA file:
>ref1
AATGGTGCGGCGAGAGCCGCAGATTTGAGAGCC
>ref2
AAAAAATTTTTATCTCTCTTGGGCCCCGATAGACTCCGGGCCGA
And the position file, with TABS as delimiter:
ref1 10 G N
ref1 13 A N
ref1 20 C N
ref2 3 A N
ref2 15 T N
The desired output would be another FASTA file with the N mutations added in their corresponding places.
>ref1
AATGGTGCGNCGNGAGCCGNAGATTTGAGAGCC
>ref2
AANAAATTTTTATCNCTCTTGGGCCCCGATAGACTCCGGGCCGA
I have tried using bcftools consensus, building a dummy VCF structure from my position file, but it does not seem to work, as it requires a true VCF header with info I cannot properly recreate...
Do you know of any alternative way to do this task, with other online tools or manually via scripting?
I guess some Python/awk/bash scripting might be able to get it done, but I am not fluent enough in Python to develop such a script...
Many thanks.
What's the problem here? One needs a few standard header lines and a few added columns.
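For reference, here is a minimal sketch of what "a few standard header lines and a few columns" could look like when generated from the position file. The function name `positions_to_vcf` is made up for illustration, the contig lengths are taken from the example sequences, and the result has not been run through bcftools here. Two assumptions to keep in mind: records must already be sorted by sequence and position (as in the example), and bcftools consensus expects the VCF to be bgzip-compressed and indexed before use.

```python
def positions_to_vcf(positions_text, contig_lengths):
    """Turn lines of 'name pos ref alt' (1-based pos) into minimal VCF text.

    Hypothetical helper: emits only a fileformat line, one contig line per
    sequence, the column header, and one record per input line.
    """
    lines = ["##fileformat=VCFv4.2"]
    for name, length in contig_lengths.items():
        lines.append(f"##contig=<ID={name},length={length}>")
    lines.append("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO")
    for row in positions_text.splitlines():
        if not row.strip():
            continue  # skip blank lines
        # split() accepts tabs or spaces between the four fields
        name, pos, ref, alt = row.split()[:4]
        # ID, QUAL, FILTER and INFO are unknown, so they are set to "."
        lines.append("\t".join([name, pos, ".", ref, alt, ".", ".", "."]))
    return "\n".join(lines) + "\n"


# Example data copied from the question.
sequences = {
    "ref1": "AATGGTGCGGCGAGAGCCGCAGATTTGAGAGCC",
    "ref2": "AAAAAATTTTTATCTCTCTTGGGCCCCGATAGACTCCGGGCCGA",
}
positions_text = (
    "ref1\t10\tG\tN\n"
    "ref1\t13\tA\tN\n"
    "ref1\t20\tC\tN\n"
    "ref2\t3\tA\tN\n"
    "ref2\t15\tT\tN\n"
)

contig_lengths = {name: len(seq) for name, seq in sequences.items()}
vcf_text = positions_to_vcf(positions_text, contig_lengths)
print(vcf_text, end="")
```

The REF column has to match the base actually present in the FASTA at that position, so a mismatch there is one likely source of errors when the VCF is applied.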
I tried several headers, but none of them seemed to work... there was always one issue or another, and it ended up in a parsing error.
Show us your code...
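As for the scripting route mentioned in the question, a minimal Python sketch that skips the VCF entirely could look like the following. The helper names `read_fasta` and `apply_mutations` are invented for this example; it assumes 1-based positions, that the third column is the expected reference base, and that sequences fit in memory. It works on text in memory here, so reading and writing the actual files is left out.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped sequence lines."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the ID before any whitespace
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def apply_mutations(records, positions_text):
    """Replace bases at 1-based positions; fail if the expected base differs."""
    seqs = {n: list(s) for n, s in records.items()}
    for row in positions_text.splitlines():
        if not row.strip():
            continue
        # split() accepts tabs or spaces between the four fields
        name, pos, ref, alt = row.split()[:4]
        idx = int(pos) - 1  # convert to 0-based index
        found = seqs[name][idx]
        if found.upper() != ref.upper():
            raise ValueError(f"{name}:{pos} expected {ref}, found {found}")
        seqs[name][idx] = alt
    return {n: "".join(s) for n, s in seqs.items()}


# Example data copied from the question.
fasta_text = """>ref1
AATGGTGCGGCGAGAGCCGCAGATTTGAGAGCC
>ref2
AAAAAATTTTTATCTCTCTTGGGCCCCGATAGACTCCGGGCCGA
"""
positions_text = (
    "ref1\t10\tG\tN\n"
    "ref1\t13\tA\tN\n"
    "ref1\t20\tC\tN\n"
    "ref2\t3\tA\tN\n"
    "ref2\t15\tT\tN\n"
)

mutated = apply_mutations(read_fasta(fasta_text), positions_text)
for name, seq in mutated.items():
    print(f">{name}")
    print(seq)
```

On the example data this prints the desired output shown in the question. The reference-base check is optional, but it catches off-by-one errors in the position file early.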