Insert SNPs in FASTA sequence given position file
2.7 years ago
Emilio Marmol ▴ 180

I have a FASTA file into which I would like to introduce some N point mutations for further analyses, given a file with their positions in the corresponding sequences.

The FASTA file:

>ref1
AATGGTGCGGCGAGAGCCGCAGATTTGAGAGCC
>ref2
AAAAAATTTTTATCTCTCTTGGGCCCCGATAGACTCCGGGCCGA

And the position file, tab-delimited (sequence name, position, reference base, replacement base):

ref1   10    G    N  
ref1   13    A    N
ref1   20    C    N
ref2    3    A    N
ref2   15    T    N

The desired output would be another FASTA file with the N mutations introduced at the corresponding positions:

>ref1
AATGGTGCGNCGNGAGCCGNAGATTTGAGAGCC
>ref2
AANAAATTTTTATCNCTCTTGGGCCCCGATAGACTCCGGGCCGA

I have tried using bcftools consensus with a dummy VCF built from my position file, but it does not work: it seems to require a proper VCF header, which I have not been able to recreate correctly.

Do you know of any alternative way to do this, either with other online tools or manually via scripting?

I guess some Python/awk/bash scripting could get it done, but I am not fluent enough in Python to write such a script.
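For reference, a minimal Python sketch of the kind of script described above. It rests on a few assumptions that are not stated in the post: positions are 1-based (this matches the example output), the position file is whitespace- or tab-delimited with the columns name, position, reference base, replacement base, and the first word after `>` is the sequence name. The function names (`read_fasta`, `read_positions`, `apply_edits`) are made up for illustration. The output is written with each sequence on a single line.

```python
import sys
from collections import defaultdict


def read_fasta(lines):
    """Return a list of (name, sequence) pairs, preserving file order."""
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Assumption: the sequence name is the first word of the header.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def read_positions(lines):
    """Map sequence name -> list of (1-based position, ref base, new base)."""
    edits = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        name, pos, ref, alt = fields[:4]
        edits[name].append((int(pos), ref, alt))
    return edits


def apply_edits(records, edits):
    """Replace the base at each listed position, checking the reference base."""
    result = []
    for name, seq in records:
        bases = list(seq)
        for pos, ref, alt in edits.get(name, []):
            if not 1 <= pos <= len(bases):
                raise ValueError(f"{name}: position {pos} is outside the sequence")
            if bases[pos - 1].upper() != ref.upper():
                raise ValueError(
                    f"{name}:{pos} expected {ref}, found {bases[pos - 1]}"
                )
            bases[pos - 1] = alt  # convert the 1-based position to a 0-based index
        result.append((name, "".join(bases)))
    return result


def main(fasta_path, positions_path):
    with open(fasta_path) as fa, open(positions_path) as tsv:
        mutated = apply_edits(read_fasta(fa), read_positions(tsv))
    for name, seq in mutated:
        print(f">{name}")
        print(seq)


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved under a hypothetical name such as `insert_n.py`, it could be run as `python insert_n.py input.fasta positions.tsv > output.fasta`. The reference-base check is a design choice: it stops with an error if the position file and the FASTA disagree, which usually signals an off-by-one in the coordinates.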

Many thanks.

snp python bed awk fasta • 726 views

it requires a true vcf header with info I cannot properly recreate.

What's the problem here? You only need a few standard header lines and a few additional columns.
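As a rough illustration of what "a few standard header lines" and the additional columns might look like, here is a Python sketch that builds a minimal VCF from the position data. The function name `build_vcf` and the choice of `VCFv4.2` are assumptions made for this example, and the thread does not confirm that bcftools consensus accepts this exact file. bcftools consensus also typically expects the VCF to be bgzip-compressed and indexed, which this sketch does not do.

```python
def build_vcf(contig_lengths, variants):
    """Build the text of a minimal VCF.

    contig_lengths: list of (name, length) pairs, in reference order.
    variants: iterable of (chrom, 1-based pos, ref base, alt base).
    """
    lines = ["##fileformat=VCFv4.2"]
    for name, length in contig_lengths:
        lines.append(f"##contig=<ID={name},length={length}>")
    # The eight fixed VCF columns, tab-separated.
    lines.append("\t".join(
        ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]
    ))
    # Sort records by contig order, then by position.
    # A variant on a contig missing from contig_lengths raises KeyError.
    order = {name: i for i, (name, _) in enumerate(contig_lengths)}
    for chrom, pos, ref, alt in sorted(
        variants, key=lambda v: (order[v[0]], v[1])
    ):
        # Unknown ID, QUAL, FILTER and INFO are written as ".".
        lines.append("\t".join([chrom, str(pos), ".", ref, alt, ".", ".", "."]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Contig lengths taken from the example FASTA in the question.
    vcf_text = build_vcf(
        [("ref1", 33), ("ref2", 44)],
        [("ref1", 10, "G", "N"), ("ref1", 13, "A", "N"), ("ref1", 20, "C", "N"),
         ("ref2", 3, "A", "N"), ("ref2", 15, "T", "N")],
    )
    print(vcf_text, end="")
```

Fields must be separated by real tab characters; spaces in place of tabs are a common cause of VCF parsing errors.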


I tried several headers, but none of them seemed to work. There was always one issue or another, and each attempt ended in a parsing error.


Show us your code.
