How to Add Mutations to the sequence
15 months ago

I have a list of BDQ resistance mutations, and I want to add them to the MTB genome sequence or to the resistance gene.

Some are nucleotide mutations and some are protein mutations. I am not sure how to add them to make a BDQ-resistant gene sequence.

Mutation example:

1. Rv0678_274_ins_1_t_ta
2. Rv0678_I108V

Please help me with this. I am unsure what to do.

If not this, any idea how I can get the BDQ resistance gene sequences? They are not in any of the AMR databases.

mutations NGS sequence • 963 views

BDQ = Bedaquiline? MTB = Multi-drug resistant TB?

While you may be familiar with the short notation, others on the forum are not likely to be, so it would be best to use the long forms when posting.

how can i get the BDQ resistance gene sequences

It looks like multiple efflux pump genes are included in this resistance category, though the one in your example is not on this list.

15 months ago

If they are all nucleotide changes, you can use seqtk mutfa:

https://github.com/lh3/seqtk

    seqtk

    Usage:   seqtk <command> <arguments>
    Version: 1.4-r122

    Command: seq       common transformation of FASTA/Q
             size      report the number sequences and bases
             comp      get the nucleotide composition of FASTA/Q
             sample    subsample sequences
             subseq    extract subsequences from FASTA/Q
             fqchk     fastq QC (base/quality summary)
             mergepe   interleave two PE FASTA/Q files
             split     split one file into multiple smaller files
             trimfq    trim FASTQ using the Phred algorithm

             hety      regional heterozygosity
             gc        identify high- or low-GC regions
             mutfa     point mutate FASTA at specified positions
             mergefa   merge two FASTA/Q files
             famask    apply a X-coded FASTA to a source FASTA
             dropse    drop unpaired from interleaved PE FASTA/Q
             rename    rename sequence names
             randbase  choose a random base from hets
             cutN      cut sequence at long N
             gap       get the gap locations
             listhet   extract the position of each het
             hpc       homopolyer-compressed sequence
             telo      identify telomere repeats in asm or long reads

seqtk mutfa

Usage: seqtk mutfa <in.fa> <in.snp>

Note: <in.snp> contains at least four columns per line which are:
      'chr  1-based-pos  any  base-changed-to'.
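
As the usage text suggests, mutfa substitutes single bases in place, so it does not look suited to an insertion such as Rv0678_274_ins_1_t_ta. A minimal Python sketch for that case follows. It assumes the notation means "at gene position 274, reference t becomes ta", which is my reading of the name and not something confirmed in this thread. The function name `apply_indel` and the toy sequence are hypothetical.

```python
def apply_indel(seq, pos, ref, alt):
    """Replace `ref` with `alt` at 1-based position `pos` (VCF-style alleles)."""
    start = pos - 1
    found = seq[start:start + len(ref)]
    # Refuse to edit if the reference allele does not match the sequence.
    if found.upper() != ref.upper():
        raise ValueError(f"expected {ref!r} at position {pos}, found {found!r}")
    return seq[:start] + alt + seq[start + len(ref):]


# Toy sequence, NOT the real Rv0678 gene.
toy = "ATGGCAATCTGA"
# Insert one 'A' after position 3 (G -> GA), analogous to t -> ta at 274.
mutated = apply_indel(toy, 3, "G", "GA")
print(mutated)
```

An insertion shifts every downstream coordinate by one, so if you combine several edits it is probably easiest to apply them from the highest position to the lowest.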

OP doesn't have genomic coordinates. Rv0678 is a gene.
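
One way around this is to work in gene coordinates first and convert afterwards. The sketch below turns a protein change such as I108V into the single-base change(s) needed in the coding sequence, then offsets them by the gene start to produce lines in the 'chr  1-based-pos  any  base-changed-to' layout. Everything here is an assumption for illustration: the helper names, the toy coding sequence, the toy gene start of 101, and a plus-strand gene. For a real run you would take the coding sequence and start coordinate of Rv0678 from the annotation of your reference.

```python
from itertools import product

# Standard genetic code, written in the usual NCBI "TCAG" order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def protein_sub_to_snps(cds, ref_aa, codon_no, alt_aa):
    """Return [(1-based CDS position, new base)] for the closest alt codon."""
    start = (codon_no - 1) * 3
    codon = cds[start:start + 3].upper()
    if CODON_TABLE.get(codon) != ref_aa:
        raise ValueError(f"codon {codon_no} is {codon!r}, which is not {ref_aa}")
    candidates = [c for c, aa in CODON_TABLE.items() if aa == alt_aa]
    # Pick the alt codon that differs from the reference at the fewest bases.
    best = min(candidates, key=lambda c: sum(a != b for a, b in zip(c, codon)))
    return [(start + i + 1, new)
            for i, (old, new) in enumerate(zip(codon, best)) if old != new]


def to_snp_lines(chrom, gene_start, changes, label):
    """Convert CDS positions to genome positions (plus-strand gene assumed)."""
    return [f"{chrom}\t{gene_start + pos - 1}\t{label}\t{base}"
            for pos, base in changes]


# Toy CDS encoding M-A-I-stop; NOT the real Rv0678 sequence.
toy_cds = "ATGGCAATCTGA"
changes = protein_sub_to_snps(toy_cds, "I", 3, "V")
lines = to_snp_lines("chrToy", 101, changes, "toy_I3V")
print("\n".join(lines))
```

Several codons encode valine, so the choice of the closest one is arbitrary; a resistant isolate may carry a different nucleotide change that gives the same amino acid.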

15 months ago
bk11 ★ 3.0k

You can add mutations to your reference sequence using seqtk like this:

cat mytest.fasta 
>seqA
TTGCATATCGTATATGCATGCATGCATGCA
>seqB
CTGATCGAGTCGATCGATGCTATATAGCAG

cat myMutation.snp
seqA    4   foo T
seqA    5   foo G
seqA    9   foo A
seqB    12  foo A
seqB    10  foo C
seqB    8   foo T

seqtk mutfa mytest.fasta myMutation.snp 
>seqA
TTGTGTATAGTATATGCATGCATGCATGCA
>seqB
CTGATCGTGCCAATCGATGCTATATAGCAG

# To keep the original case (uppercase/lowercase) of the reference nucleotides while changing them, you can add the `--keepcase` option to your command:
seqtk mutfa --keepcase mytest.fasta myMutation.snp 
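
If you want to sanity-check what the point-mutation step does without seqtk installed, here is a small Python sketch that applies the same myMutation.snp lines to the same two toy sequences. It is only an illustration of the idea, not how seqtk is implemented, and the function name `mutate` is hypothetical.

```python
def mutate(records, snps):
    """Apply (name, 1-based position, new base) substitutions to sequences."""
    out = {name: list(seq) for name, seq in records.items()}
    for name, pos, base in snps:
        out[name][pos - 1] = base
    return {name: "".join(chars) for name, chars in out.items()}


records = {
    "seqA": "TTGCATATCGTATATGCATGCATGCATGCA",
    "seqB": "CTGATCGAGTCGATCGATGCTATATAGCAG",
}

snp_text = """\
seqA    4   foo T
seqA    5   foo G
seqA    9   foo A
seqB    12  foo A
seqB    10  foo C
seqB    8   foo T
"""

# Columns: sequence name, 1-based position, ignored label, base to change to.
snps = [(f[0], int(f[1]), f[3])
        for f in (line.split() for line in snp_text.splitlines())]

mutated = mutate(records, snps)
for name, seq in mutated.items():
    print(f">{name}\n{seq}")
```

The sequences it prints should match the seqtk mutfa output shown in the example above.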