Add braces to the DNA sequence for re quested primer position.
0
0
Entering edit mode
14 months ago
Keith • 0

Good afternoon

Currently I'm working on this bit of code for the purpose of input DNA code that's attempting to isolate a specific DNA based on it's location upstream and downstream. The issue I am having is with the following instructions:

"Add braces to the DNA sequence 100 bp upstream and downstream of the requested position."

This is what I have so far:

parser = argparse.ArgumentParser(description = "This takes in input_file")
parser.add_argument('-I','--Fasta_filename', help='the name of a fasta file', type=str)
parser.add_argument('-C','--Chromosome_of_interest', help='The chromosome we want to amplify', type=str) 
parser.add_argument('-P','--Position_of_amplification', help='Position we want amplified', type=int)
args = parser.parse_args()  

"""
Open the fasta file and read in the entirety of the sequence. You can use whatever data structure you prefer.
"""

F = Fasta(args.Fasta_filename)

"""
Identify the sequence 500 bp upstream and downstream of the requested position and store it as a string.
"""

req_pos = F[args.Chromosome_of_interest][args.Position_of_amplification-501:args.Position_of_amplification+501]

I am not sure how to implement

python subprocess • 1.0k views
ADD COMMENT
0
Entering edit mode

Why would you want to add braces to a sequence? Is the end goal just to print the sequence formatted that way, because adding non ATGC/IUPAC characters to a sequence is just data corruption.

Is this an assignment on string manipulation?

ADD REPLY
0
Entering edit mode

Yes it is. The file is entered as a string and the goal is to locate a primer and output that location.

Alternatively, you can use an argument in the input file to specify this

I was thinking of using process.run()

ADD REPLY
0
Entering edit mode

Why are you looking to run a process/subprocess when string manipulation of the sort you're looking for is easier in a programming environment?

ADD REPLY
0
Entering edit mode

I asked for help with an assignment which I clearly stated, stop asking irrelevant questions.

ADD REPLY
0
Entering edit mode

Your assignment says "add braces". It does not say "use python to spawn a shell process that adds braces". My question is about your approach, which is very much relevant. What is apparent is your poor attitude, which does not bode well for you on a volunteer driven forum.

ADD REPLY
0
Entering edit mode

I got a lot of help and figured it out. But now I am getting this error message:

Traceback (most recent call last):
  File "/mnt/c/Users/ctcen/OneDrive/Desktop/results_5/Assignment5_Solution.py", line 37, in <module>
    subprocess.run(primer3_command, check=True, capture_output=True, text=True)
  File "/home/kecreech3/anaconda3/envs/class/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['primer3_core', 'primer.txt']' returned non-zero exit status 255.

This is my code:

import argparse
import subprocess
from pyfasta import Fasta

# Create the argument parser
parser = argparse.ArgumentParser(description="Automatic Primer Identification")
parser.add_argument("Fasta_filename", help="the name of a fasta file", type=str)
parser.add_argument("Chromosome_of_interest", help="The chromosome we want to amplify", type=str)
parser.add_argument("Position_of_amplification", help="Position we want amplified", type=int)
parser.add_argument("--product_length", type=int, default=700, help="Product length (default: 700)")
args = parser.parse_args()

# Open the fasta file and read in the sequence
F = Fasta(args.Fasta_filename)

# Identify the sequence 500 bp upstream and downstream of the requested position
req_seq = F[args.Chromosome_of_interest][args.Position_of_amplification - 501:args.Position_of_amplification + 501]

# Add braces to the DNA sequence 100 bp upstream and downstream of the requested position
new_seq = ""
for i, b in enumerate(req_seq):
    if i == args.Position_of_amplification - 101:
        new_seq += "["
    elif i == args.Position_of_amplification + 101:
        new_seq += "]"
    new_seq += b

# Create an input file containing the DNA sequence and specify a product length of 600 - 800 base pairs
with open('primer.txt', 'w') as f:
    f.write("SEQUENCE_ID=sample_sequence\n")
    f.write("SEQUENCE_TEMPLATE=" + new_seq + "\n")
    f.write("SEQUENCE_TARGET=101,200\n")
    f.write("PRIMER_PRODUCT_SIZE_RANGE=600-800\n")

# Execute the primer3_core command on the input file you created
primer3_command = ["primer3_core", "primer.txt"]
subprocess.run(primer3_command, check=True, capture_output=True, text=True)

# Read in and parse the output to identify the sequence of the best two primers
output = subprocess.run(primer3_command, check=True, capture_output=True, text=True)
primer_output = output.stdout.split('\n')

best_primers = []
for i, line in enumerate(primer_output):
    if line.startswith("PRIMER_LEFT_0_SEQUENCE="):
        best_primers.append(line.split('=')[1])
    elif line.startswith("PRIMER_RIGHT_0_SEQUENCE="):
        best_primers.append(line.split('=')[1])
    if len(best_primers) == 2:
        break

# Print the best two primers to the user
print("Best Primers:")
for i, primer in enumerate(best_primers):
    print(f"Primer {i + 1}: {primer}")
ADD REPLY

Login before adding your answer.

Traffic: 2222 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6