I'm trying to figure out how to genotype an SNP at a particular position. Honestly I'm not sure what that means, but here are the instructions:
"The idea is that the user wants to genotype a SNP at a particular position. To do this they will need to amplify a sequence of DNA around 600-800 bp long. It is also important that the beginning and end of the PCR product is at least 100 bp away. A visual schematic is included below. The first primer should be chosen between 0-400. The second primer should be chosen between 600-1000. The total product size should be 600-800 bp long."
This is the code I got help writing to attempt to get it to work
import argparse, pdb
import subprocess
from pyfasta import Fasta
# Create the argument parser
parser = argparse.ArgumentParser(description="Automatic Primer Identification")
parser.add_argument("Fasta_filename", help="the name of a fasta file", type=str)
parser.add_argument("Chromosome_of_interest", help="The chromosome we want to amplify", type=str)
parser.add_argument("Position_of_amplification", help="Position we want amplified", type=int)
#parser.add_argument("--product_length", type=int, default=700, help="Product length (default: 700)")
args = parser.parse_args()
# Open the fasta file and read in the sequence
F = Fasta(args.Fasta_filename)
# Identify the sequence 500 bp upstream and downstream of the requested position
req_seq = F[args.Chromosome_of_interest][args.Position_of_amplification - 500:args.Position_of_amplification + 500]
# Add braces to the DNA sequence 100 bp upstream and downstream of the requested position
# Create an input file containing the DNA sequence and specify a product length of 600 - 800 base pairs
with open('primer.txt', 'w') as f:
f.write("SEQUENCE_ID=test\n")
f.write("SEQUENCE_TEMPLATE=" + req_seq+ "\n")
f.write("PRIMER_PICK_LEFT_PRIMER= 1"+"\n")
f.write("SEQUENCE_TARGET= 100,200"+"\n")
f.write("PRIMER_PICK_RIGHT_PRIMER=1"+"\n")
f.write("PRIMER_PRODUCT_SIZE_RANGE=" + "600-800"+"\n")
f.write("SEQUENCE_INTERNAL_EXCLUDED_REGION="+'400'+","+ '600'+"\n")
f.write("=\n")
f.close()
#print(F[args.Chromosome_of_interest])
# Execute the primer3_core command on the input file you created
# primer3_command = ["primer3_core", "primer.txt"]
primer3_command = 'primer3_core < primer.txt'
subprocess.check_output(primer3_command, shell=True, encoding = 'utf-8')
# Read in and parse the output to identify the sequence of the best two primers
output = subprocess.check_output(primer3_command, shell=True, encoding = 'utf-8')
primer_output = output.split('\n')
best_primers = []
for i, line in enumerate(primer_output):
if line.startswith("PRIMER_LEFT_0_SEQUENCE="):
best_primers.append(line.split('=')[1])
elif line.startswith("PRIMER_RIGHT_0_SEQUENCE="):
best_primers.append(line.split('=')[1])
if len(best_primers) == 2:
break
# Print the best two primers to the user
for i, primer in enumerate(best_primers):
print(f"{primer}")
But I am not getting the primers I'm suppose to. I was told that SEQUENCE_TARGET is incorrect and that I should use the same values as SEQUENCE_INTERNAL_EXCLUDED_REGION but I end up getting no primers at all.
how about trying to remove the space after "SEQUENCE_TARGET=" ?
You mean like this? f.write("SEQUENCE_TARGET= 100,200\n") I'm still getting the wrong output. I was told by my TA that I should use the same values for I used for SEQUENCE_INTERNAL_EXCLUDED_REGION
I've been trying since Monday to get this program to work. Does anyone have any suggestions?
not
It didn't make a difference
I changed to f.write("SEQUENCE_TARGET= 100,200"+"\n") to ("SEQUENCE_TARGET= 400,600"+"\n") The base for the left primer starts 4 bases to the right of where it should start
cross-posted: https://stackoverflow.com/questions/77495181/