8.5 years ago
darkmatter1996
I have been analyzing the data from the Xu et al. (2015) paper (see Supplemental Table_1.xlsx here). However, many of the sgRNA sequences listed do not appear to be complementary to any part of the gene they are annotated as targeting. For example, I downloaded the NCBI sequence for the gene RPL17 and could not find the sequence TTTCTTCTGGGCAACCTCCTCTTCTGGTTTAGGAACAATC, which is in the data set, or anything complementary to it, anywhere in the gene. I wrote the following Python function to find the position of the sgRNA in the gene; it returns -1 if no match is found:
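from Bio import Seq

def guide_positional_features(guide_seq, gene, strand):
    """Return the index of the guide in the reverse-complemented gene, or -1 if there is no match."""
    guide_seq = Seq.Seq(guide_seq)
    # Close the file after reading, and drop newlines/whitespace that would otherwise break a match.
    with open("data/sequences/{}.txt".format(gene), "r") as f:
        gene_seq = "".join(f.read().split())
    gene_seq = Seq.Seq(gene_seq).reverse_complement()
    if strand == '+':  # sense strand
        guide_seq = guide_seq.reverse_complement()
    ind = gene_seq.find(guide_seq)
    return ind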
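As a sanity check that does not depend on the strand annotation, the search can also be run against both orientations of the gene. The following is only a minimal sketch in plain Python (no Biopython); the helper names `reverse_complement` and `find_on_either_strand` are hypothetical and not from the paper, and it assumes the gene file holds a bare DNA sequence with no FASTA header:

```python
# Illustrative sketch only: helper names are hypothetical, not from the paper or Biopython.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_on_either_strand(guide_seq, gene_seq):
    """Look for guide_seq in gene_seq and in its reverse complement.

    Returns (strand, index), where strand is '+' or '-' and index is the
    0-based position of the match in forward-strand coordinates of gene_seq.
    Returns (None, -1) if the guide matches neither orientation.
    """
    guide = guide_seq.upper()
    # Assumption: the input is raw sequence text; whitespace and newlines are removed.
    gene = "".join(gene_seq.split()).upper()

    idx = gene.find(guide)
    if idx != -1:
        return "+", idx

    # A guide on the reverse strand appears as its reverse complement on the forward strand.
    idx = gene.find(reverse_complement(guide))
    if idx != -1:
        return "-", idx

    return None, -1
```

If this returns (None, -1) for a guide, the sequence is absent from both strands of the downloaded file, so the mismatch would not be explained by strand handling alone.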
Am I doing something wrong? Thanks.