Complete beginner so I'm sorry if this is obvious!
I have a file that is a long list of lines, each of which is either name | +/- or IG_name | 0:
S1 +
IG_S1 0
S2 -
IG_S3 0
S3 +
S4 -
dnaA +
IG_dnaA 0
Everything which starts with IG_ has a corresponding name. I want to add the + or - to the IG_name.
The information is gene names and strand information, IG = intergenic region. Basically I want to know which strand the intergenic region is on.
What I want:
    open file
    if starts with IG_*
        find the line with *
        print("IG_" and the line it found)
    else
        print(line)
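The steps above can be sketched in Python roughly as follows. This is only a sketch under some assumptions: the two columns are separated by whitespace, the gene for each IG_ entry appears somewhere in the same list with a + or -, and the function name annotate_intergenic is made up for illustration. It builds a dictionary from gene name to strand first, so it also works when an IG_ line comes before its gene (as IG_S3 does in the sample).

```python
def annotate_intergenic(lines):
    """Return the lines with each IG_ entry carrying its gene's strand."""
    # First pass: map gene name -> strand for every non-IG_ line.
    strands = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and not fields[0].startswith("IG_"):
            strands[fields[0]] = fields[1]

    # Second pass: replace the placeholder on IG_ lines with the looked-up strand.
    result = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("IG_"):
            gene = fields[0][len("IG_"):]
            # Keep the original value if the gene is not in the list.
            result.append(f"{fields[0]} {strands.get(gene, fields[1])}")
        else:
            result.append(line.rstrip("\n"))
    return result


sample = ["S1 +", "IG_S1 0", "S2 -", "IG_S3 0", "S3 +", "S4 -", "dnaA +", "IG_dnaA 0"]
for out in annotate_intergenic(sample):
    print(out)
```

To run it on a real file, the lines would need to be read into a list first (for example with readlines()), because the function goes over them twice.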
What I have:
    with open(sys.argv[2]) as geneInfo:
        with open(sys.argv[1]) as origin:
            for line in origin:
                if line.startswith("IG_"):
                    name = line.split("_")[1]
                    nname = name[:-3]
                    for newline in geneInfo:
                        if re.match(nname, newline):
                            print("IG_"+newline)
                else:
                    print(line)
where origin is the mixed list and geneInfo has only the names not IG_names.
With this code I end up with a list containing only the lines printed by the else branch:
S1 +
S2 -
S3 +
S4 -
dnaA +
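One thing that may contribute to this output, though it is probably not the whole story: a file object can only be iterated once. After the inner for loop has run through geneInfo for the first IG_ line, the file is at its end, so the inner loop body never runs again for any later IG_ line. The snippet below shows that behaviour with io.StringIO standing in for the open file (the sample contents are made up), and shows that reading the lines into a list first avoids it.

```python
import io

# io.StringIO stands in for the file opened as geneInfo (contents are hypothetical).
gene_info = io.StringIO("S1 +\nS2 -\nS3 +\n")

first_pass = [line for line in gene_info]
second_pass = [line for line in gene_info]  # the file is already at its end

print(len(first_pass))   # 3
print(len(second_pass))  # 0

# Reading the lines into a list once lets them be scanned repeatedly.
gene_lines = io.StringIO("S1 +\nS2 -\nS3 +\n").readlines()
print(len(list(gene_lines)), len(list(gene_lines)))  # 3 3
```

Separately, name[:-3] assumes exactly three trailing characters after the gene name, which might not hold if the file uses different separators or line endings; splitting the line on whitespace is less fragile.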
My problem is that I don't know what is wrong, so I don't know what to search for!
What 2 files do you start with?
"where origin is the mixed list and geneInfo has only the names not IG_names"
So origin is the first example, and geneInfo has everything except the ones which start with IG.
What others are saying is that you should show just a few lines of each input file, then show the exact command as you invoke it. These two ingredients are necessary to troubleshoot.
Sorry, I should have made my other file more obvious! The second file looks like this: