To identify secretome proteins, one of the end stages of the pipeline requires a scripting solution.
I have a file with protein sequence names (one per line) of the same order as the protein sequence analysis tables in another file, shown below. What I need to do is if the sequence name from the file2 has a number above 0 in either the LocDB or PotLocDB column excluding the "Extracellular" row to delete that sequence ID from file1 and if possible produce a file like file3. I then have a secretome protein id file list.
file3
14024 M 3.0 En 2.0
File1:
14024
13025
File2:
Seq name: 14024, Length=261
Nuclear 0.0 0.0 0.00 0.00 0.26
Plasma membrane 0.0 0.0 0.76 0.39 3.70
Extracellular 0.0 0.0 0.32 0.13 1.06
Cytoplasmic 0.0 0.0 0.03 0.10 0.61
Mitochondrial 3.0 0.0 1.08 0.43 2.90
Endoplasm. retic. 2.0 0.0 0.22 0.37 0.43
Peroxisomal 0.0 0.0 0.06 0.10 0.22
Lysosomal 0.0 0.0 0.08 0.11 0.00
Golgi 0.0 0.0 0.18 1.12 0.57
Vacuolar 0.0 0.0 0.28 0.39 0.25
Seq name: 13025, Length=247
Nuclear 0.0 0.0 1.63 0.00 0.77
Plasma membrane 0.0 0.0 0.06 0.48 0.93
Extracellular 0.0 0.0 0.19 0.35 0.87
Cytoplasmic 0.0 0.0 0.26 0.04 0.18
Mitochondrial 0.0 0.0 1.16 1.34 6.67
Endoplasm. retic. 0.0 0.0 0.09 0.27 0.00
Peroxisomal 0.0 0.0 0.01 0.04 0.07
Lysosomal 0.0 0.0 0.00 0.13 0.03
Golgi 0.0 0.0 0.04 0.21 0.48
Vacuolar 0.0 0.0 0.00 0.21 0.00
This is the beginning of some bad code but not finished yet which why trying on here for some people who can script better than me
#!/usr/bin/perl -w
use strict;
my $idsfile = "RRESID.txt";
my $seqfile = "ProtComp_RRES_ALL_correct2.txt";
my %ids = ();
open FILE2, $idsfile;
while(<FILE2>) {
open FILE, $seqfile;
while(<FILE>) {
for (my $i=0; $i <= 10; $i++) {
chomp;
my @col = split /\t/;
if($col[1]>0){
if(i=0){
}elsif(i=1){
}elsif(i=2){
}elsif(i=3){
}elsif(i=4){
}elsif(i=5){
}elsif(i=6){
}elsif(i=7){
}elsif(i=8){
}elsif(i=9){
}
if($col[2]>0){
if(i=0){
}elsif(i=1){
}elsif(i=2){
}elsif(i=3){
}elsif(i=4){
}elsif(i=5){
}elsif(i=6){
}elsif(i=7){
}elsif(i=8){
}elsif(i=9){
}
}
$counter1=$counter1+1;
}
}
close FILE;
$counter1=$counter1+3;
}
close FILE2;
Could you indicate what you have tried, please
Updated what I have so far but I'm just beginning and would take me a week as I'm slow at this :( I'll update what I have Friday and hopefully made more progress with it. I should say I'm posting to see if there is a simple script, command line code that I am not aware. I could only think of two loops using perl.