I am not a Perl coder and hope to find help in this forum. I am trying to run script.pl (shown below) on the following input, but I do not get the desired output.
Here is my desired output:
>Pseudo_TTA 33-35
AGATCTGTAAGCTCAAAATGGTAGAGCTCTCGTTATGAGAGTTTGGTGGTTCGAATCCACCCAGATCTGcca
>His_GTG 34-36
GTGGCTGTAGTTTAGTGGTGAGAATTCCACGTTGTGGCCGTGGAGACCTGGGCTCGAATCCCAGCAGCCACAcca
>Trp_CCA 33-35
GGATCCGTGGCGCAATGGTAGCGCGTCTGACTCCAGATCAGAAGGTTGCGTGTTCGATTGACGTCGGGTTCAcca
>Met_CAT 35-37
GGGGTGGTGGCGCAGTTGGCTAGCGCGTAGGTCTCATAATCCTGAGGTCGAGAGTTCGAGCCTCTCTCACCCCAcca
Input file 1: tRNAfasta.fa
>chrA01_622419_622487_+_Pseudo_TTA_28.44
AGATCTGTAAGCTCAAAATGGTAGAGCTCTCGTTATGAGAGTTTGGTGGTTCGAATCCACCCAGATCTG
>chrA01_2528533_2528604_+_His_GTG_63.70
GTGGCTGTAGTTTAGTGGTGAGAATTCCACGTTGTGGCCGTGGAGACCTGGGCTCGAATCCCAGCAGCCACA
>chrA01_2624977_2625048_+_Trp_CCA_69.73
GGATCCGTGGCGCAATGGTAGCGCGTCTGACTCCAGATCAGAAGGTTGCGTGTTCGATTGACGTCGGGTTCA
>chrA01_21878294_21878379_+_Met_CAT_61.56
GGGGTGGTGGCGCAGTTGGCTAGCGCGTAGGTCTCATAGCTATCTGAGTAATCCTGAGGTCGAGAGTTCGAGCCTCTCTCACCCCA
Input file 2: codonpos.txt
chrA01.trna1 (622419-622487) Length: 69 bp
Type: Sup Anticodon: TTA at 33-35 (622451-622453) Score: 28.44
Possible pseudogene: HMM Sc=-4.46 Sec struct Sc=32.90
* | * | * | * | * | * | *
Seq: AGATCTGTAaGCTCAAaaTGGTAGAGCTCTCGTTATGAGAGTtTGGTGGTTCGAATCCACCCAGATCTG
Str: >>>>>>>...>>>>.........<<<<.>>>.....<<<.....>>>>>.......<<<<<<<<<<<<.
chrA01.trna2 (2528533-2528604) Length: 72 bp
Type: His Anticodon: GTG at 34-36 (2528566-2528568) Score: 63.70
* | * | * | * | * | * | * |
Seq: GTGGCTGTAGTTTAGTGGTgAGAATTCCACGTTGTGGCCGTGGAGACCTGGGCTCGAATCCCAGCAGCCACA
Str: >>>>>>>..>>>>........<<<<.>>>>>.......<<<<<....>>>>>.......<<<<<<<<<<<<.
chrA01.trna3 (2624977-2625048) Length: 72 bp
Type: Trp Anticodon: CCA at 33-35 (2625009-2625011) Score: 69.73
* | * | * | * | * | * | * |
Seq: GGATCCGTGGCGCAATGGTAGCGCGTCTGACTCCAGATCAGAAGGtTGCGTGTTCGATTGACGTCGGGTTCA
Str: >>>>>>>..>>>>.......<<<<.>>>>>.......<<<<<.....>>>>.........<<<<<<<<<<<.
chrA01.trna27 (21878294-21878379) Length: 86 bp
Type: Met Anticodon: CAT at 35-37 (21878328-21878330) Score: 61.56
Possible intron: 39-50 (21878332-21878343)
* | * | * | * | * | * | * | * | *
Seq: GGGGTGGTGGCGCAGTTGGCtAGCGCGTAGGTCTCATAgctatctgagtaATCCTGAGGtCGAGAGTTCGAGCCTCTCTCACCCCA
Str: >>>>>>>..>>>>.........<<<<.>>>>.....................<<<<.....>>>>>.......<<<<<<<<<<<<.
Here is the command:
perl script.pl tRNAfasta.fa codonpos.txt > out.fa
script.pl:
$fa = shift;
open(FA,$fa);
while($line = <FA>){
    chomp $line;
    if($line =~ />/){
        $header = $line;
    }
    my($tRNA,$chrID,$loc,$strand,$aa,$codon,$len,$pb,$sc,$score) = split(" ",$header);
    $tRNA =~ s/>Mus_musculus_tRNA-//g;
    $chrID =~ s/[()]//g;
    my($id,$name) = split("-",$chrID);
    $tRNA_id_hash{$id} = $tRNA;
}
$sort = shift;
open(SORT,$sort);
while($name = <SORT>){
    chomp $name;
    $aminoAcidInfo = <SORT>;
    chomp $aminoAcidInfo;
    my($type,$aa,$anti,$anti_nt,$at,$anticodon_loc,$chr) = split(" ",$aminoAcidInfo);
    $HMM = <SORT>;
    if($HMM =~ /intron/){
        $intron = <SORT>;
    }
    $star = <SORT>;
    $seq = <SORT>;
    chomp $seq;
    $seq =~ s/Seq: //g;
    $seq = $seq."CCA";
    $seq = uc($seq);
    $structure = <SORT>;
    $blank = <SORT>;
    my($id,$len) = split(/\t/,$name);
    my($chrID,$loc) = split(" ",$id);
    print ">$tRNA_id_hash{$chrID}\t$anticodon_loc\n$seq\n";
}
Here is the current output, which is not at all what I wanted :-( I tried to make sense of it but cannot work out what is going wrong.
> 33-35
AGATCTGTAAGCTCAAAATGGTAGAGCTCTCGTTATGAGAGTTTGGTGGTTCGAATCCACCCAGATCTG
CCA
> 34-36
STR: >>>>>>>..>>>>........<<<<.>>>>>.......<<<<<....>>>>>.......<<<<<<<<<<<<.
CCA
> |
CCA
>
CCA
What is the use of input file 1 here? It seems we can generate the desired output from input file 2, codonpos.txt.
True. But the script is the only thing on hand, and it requires both file 1 and file 2 as input.
If you can explain the pseudocode/algorithm, or whatever you are trying to achieve, in a little more detail, maybe I can help you. Waiting for your response!
a) Reformat the header of the FASTA sequences to be: >tRNA_name predicted_anti-codon_positions
b) Remove predicted introns
c) Add CCA to the sequence tail
Does that help?
A general piece of advice: add comments to your code.
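Looking at the posted script and inputs, three things seem to explain the output. First, the FASTA part expects space-separated headers starting with ">Mus_musculus_tRNA-", which the headers in tRNAfasta.fa do not have, so the name lookup comes back empty. Second, the second loop reads a fixed number of lines per record, but the "Possible pseudogene" and "Possible intron" lines are optional, so reading drifts out of step from the second record on. Third, the intron is never removed, and uc() is applied after the tail is appended. The "CCA" landing on its own line may also point to Windows line endings in codonpos.txt, though that is a guess. Below is a minimal sketch of steps a) to c) that matches lines by content instead of by count. It assumes the FASTA headers look like >chr_start_end_strand_name_score, that the two files can be linked by chromosome and coordinates, and that each record has at most one predicted intron. The sub names and the file name reformat_trna.pl are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a lookup key that ignores coordinate order, in case minus-strand
# records list their coordinates high-low (an assumption, not checked here).
sub make_key {
    my ($chr, $from, $to) = @_;
    my ($lo, $hi) = sort { $a <=> $b } ($from, $to);
    return "$chr:$lo-$hi";
}

# Read FASTA headers such as >chrA01_622419_622487_+_Pseudo_TTA_28.44
# and return a hash reference mapping "chr:start-end" to "Pseudo_TTA".
sub read_names {
    my ($fh) = @_;
    my %name_for;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;    # strip Unix or Windows line endings
        next unless $line =~ /^>(.+)_(\d+)_(\d+)_[+-]_(.+)_[-\d.]+$/;
        my ($chr, $from, $to, $name) = ($1, $2, $3, $4);
        $name_for{ make_key($chr, $from, $to) } = $name;
    }
    return \%name_for;
}

# Parse the structure file by line content, not by a fixed number of
# lines per record, and return the reformatted FASTA entries.
sub reformat_records {
    my ($fh, $name_for) = @_;
    my ($key, $anticodon_pos, @intron, @entries);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^\s*(\S+)\.trna\d+\s+\((\d+)-(\d+)\)/) {
            # First line of a record: reset the per-record state
            $key           = make_key($1, $2, $3);
            $anticodon_pos = undef;
            @intron        = ();
        }
        elsif ($line =~ /Anticodon:\s*\S+\s+at\s+(\d+-\d+)/) {
            $anticodon_pos = $1;
        }
        elsif ($line =~ /^\s*Possible intron:\s*(\d+)-(\d+)/) {
            @intron = ($1, $2);
        }
        elsif (defined $key && $line =~ /^\s*Seq:\s*(\S+)/) {
            my $seq = $1;
            # b) intron positions are 1-based within the sequence
            substr($seq, $intron[0] - 1, $intron[1] - $intron[0] + 1, '')
                if @intron;
            # a) new header, c) lowercase cca tail as in the desired output
            my $name = $name_for->{$key} // 'unknown';
            my $pos  = $anticodon_pos    // 'NA';
            push @entries, ">$name $pos\n" . uc($seq) . "cca\n";
        }
    }
    return @entries;
}

# Command-line part, run only when both file names are given:
#   perl reformat_trna.pl tRNAfasta.fa codonpos.txt > out.fa
if (@ARGV == 2) {
    my ($fasta_file, $struct_file) = @ARGV;
    open(my $fa, '<', $fasta_file)  or die "Cannot open $fasta_file: $!\n";
    open(my $st, '<', $struct_file) or die "Cannot open $struct_file: $!\n";
    print reformat_records($st, read_names($fa));
}
```

On the four records shown in the question this should reproduce the desired output, including the Met_CAT sequence with positions 39-50 removed. File 1 is still needed here because the name "Pseudo" only appears in the FASTA header; codonpos.txt lists that record as "Type: Sup".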