I am trying to write a program that asks for a file name (if an invalid file name is given, it asks again, up to five times in total) and then checks whether the file is in FASTA format.
How can I code that? I have the following code so far:
#!/usr/bin/perl -w
#A program that asks for a file, opens it if file exists and check
#if the file is in FASTA format
use strict;

#get data from a file
my @file = openfile();
#open file

#subroutines
sub openfile {
    my $filename;
    my $x;
    my $datafile;
    my $file;
    for ($x = 0; $x < 5; $x++) {
        print "\n\nPlease enter file name: ";
        chomp ($filename = <STDIN>);
        if (-e $filename) {
            print "File found!\n\n";
            exit;
        } else {
            if ($x < 4) {
                print "Invalid file name!\n\n";
            } else {
                print "Five tries were unsuccessful! Please check and try again!\n\n";
            }
        }
    }
    return;
}
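One thing to note about the loop above: `exit` ends the whole program as soon as a file is found, so nothing after the call to `openfile()` can ever run, and the sub never hands a file name back. A minimal sketch of a variant that returns the name instead is shown here. The sub name `prompt_for_file` and its two parameters are illustrative choices, not part of the original post, and the check uses `-f` (plain file) rather than `-e`, which is an assumption about what is wanted.

```perl
use strict;
use warnings;

# Ask for a file name up to $max_tries times. Return the first name that
# refers to an existing plain file, or undef if every attempt fails.
# $in is the handle answers are read from (defaults to STDIN), which also
# makes the sub easy to exercise without typing.
sub prompt_for_file {
    my ($max_tries, $in) = @_;
    $max_tries //= 5;
    $in        //= \*STDIN;

    for my $attempt (1 .. $max_tries) {
        print "Please enter file name: ";
        my $filename = <$in>;
        last unless defined $filename;    # input was closed (e.g. Ctrl-D)
        chomp $filename;

        if (-f $filename) {
            print "File found!\n";
            return $filename;             # hand the name back instead of exiting
        }
        print "Invalid file name!\n" if $attempt < $max_tries;
    }
    print "No valid file name was given. Please check and try again!\n";
    return;
}
```

A caller could then write `my $filename = prompt_for_file(5) or exit 1;` and carry on with the format check.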
First, there is no need to reinvent the wheel. Use the Bio::SeqIO module from BioPerl:
#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;

my $seqio = Bio::SeqIO->new(-file => "myfile.fa", -format => "fasta");
while (my $seq = $seqio->next_seq) {
    # do stuff with sequences...
}
If the FASTA file is invalid, this code will throw an exception, for example:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The sequence does not appear to be FASTA format (lacks a descriptor line '>')
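If the program should report the problem rather than die, the exception can be trapped with `eval`. The following is only a sketch: the sub name `bioperl_reads_fasta` and the optional error reference are made up for illustration, and whether a given malformed file actually triggers the exception may depend on the installed BioPerl version.

```perl
use strict;
use warnings;

# Return 1 if Bio::SeqIO reads every record in $path without throwing,
# otherwise 0. If $err_ref is a scalar reference, the error is stored in it.
sub bioperl_reads_fasta {
    my ($path, $err_ref) = @_;
    my $ok = eval {
        require Bio::SeqIO;    # loaded here so a missing BioPerl is caught too
        my $seqio = Bio::SeqIO->new(-file => $path, -format => 'fasta');
        while (my $seq = $seqio->next_seq) {
            # do stuff with sequences...
        }
        1;                     # reached only if nothing was thrown
    };
    $$err_ref = $@ if !$ok && ref $err_ref;
    return $ok ? 1 : 0;
}
```

Called as `bioperl_reads_fasta($filename, \my $err) or warn "Not readable as FASTA: $err";`, this keeps the parsing in BioPerl and leaves only the error handling to the script.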
Second, don't waste time checking for multiple incorrect attempts. Once is enough :)
This is a little simplistic. You should at least check that the first line starts with >, using /^>/. There should also be a check that no space follows the >. And then there is the problem of deciding whether the sequence lines are valid, for example that they contain only allowed residue characters.
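The checks listed above can be sketched in plain Perl as follows. This is a rough illustration, not a full validator: the sub name `looks_like_fasta` is hypothetical, and the rules it applies (blank lines are tolerated, sequence lines may contain only letters, `-` and `*`, and every header must be followed by at least one sequence line) are assumptions that may be too strict or too loose for some files.

```perl
use strict;
use warnings;

# Return 1 if the file at $path passes some basic FASTA sanity checks, else 0.
sub looks_like_fasta {
    my ($path) = @_;
    open my $fh, '<', $path or return 0;

    my $seen_header  = 0;    # has at least one '>' line been read?
    my $has_sequence = 1;    # did the previous record have sequence lines?
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;               # strip Unix or Windows line ending
        next if $line =~ /^\s*$/;           # tolerate blank lines (assumption)

        if ($line =~ /^>/) {
            return 0 unless $line =~ /^>\S/;    # no space directly after '>'
            return 0 unless $has_sequence;      # previous record was empty
            $seen_header  = 1;
            $has_sequence = 0;
        }
        else {
            return 0 unless $seen_header;              # data before any header
            return 0 unless $line =~ /^[A-Za-z*-]+$/;  # letters, gap, stop only
            $has_sequence = 1;
        }
    }
    close $fh;
    return ($seen_header && $has_sequence) ? 1 : 0;
}
```

The character class is deliberately permissive so that both nucleotide and protein files pass. A script that only handles DNA could narrow it to something like `[ACGTNacgtn-]`.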
Thanks!!! I will try it out!