vcftools not ouputting log file when run from perl
1
0
Entering edit mode
5.0 years ago
dec986 ▴ 380

I am running 325 vcftools commands to generate Fst values, which obviously needs to be automated.

An example:

vcftools --vcf big.vcf --weir-fst-pop pop_lists/pop1.txt --weir-fst-pop pop_lists/pop2.txt  --out weir_fst_results/pop1_vs_pop2

and when I run this job, it works fine when I run it one by one by the command line, i.e. there are two files:

pop1_vs_pop2.log

and

pop1_vs_pop2.weir.fst

I need to get the Fst values from the log file.

The problem is that the log files are not generated when I run from my perl script, even though the commands executed are identical.

This sounds so trivial, but if it's not producing the log file, I'm wasting my time.

How can I get vcftools to produce the log file when executing from a script? why isn't vcftools producing the log file?

Perhaps there is a way of calculating this that doesn't involve using vcftools?

the script to run them:

sub execute {
    my $command = shift;
    print "Executing: $command\n";
    if (system($command) != 0) {
        say "$command failed.";
        die
    }
}

my @pop = list_regex_files('^[A-Y]{3}\.txt$', 'ethnicity_lists');
my $outdir = 'weir_fst_results';
if (not -d $outdir) {
    mkdir $outdir
}
my $vcf = 'big.vcf';
if (not -e $vcf) {
    say "$vcf doesn't exist.";
    die
}
foreach my $pop1 (0..$#pop) {
    my $label1;
    if ($pop[$pop1] =~ m/^ethnicity_lists\/([A-Y]{3})\.txt$/) {
      $label1 = $1
   } else {
      say "$pop[$pop1] failed regex.";
      die;
   }
    foreach my $pop2 (($pop1+1)..$#pop) {
        my $label2;
        if ($pop[$pop2] =~ m/^ethnicity_lists\/([A-Y]{3})\.txt$/) {
            $label2 = $1
        } else {
            say "$pop[$pop2] failed regex.";
            die;
        }
        my $out_prefix = "weir_fst_results/$label1" . "_vs_$label2";
        my $cmd = "vcftools --vcf $vcf ";
        $cmd .= "--weir-fst-pop $pop[$pop1] --weir-fst-pop $pop[$pop2] ";
        $cmd .= " --out $out_prefix > $out_prefix.out";
        execute($cmd);
    }
}

If I try to execute each comparison in a separate directory, with a separate nohup command, unfortunately all command output is still routed to the source nohup.out, and then the log file is empty, which defeats the purpose of running this.

vcftools vcf • 3.3k views
ADD COMMENT
0
Entering edit mode

Can you show us the relevant section of your perl script that runs these lines?

ADD REPLY
0
Entering edit mode

@RamRS I've updated the question

ADD REPLY
0
Entering edit mode

Try echo instead of execute and check the command. Also, look at the manual page for execute just to make sure STDIN and STDOUT are not being interfered with (not that it makes a difference when the out is explicitly specified, but just in case)

ADD REPLY
0
Entering edit mode

Have you checked if list_regex_files() does work properly?

What is stored in @pop after running line

my @pop = list_regex_files('^[A-Y]{3}\.txt$', 'ethnicity_lists');

?

ADD REPLY
0
Entering edit mode

@zubenel list_regex_files just contains a list of files, nothing else

ADD REPLY
0
Entering edit mode

A good way of debugging this is to print out the $cmd variable. I guess that it will be different from what you expect.

ADD REPLY
0
Entering edit mode

the problem is that the exact same command works differently when executed from the perl script. The output file is absent (and thus it is pointless to run vcftools at all). Is there a different program that calculates Fst? Vcftools does not work very well.

ADD REPLY
0
Entering edit mode

because the output file is absent, but the .fst file is present, is there a way that I can get the weighted score from the fst file?

ADD REPLY
0
Entering edit mode

Try using system() instead of execute().

ADD REPLY
0
Entering edit mode

unfortunately, that didn't change anything. Is vcftools really the only tool available that can calculate Fst? This problem sounds so trivial, but it's useless without that log file.

ADD REPLY
0
Entering edit mode

I'm not really sure. Do Fst files contain F-statistics like this one? If so, you can use plink.

It is definitely odd that a command that works from the shell does not work from a system() call. Can you try running the command from a non-interactive shell?

bash --norc --noprofile -c '<command>'

If the above command fails, then we will know the reason the perl call and the shell command work differently - that the shell command relies on some sort of alias or profile setting.

ADD REPLY
0
Entering edit mode

I thought that it might be my perlbrew installation, but that doesn't appear to be the case. The first cycle of parallel runs writes the log files, after that, no more log files, and no explanation. The commands are run identically. Looks like I have to give those other programs a try. Thanks for your help

ADD REPLY
0
Entering edit mode

"No explanation" could be because STDERR is being suppressed by perl. Try executing the command from perl but this time, suffix the command with >cmd.log 2>&1. This will capture STDERR and STDOUT to the cmd.log file, which will reveal any problems.

ADD REPLY
0
Entering edit mode

I'm running a test now, it should finish in about a day I think (on a small data set)

ADD REPLY
0
Entering edit mode

the problem is that at some point vcftools stops making the output file, and switches the important info to STDERR. I've never seen another program do this before.

ADD REPLY
0
Entering edit mode

That's odd, but then this behavior should have been replicated when you ran it from the shell as well. Why do you think that did not happen?

ADD REPLY
1
Entering edit mode

Some programs might try to determine if they are running in an interactive shell or not. Under some circumstances, Perl runs programs via execvp instead of using the shell, see https://perldoc.perl.org/functions/system.html This should only happen in case there are no shell metacharacters. You have ">" but to force a shell, you might try "/bin/sh -c " to force an additional layer, just in case the log file generation depends on how the program is invoked.

ADD REPLY
0
Entering edit mode

Hello!

Right now i´m having the same trouble using vcftools in a Nextflow pipeline. Did you find an answer?

ADD REPLY
0
Entering edit mode

If your question is more NextFlow related then you may want to ask it in their slack.

ADD REPLY
0
Entering edit mode
2.7 years ago

Try this:

vcftools --vcf big.vcf --weir-fst-pop pop_lists/pop1.txt --weir-fst-pop pop_lists/pop2.txt  --out weir_fst_results/pop1_vs_pop2 2>&1 | tee -a pop1_vs_pop2.log
ADD COMMENT

Login before adding your answer.

Traffic: 2888 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6