Samtools can generate a consensus FASTQ for a sequenced genome from a BAM file. Sometimes you need to convert this file to FASTA, for example to get genotypes at the positions listed in a BED file using the bedtools fastaFromBed command. There are a lot of one-liners for converting FASTQ to FASTA, but they can't handle files this big.
So here is a PHP script from my colleague Nickolay Kulemin that does this conversion just fine:
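<?php
/* This script converts FASTQ to FASTA for human genomes. It looks for lines starting with "@chr" and with "+" and writes the sequence between them in FASTA format. Usage: php fastq_fasta.php input_file output_file */
if ($argc != 3) die("File_please (input output)!\n");
$fp = fopen($argv[1], "r");
$fw = fopen($argv[2], "w");
if (!$fp || !$fw) die("Cannot open input or output file\n");
$type = 0;  // 0 = outside a record, 1 = sequence lines, 2 = quality lines
$first = 0; // set to 1 once the first header has been written
while (!feof($fp)) {
    $type_old = $type;
    $line = fgets($fp, 4000);
    if (feof($fp)) break;
    if (substr($line, 0, 4) == "@chr") {
        // Header line: drop the leading "@" and write it as a FASTA header
        $type = 1;
        $lline = explode("\n", $line);
        $gline = substr($lline[0], 1);
        if ($first == 0) {
            $ggline = ">$gline\n";
            $first = 1;
        } else $ggline = "\n>$gline\n"; // the leading newline ends the previous sequence
        fwrite($fw, $ggline);
        echo $ggline; // progress output: shows each chromosome as it is reached
    }
    if (substr($line, 0, 1) == '+') $type = 2;
    if ((substr($line, 0, 1) == ' ') || (substr($line, 0, 1) == "\n")) $type = 0;
    // Only lines in state 1 that also follow a state-1 line are sequence data
    if (($type != $type_old) || ($type != 1)) continue;
    $line = trim($line);
    $line = strtoupper($line);
    fwrite($fw, $line);
}
if ($first == 1) fwrite($fw, "\n"); // end the last sequence with a newline
fclose($fp);
fclose($fw);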
?>
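To see what the script does with a wrapped, multi-line FASTQ record, here is a minimal sketch of the same state machine working on an in-memory array of lines. The function name fastq_lines_to_fasta and the sample data are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not from the original script): applies the same
// header / "+" state machine to an array of lines and returns FASTA text.
function fastq_lines_to_fasta(array $lines): string {
    $out = "";
    $in_sequence = false; // true between an "@chr" header and the "+" line
    foreach ($lines as $line) {
        $line = rtrim($line, "\r\n");
        if (substr($line, 0, 4) === "@chr") {
            // New record: end the previous sequence line, then write the header
            if ($out !== "") $out .= "\n";
            $out .= ">" . substr($line, 1) . "\n";
            $in_sequence = true;
        } elseif (substr($line, 0, 1) === "+") {
            $in_sequence = false; // quality block: skip until the next header
        } elseif ($in_sequence) {
            // Sequence may be wrapped over many lines; join them, upper-cased
            $out .= strtoupper(trim($line));
        }
    }
    return $out === "" ? "" : $out . "\n";
}

// Made-up sample: two records, the first with wrapped sequence and quality
$sample = [
    "@chr1", "acgtnnac", "ggta", "+", "IIIIIIII", "IIII",
    "@chr2", "ttga", "+", "!!!!",
];
echo fastq_lines_to_fasta($sample);
```

Like the script above, this sketch would misread a quality line that happens to begin with "@chr", so treat it as an illustration of the idea and not as a general FASTQ parser.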
I just couldn't find one by googling :)
And why not PHP?