I have a question about Perl code to create a subset of a FASTQ file. I must take every 40th sequence from a larger file. The original file is so big that I can't use arrays and such.
I have this code:
use strict;
use warnings;
open (my $fh,"fast.fastq") or die "Failed to open file: $!\n";
open (my $out_fh, "> out10.fastq");
while(<$fh>) {
    chomp;
    #$. holds the number of the last read line
    #every 40 seq
    next if $. <= 40;
    my $header_line = $_;
    my $seq = <$fh>;
    my $plus = <$fh>;
    my $qualities = <$fh>;
    print $out_fh "$header_line$seq$plus$qualities";
}
close $fh;
close $out_fh;
It skips the first 40 lines, which isn't much good to me, and then prints out the rest of the file, which also isn't much good to me.
I am looking for a way to loop so that I get every 40th sequence (not line), all the way through the file, to build the subset.
I have a FASTQ file. I want to take out every 40th sequence and put them in a new file. A small example would be, if I had a FASTQ file of 12 sequences and wanted every 4th.
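160 = 40 * 4 is the number of lines in a block of 40 sequences (4 lines per sequence), and 156, 157, 158, 159 are the 0-based indexes of the four lines of the 40th sequence within each block.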
#!/usr/bin/env python
with open('reads.fastq') as f:
    with open('output.fastq', 'w') as out:
        for i, line in enumerate(f):
            # keep the four lines (0-based 156..159) of every 40th sequence
            if i % 160 in (156, 157, 158, 159):
                out.write(line)
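The same modulo idea can be generalised to "every n-th sequence". This is only a minimal sketch, assuming every record is exactly four lines (no wrapped sequence or quality lines); the names `every_nth_record` and `subsample_fastq` are made up for illustration and do not come from any library. It streams the file line by line, so nothing large is held in memory.

```python
def every_nth_record(lines, n, lines_per_record=4):
    """Yield the lines belonging to every n-th record.

    Assumes each record is exactly `lines_per_record` lines long
    (hypothetical helper, not part of any FASTQ library).
    """
    block = n * lines_per_record            # e.g. 40 * 4 = 160
    first_kept = block - lines_per_record   # e.g. 156, first line of the 40th record
    for i, line in enumerate(lines):
        # the last `lines_per_record` lines of each block are the n-th record
        if i % block >= first_kept:
            yield line


def subsample_fastq(in_path, out_path, n=40):
    """Stream in_path and write every n-th record to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(every_nth_record(src, n))
```

With the small example from the question (12 sequences, every 4th), `every_nth_record(lines, 4)` would keep sequences 4, 8 and 12. For the real file, something like `subsample_fastq('reads.fastq', 'output.fastq', 40)` would be the call, with the file names taken from the snippet above.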
Thanks, but it didn't work; it just printed out the 40th sequence and no others.
Thanks, but it didn't work. It only took lines 3 to 6, then lines 32 to 33.
Changed solution to one from a language I'm more familiar with :)
I changed 3 to 1 and it worked. Thanks
Ok, great :) It also seems that you need to change
to