Does anyone have a utility or pointers to sort fastq files based on the quality value?
TiA, Nash
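I assume that you wish to sort the reads by average quality. This can be done easily with Heng Li's bioawk. Note that the first awk in the pipeline below must be the bioawk binary you built (it may be installed under the name bioawk), not the system awk, which has no -c option:
cat test.fq | awk -c fastx '{ print meanqual($qual), $name, $seq, $qual }' | sort -k1,1rn | awk '{ printf("@%s\n%s\n+\n%s\n", $2, $3, $4) }' > sorted.fq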
I wrote a mini help here: https://github.com/ialbert/bioawk/blob/master/README.bio.rst
Thanks Istvan, but it looks like there might be a typo somewhere? awk on my Ubuntu 12.04 LTS doesn't seem to recognize the -c option.
Nash
OK, thanks. I just tried make from your tar file, but it looks like this Ubuntu non-developer distribution is missing a lot of basic Unix libraries and tools (I had to get lex and yacc first!). Any idea where I can get zlib.h to compile addon.c, please? Sorry for the inconvenience.
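If building bioawk turns out to be a hassle, the same idea can be sketched with standard tools only. This is a rough sketch, not a replacement for bioawk: the function name sort_fastq_by_meanqual is made up for illustration, and it assumes Phred+33 qualities, exactly four lines per record (no wrapped sequences), and no tab characters in the header lines.

```shell
# Sort a FASTQ stream by mean Phred quality, best reads first.
# Assumptions (not from the thread): Phred+33 encoding, 4 lines per record,
# no tabs inside header lines. The function name is hypothetical.
sort_fastq_by_meanqual() {
  tab=$(printf '\t')
  # Join each 4-line record into one tab-separated line.
  paste - - - - |
  awk -F '\t' '
    # Build a lookup table from quality character to Phred score.
    BEGIN { for (i = 33; i < 127; i++) phred[sprintf("%c", i)] = i - 33 }
    {
      n = length($4); total = 0
      for (i = 1; i <= n; i++) total += phred[substr($4, i, 1)]
      # Prefix the record with its mean quality so sort can use it as the key.
      printf "%.4f\t%s\n", (n ? total / n : 0), $0
    }' |
  # Numeric, descending sort on the mean-quality column only.
  LC_ALL=C sort -t "$tab" -k1,1nr |
  # Drop the helper column and restore the 4-line layout.
  cut -f 2- |
  tr '\t' '\n'
}
```

Usage would be something like: sort_fastq_by_meanqual < test.fq > sorted.fq
Like the bioawk version, this only reorders reads; nothing is filtered out.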
Just to be sure, by sorting do you mean that the reads with the best average quality should be at the top of the list? If not, what is your criterion for sorting? And by sorting, do you mean no filtering?