I have many FASTQ files, and I need to select the ones in Sanger format. My advisor told me to run: zcat *.fastq.gz | grep "J" (it seems there is no "J" in the information of Sanger format, while you can always find "J" in the other formats, though I'm not sure about this). If the output is empty, the file is in the Sanger format that we want; if there is anything in the output, the file should be discarded.
My question is how to write a script that does this. I'm a beginner and have never written code before. Thanks!
I'm unclear on the question. Your advisor gave you a series of bash commands that will tell you whether the file is Phred+33 or Phred+64. That should be all you need. Are you trying to figure out how to parse the output?
BTW, for info on the different formats, I like the Wikipedia page on FASTQ format, http://en.wikipedia.org/wiki/FASTQ_format. That's where your advisor got his "J" trick from.
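For what it's worth, here is a tiny sketch of why "J" can separate the two encodings. It only does the arithmetic on the ASCII code of "J"; the variable name `code` is just for illustration. Under Phred+33 a "J" would mean quality 41, which is above the 0 to 40 range usually seen in Sanger files, while under Phred+64 it means quality 10, which is common.

```shell
# ASCII code of the character "J" (the leading quote asks printf for the code).
code=$(printf '%d' "'J")
echo "ASCII code of J: $code"

# Quality score that "J" would encode under each offset.
echo "As Phred+33: $((code - 33))"
echo "As Phred+64: $((code - 64))"
```

This is only a heuristic: it assumes the Phred+33 files never reach quality 41.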
Thanks Mitch, yes, you are right. I'm trying to figure out how to parse the output based on the trick my advisor told me; i.e., I hope somebody can give me some clues for a script that parses the output. I understand the logic but don't know how to write the script.
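One possible way to script this, as a minimal sketch rather than a definitive answer: loop over the files and run the advisor's test on each one separately, so the result can be tied to a file name. The function name `classify_fastq` and the labels "sanger" and "discard" are made up for this example. It relies on `grep -q`, which prints nothing and only reports through its exit status whether a match was found.

```shell
# Apply the advisor's "J" test to one gzipped FASTQ file.
# classify_fastq is a hypothetical helper name, not part of any tool.
classify_fastq() {
    # grep -q is silent; the "if" branch runs when a "J" was found.
    if zcat "$1" | grep -q "J"; then
        echo "discard"
    else
        echo "sanger"
    fi
}

# Check every .fastq.gz file in the current directory.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue   # skip when the pattern matched no files
    printf '%s\t%s\n' "$f" "$(classify_fastq "$f")"
done
```

Saved as, say, check_format.sh (a hypothetical name) and run with `bash check_format.sh` in the directory holding the files, this prints one line per file with the file name and its label, separated by a tab. Note that grep looks at every line of the file, exactly as the advisor's command does, so a "J" in a read name would also count as a match.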