Hi all,
I should be receiving several million PE reads from multiple samples/lanes soon and I am wondering what format the files take.
I know they will be FASTQ, but do they generally come as one sample per file, one lane per file, or something else? Also, do the paired ends come in the same file or in different files?
I plan to align with BWA and it looks like it expects separate files for the paired ends. Is this correct? If samples/lanes/ends need to be separated into individual files, is there a standard way of doing this?
Thanks in advance.
Travis: yes, if you do paired-end sequencing, you get two files. The naming depends on the technology or company; e.g. we get files named 123456_s_N_[12]_lib.txt, where the first number is a serial number, N is the lane (I believe), [12] is 1 or 2, i.e. which end, and lib is a designation for the library used. But the contents are as Pierre says.
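If it helps, here is a rough Python sketch of how you could match up the two files of each pair from names like the one above. It assumes the lane is a plain number and that every file follows the serial_s_lane_end_lib.txt layout exactly; the function name pair_fastq_files and the regular expression are made up for illustration, so adjust them to whatever your sequencing provider actually delivers.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the example name: <serial>_s_<lane>_<end>_<lib>.txt
# The lane is assumed to be numeric; <end> is 1 or 2.
NAME_RE = re.compile(
    r"^(?P<serial>\d+)_s_(?P<lane>\d+)_(?P<end>[12])_(?P<lib>.+)\.txt$"
)

def pair_fastq_files(filenames):
    """Group file names by (serial, lane, lib) and return complete pairs.

    Hypothetical helper: returns a dict mapping each key to a tuple
    (end-1 file, end-2 file). Names that do not match the assumed
    pattern, or that lack a mate, are skipped.
    """
    groups = defaultdict(dict)
    for name in filenames:
        m = NAME_RE.match(name)
        if m is None:
            continue  # not a read file in the assumed naming scheme
        key = (m["serial"], m["lane"], m["lib"])
        groups[key][m["end"]] = name
    return {
        key: (ends["1"], ends["2"])
        for key, ends in groups.items()
        if "1" in ends and "2" in ends
    }

if __name__ == "__main__":
    example = [
        "123456_s_1_1_libA.txt",
        "123456_s_1_2_libA.txt",
        "123456_s_2_1_libA.txt",  # mate missing, so this one is skipped
    ]
    for key, (read1, read2) in pair_fastq_files(example).items():
        print(key, read1, read2)
```

Each resulting pair is then the two read files you would hand to the aligner together.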
Thanks! Are there two files per sample?