Please give me a lecture on Illumina headers.
I have been downloading NGS ChIP-Seq data to run comparative studies against our own ChIP-Seq data. I have noticed that the headers on the data I download from NCBI are very different from our headers. For example, here is the header for a run from the NCBI Sequence Read Archive (SRA), run accession SRR1747943. Notice that the header has the SRR number in it.
@SRR1747943.1.1 1 length=36
CTATTAAGTGACCTGAGTGGCAGGAAGAAGTAGCGC
+SRR1747943.1.1 1 length=36
HHHHHHHHHHHHHHHHHHHHGEGG############
Here is an example of the header from one of our runs. It has the standard Illumina header information, including the sequencer identifier and so on.
@HWI-ST425:160:D1JFWACXX:3:1101:1247:1946 1:N:0:GCCAAN
NAAACTCCTTCATGAAGCTGATACAAGATGTCATGAATTGTNTTGCATCTGNNNATCTTCTGAGNNNNNNNNNNNAAAAGCATCACATTNNNNNNNNCCTT
+
.#4=DDFFFHHHHHJJJIIJJJJJJJJJIIIHHIJJJIIJJJ#1?FGHIHJI###00-BFGIGGG###########,,5;A>CD;@CDDC############
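To make clear what I mean by the standard header information, here is how I would pull our header apart. This is only a minimal sketch, assuming the Casava 1.8+ layout (instrument:run:flowcell:lane:tile:x:y, then a space, then read:filter flag:control bits:index). The function name and the dictionary keys are my own, not from any library.

```python
def parse_illumina_header(line):
    """Split a Casava 1.8+ style FASTQ header line into named fields.

    Assumes the layout
    @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
    """
    text = line.strip()
    if text.startswith("@"):
        text = text[1:]

    # The part before the first space identifies the cluster,
    # the part after it describes the read.
    name, _, comment = text.partition(" ")
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    read, filtered, control, index = comment.split(":")

    return {
        "instrument": instrument,
        "run_number": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x),
        "y": int(y),
        "read": int(read),
        "filtered": filtered == "Y",  # "Y" means the read failed the filter
        "control": int(control),
        "index": index,
    }


fields = parse_illumina_header(
    "@HWI-ST425:160:D1JFWACXX:3:1101:1247:1946 1:N:0:GCCAAN"
)
for key, value in fields.items():
    print(key, value)
```

None of these fields (instrument, flowcell, lane, tile, coordinates, index) appear in the SRR1747943 header above, which is what prompted my question.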
My question is: was the header changed by the SRA when the data was deposited, or was it changed by the person who deposited the sequence? Is there some way I can find the original header for the SRR1747943 run?