3.3 years ago
pollyyhjo
What is the 'identifier' of the first read in both files? Here is the code I get. Also, what does the identifier of each read tell us?
https://en.wikipedia.org/wiki/FASTQ_format
Specifically: https://en.wikipedia.org/wiki/FASTQ_format#Illumina_sequence_identifiers
So there is no identifier (as far as a sample ID goes) inside an Illumina file. You would normally have that information in the name of the file. If someone "coded" the names to be generic (like what you have), then you had better have a key/metadata file that links the index sequence you see in the header (GGACTCCT+CTCCTTAC) with the sample IDs/file names.

So we cannot tell what the identifier is from the code above?
Identifier for what? If for the sample, then no. But if you wanted to know which sequencer the sample ran on, then you get the instrument serial number NB551191. The flow cell serial number is HM5WHBGX5. The data is from lane 1.
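If it helps, here is a minimal sketch of pulling those fields out of a header line in Python, assuming the Casava 1.8+ layout described on the Wikipedia page linked above (`@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`). The function name `parse_illumina_header` and the example header are illustrative only. Just the instrument, flow cell, lane and index values come from this thread; the run number, tile, coordinates, read number and filter flag are made-up placeholders, since the original header was not posted in full.

```python
from typing import NamedTuple


class IlluminaHeader(NamedTuple):
    instrument: str      # sequencer serial number
    run_number: int
    flowcell_id: str
    lane: int
    tile: int
    x: int
    y: int
    read: int            # 1 or 2 for paired-end data
    is_filtered: str     # "Y" if the read failed the filter, else "N"
    control_number: int
    index: str           # index (barcode) sequence


def parse_illumina_header(line: str) -> IlluminaHeader:
    """Split a Casava 1.8+ style FASTQ header into named fields."""
    if not line.startswith("@"):
        raise ValueError("FASTQ header lines start with '@'")
    # The header has two space-separated parts, each colon-delimited.
    location, _, description = line[1:].strip().partition(" ")
    loc = location.split(":")
    desc = description.split(":")
    if len(loc) != 7 or len(desc) != 4:
        raise ValueError("not a Casava 1.8+ style Illumina header")
    return IlluminaHeader(
        instrument=loc[0],
        run_number=int(loc[1]),
        flowcell_id=loc[2],
        lane=int(loc[3]),
        tile=int(loc[4]),
        x=int(loc[5]),
        y=int(loc[6]),
        read=int(desc[0]),
        is_filtered=desc[1],
        control_number=int(desc[2]),
        index=desc[3],
    )


# Hypothetical header: run number, tile, x, y, read and filter flag are
# placeholders; the other values are the ones quoted in this thread.
EXAMPLE_HEADER = "@NB551191:1:HM5WHBGX5:1:11101:10000:1000 1:N:0:GGACTCCT+CTCCTTAC"

if __name__ == "__main__":
    header = parse_illumina_header(EXAMPLE_HEADER)
    print(header.instrument, header.flowcell_id, header.lane, header.index)
```

None of these fields is a sample ID, so you still need the file name or a metadata sheet to map the index sequence back to a sample.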