Identifying sample from FASTQ file
7.7 years ago

I have a very old FASTQ file that looks like this:

@ILLUMINA-DB1410_0001:1:1:1092:14610#0/1
CTAAATAAGNCCTTTCCCCACCTGTTTGATTCTGTTTCCT
+ILLUMINA-DB1410_0001:1:1:1092:14610#0/1
bbbbbbbbbBbbbbbbbbbbbaabbbbbbbbbbbbbbbbb
@ILLUMINA-DB1410_0001:1:1:1092:7332#0/1
CACAGTCTCNTGGGGGATGAATGACAAGTGAGCAGAATGT
+ILLUMINA-DB1410_0001:1:1:1092:7332#0/1
bbbbbbbbbB^]^]_XUZ[WGOOOO`^^]^aaaaa_`aWa

Now I want to find out which sample it came from. I have barcodes, as in these file names: s_5_CGTACG_sequence.txt, s_5_GAGTGG_sequence.txt, s_5_ACTGAT_sequence.txt

Is there any way to find out whether this FASTQ file is associated with one of the three barcodes above?

I read about the FASTQ format but could not find a solution.

Thanks

7.7 years ago
GenoMax 147k

I am afraid you are correct that you can't tell which sample this is by looking at the data alone. As you noted, there are no barcodes in the FASTQ headers (unless they were replaced by the #0 field, which again won't help unless you have a key for that code).

If you had other independent data (e.g. if the three samples were from three distinct organisms/genomes, or had specific SNPs that could be identified in the data by using a known reference), then you may be able to make some progress.
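
To check the headers systematically rather than by eye, here is a minimal Python sketch. It assumes the older Illumina header layout `@instrument:lane:tile:x:y#index/read`, which matches the reads shown in the question; the function names `index_field` and `looks_like_barcode` are made up for illustration. It pulls out the field after `#` and reports whether it looks like a nucleotide barcode.

```python
import re

# Assumed layout of older Illumina headers:
#   @instrument:lane:tile:x:y#index/read
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:]+):(?P<lane>\d+):(?P<tile>\d+):"
    r"(?P<x>\d+):(?P<y>\d+)#(?P<index>[^/]+)/(?P<read>\d)$"
)

def index_field(header):
    """Return the text between '#' and '/' in a header, or None if no match."""
    match = HEADER_RE.match(header.strip())
    return match.group("index") if match else None

def looks_like_barcode(index):
    """True only if the index field consists solely of nucleotide letters."""
    return index is not None and re.fullmatch(r"[ACGTN]+", index) is not None

if __name__ == "__main__":
    # The two headers from the question.
    headers = [
        "@ILLUMINA-DB1410_0001:1:1:1092:14610#0/1",
        "@ILLUMINA-DB1410_0001:1:1:1092:7332#0/1",
    ]
    for header in headers:
        idx = index_field(header)
        verdict = "barcode present" if looks_like_barcode(idx) else "no barcode"
        print(header, "->", idx, "->", verdict)
```

For the reads in the question the index field is just `0`, so nothing in the headers links the file to CGTACG, GAGTGG, or ACTGAT. If a header did carry a sequence after `#`, it could be compared directly against the three barcodes.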
