A way to get a list of determined and undetermined indexes using illumina interop library?
22 months ago

Hi, I would like to get the determined and undetermined indexes from the Illumina interop files. I can only seem to get the determined ones this way:

from interop import py_interop_run_metrics, py_interop_run, py_interop_summary

run_folder = "/path/to/run/folder"

run_metrics     = py_interop_run_metrics.run_metrics()
valid_to_load   = py_interop_run.uchar_vector(py_interop_run.MetricCount, 0)
py_interop_run_metrics.list_index_metrics_to_load(valid_to_load)

run_metrics.read(run_folder, valid_to_load)  # loads the InterOp files; run_folder stays the path
summary         = py_interop_summary.index_flowcell_summary()
py_interop_summary.summarize_index_metrics(run_metrics, summary)

# The flowcell summary is indexed by lane; each lane summary holds one entry per known index.
all_lanes = []
for lane_idx in range(summary.size()):
    lane_summary = summary.at(lane_idx)
    lane_data = []
    for entry_idx in range(lane_summary.size()):
        index_summary = lane_summary.at(entry_idx)
        lane_data.append(
            {
                'id':               index_summary.id(),
                'project_name':     index_summary.project_name(),
                'sample_id':        index_summary.sample_id(),
                'index1':           index_summary.index1(),
                'index2':           index_summary.index2(),
                'fraction_mapped':  index_summary.fraction_mapped(),
            }
        )
    all_lanes.append(lane_data)
print(all_lanes)

I believe the undetermined ones are in the Stats.json file, so I could parse that manually, but I am curious whether there is a way to get them through the interop lib.

Any help or clarification would be great!

interop python illumina • 1.5k views
22 months ago

It turns out that InterOp only includes known barcodes. bcl2fastq and bclconvert produce files that show unknown barcodes.
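For anyone who wants to read the unknown barcodes from the bcl2fastq output, here is a minimal sketch. It assumes Stats.json has a top-level "UnknownBarcodes" list whose entries carry a "Lane" number and a "Barcodes" mapping of barcode string to read count; that layout is an assumption based on typical bcl2fastq output, so check it against your own file. The function names are made up for illustration.

```python
import json
from collections import Counter


def load_stats(path):
    """Parse a bcl2fastq Stats.json file into a dict."""
    with open(path) as handle:
        return json.load(handle)


def unknown_barcodes_by_lane(stats):
    """Return {lane number: Counter(barcode -> read count)}.

    Assumes the layout described above: stats["UnknownBarcodes"] is a list
    of {"Lane": int, "Barcodes": {barcode: count}} entries.
    """
    per_lane = {}
    for entry in stats.get("UnknownBarcodes", []):
        per_lane[entry["Lane"]] = Counter(entry.get("Barcodes", {}))
    return per_lane
```

Calling `unknown_barcodes_by_lane(load_stats("Stats.json"))` and then `.most_common(20)` on one lane's counter would list that lane's most frequent unknown barcodes.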

22 months ago

A follow-up question: does anyone know how to get ALL the unknown barcodes? The bcl2fastq Stats.json only lists the top 1000 unknown barcodes. If anyone knows a way to get the full list, I would be very appreciative!


How about pulling them straight from I1+I2 of the Undetermined fastq.gz files, if you're running bcl2fastq anyway (and using --create-fastq-for-index-reads)?
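A rough sketch of that idea, in case it helps: read the I1 and I2 Undetermined files in lockstep and tally every index pair, so nothing is cut off at the top 1000. The helper names and the "index1+index2" key format are my own choices for illustration, and the file names in the usage note are only an assumed example of bcl2fastq naming.

```python
import gzip
from collections import Counter
from itertools import islice


def index_sequences(lines):
    """Yield the sequence line (the 2nd of every 4 lines) of a FASTQ stream."""
    for seq in islice(lines, 1, None, 4):
        yield seq.strip()


def count_index_pairs(i1_lines, i2_lines):
    """Count 'index1+index2' pairs from two FASTQ streams read in lockstep."""
    pairs = zip(index_sequences(i1_lines), index_sequences(i2_lines))
    return Counter(f"{a}+{b}" for a, b in pairs)


def count_from_files(i1_path, i2_path):
    """Open two gzipped FASTQ files as text and count their index pairs."""
    with gzip.open(i1_path, "rt") as i1, gzip.open(i2_path, "rt") as i2:
        return count_index_pairs(i1, i2)
```

Something like `count_from_files("Undetermined_S0_L001_I1_001.fastq.gz", "Undetermined_S0_L001_I2_001.fastq.gz")` would then give the full tally for one lane. This relies on both files listing reads in the same order.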


That is by design. You could do what Jesse suggested.
