An ontology for mapping statistics?


8.2 years ago

Charles Plessy ★ 2.9k

Hi Biostars, I am looking for an ontology that describes the mapping statistics typically produced by alignment pipelines, such as the number of reads extracted, mapped, or flagged as PCR duplicates. I have not found anything with search engines...

My plan is to output quality-control files in Turtle format, for instance:

@prefix qc: <http://example.com/SuperDuperQcOntology/> .
<HeLa_cells_repl_1>    qc:extracted    2674435    .
<HeLa_cells_repl_1>    qc:mapped       1566239    .
<HeLa_cells_repl_1>    qc:pcrdup        634533    .
<HeLa_cells_repl_2>    qc:extracted    1406337    .
<HeLa_cells_repl_2>    qc:mapped        989553    .
<HeLa_cells_repl_2>    qc:pcrdup        373958    .
etc...

My hope is that files in this format could be queried with SPARQL, while remaining easy to convert to tab-separated format for processing by simpler tools.
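To illustrate the tab-separated conversion, here is a minimal sketch in Python using only the standard library. It is not a real Turtle parser: it assumes the simple one-triple-per-line layout shown above, with integer counts, and silently skips everything else (including the `@prefix` line). The function name `turtle_to_tsv` and the `sample` column header are hypothetical names chosen for this example.

```python
import re

# Matches only the simple one-triple-per-line form used in the example above:
#   <subject>  prefix:predicate  integer  .
# This is an assumption about the file layout, not general Turtle syntax.
TRIPLE = re.compile(r'^<([^>]+)>\s+(\w+):(\w+)\s+(\d+)\s*\.\s*$')


def turtle_to_tsv(text):
    """Pivot `<sample> qc:stat count .` lines into a tab-separated table."""
    rows = {}     # sample name -> {statistic name: count}
    columns = []  # statistic names, in order of first appearance
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # skip @prefix declarations, blank lines, anything else
        sample, _prefix, stat, count = match.groups()
        rows.setdefault(sample, {})[stat] = int(count)
        if stat not in columns:
            columns.append(stat)
    # Header row, then one row per sample; missing statistics stay empty.
    out = ["\t".join(["sample"] + columns)]
    for sample, stats in rows.items():
        out.append("\t".join([sample] + [str(stats.get(c, "")) for c in columns]))
    return "\n".join(out)


example = """\
@prefix qc: <http://example.com/SuperDuperQcOntology/> .
<HeLa_cells_repl_1>    qc:extracted    2674435    .
<HeLa_cells_repl_1>    qc:mapped       1566239    .
<HeLa_cells_repl_1>    qc:pcrdup        634533    .
<HeLa_cells_repl_2>    qc:extracted    1406337    .
<HeLa_cells_repl_2>    qc:mapped        989553    .
<HeLa_cells_repl_2>    qc:pcrdup        373958    .
"""

print(turtle_to_tsv(example))
```

This gives one row per sample with the columns `sample`, `extracted`, `mapped` and `pcrdup`. For files that use other Turtle features (predicate lists with `;`, typed literals, multi-line statements), a proper RDF library would be the safer choice.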

ontology mapping QC • 1.4k views

