Training dataset for NGS HLA typing (reads >200bp from PCR amplicons)
10.0 years ago

I'm looking for a training set for human HLA typing with long reads (454, Ion Torrent or MiSeq 300bp) obtained by PCR (amplicon sequencing). I don't mind which MHC loci are covered or whether the data is genomic or transcriptomic, but I need a dataset that contains:

- NGS reads
- Sequences of the primers used
- Sequences of the barcodes used to tag samples
- Reference genotypes of the samples to validate predictions (by Sanger sequencing or another well-established method)
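
To make clear why all four pieces are needed, here is a minimal sketch of how they would be used together: barcodes assign reads to samples, primers are trimmed off, and the reference genotypes score the predictions. The function names, the exact-match barcode/primer logic, and the `min_length` cutoff applied after trimming are illustrative assumptions, not part of any particular HLA typing tool.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, forward_primer, min_length=200):
    """Assign reads to samples by their leading barcode (hypothetical helper).

    reads: iterable of read sequences (strings)
    barcodes: dict mapping barcode sequence -> sample name
    Assumes each read starts with barcode + forward primer, matched exactly
    (no mismatches allowed); real pipelines usually tolerate some errors.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            prefix = barcode + forward_primer
            if read.startswith(prefix):
                # Strip barcode and primer, keep only sufficiently long inserts.
                insert = read[len(prefix):]
                if len(insert) >= min_length:
                    by_sample[sample].append(insert)
                break
    return dict(by_sample)


def concordance(predicted, reference):
    """Fraction of shared samples whose predicted allele pair equals the reference.

    Both arguments map sample name -> list of two allele names; allele order
    within a pair is ignored.
    """
    shared = set(predicted) & set(reference)
    if not shared:
        return 0.0
    hits = sum(sorted(predicted[s]) == sorted(reference[s]) for s in shared)
    return hits / len(shared)
```

Without the barcodes the reads cannot be split per sample, without the primers the amplicon ends cannot be trimmed, and without the reference genotypes `concordance` has nothing to compare against.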

It's very hard to find any public data in the literature. There are a lot of papers on the topic, but most of them are from companies (e.g. Roche) and they don't publish the data.

Thanks in advance.

PS: HapMap and 1000 Genomes reads are not suitable: they are not from PCR amplicons and they are too short ;)

NGS HLA Typing Amplicon • 2.8k views
6.7 years ago
Ömer An ▴ 260

The references for HLA typing tools might help, as the authors usually train their software on public datasets:

https://www.nature.com/articles/jhg2015102/tables/2
