HTSLIB/VCF/C : get the first variant of each chromosome using htslib
0
0
Entering edit mode
3 months ago

Hi all,

I want to get the position of the first variant of each chromosome in a VCF using htslib. This is my code so far:

int scan(const char* fname) {
    khint_t k;
    bcf_srs_t *sr = bcf_sr_init();
    bcf_sr_set_opt(sr, BCF_SR_PAIR_LOGIC, BCF_SR_PAIR_BOTH_REF);
    bcf_sr_set_opt(sr, BCF_SR_REQUIRE_IDX);
    bcf_sr_add_reader(sr,fname);

    bcf_hdr_t* hdr = sr->readers[0].header;

    hdict_t *d = (hdict_t*)hdr->dict[BCF_DT_CTG];
    int nseq = kh_size(d);
    int rid;
    for (rid=0;rid< nseq;++rid) {
        const char* contig_name = hdr->id[BCF_DT_CTG][rid].key;
        bcf_hrec_t *hrec = hdr ? bcf_hdr_get_hrec(hdr, BCF_HL_CTG, "ID", contig_name, NULL) : NULL;
        int hkey = hrec ? bcf_hrec_find_key(hrec, "length") : -1;
        int high = atoi(hrec->vals[hkey]);
        fprintf(stderr,"%s\t%s = %d\n", contig_name , hkey<0?".":hrec->vals[hkey],high);
        char tmp[1000];
        sprintf(tmp,"%s:1-%d",contig_name,high);
        bcf_sr_set_regions(sr,tmp, 0);
        while ( bcf_sr_next_line(sr) ) {
            bcf1_t *line = sr->readers[0].buffer[0];
            fprintf(stderr,"%d\t%d\n", line->rid ,line->pos);
            break;
            }
        }
    if ( sr->errnum ) ERROR("Error: %s\n", bcf_sr_strerror(sr->errnum));
    bcf_sr_destroy(sr);
    return EXIT_SUCCESS;
    }
`

the problem is that the bcf_sr_next_line is not reset for each call of bcf_sr_set_regions, the break statement is useless, the iterator continues for the first chromosome .

So , is there a method in htslib to reset a bcf_sr_ iterator before calling bcf_sr_set_regions again ?

thanks, P

htslib vcf c • 462 views
ADD COMMENT
1
Entering edit mode

tagging: John Marshall

ADD REPLY
0
Entering edit mode
ADD REPLY

Login before adding your answer.

Traffic: 1693 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6