Entering edit mode
3 months ago
Pierre Lindenbaum
164k
Hi all,
I want to get the position of the first variant of each chromosome in a VCF using htslib. This is my code so far:
int scan(const char* fname) {
khint_t k;
bcf_srs_t *sr = bcf_sr_init();
bcf_sr_set_opt(sr, BCF_SR_PAIR_LOGIC, BCF_SR_PAIR_BOTH_REF);
bcf_sr_set_opt(sr, BCF_SR_REQUIRE_IDX);
bcf_sr_add_reader(sr,fname);
bcf_hdr_t* hdr = sr->readers[0].header;
hdict_t *d = (hdict_t*)hdr->dict[BCF_DT_CTG];
int nseq = kh_size(d);
int rid;
for (rid=0;rid< nseq;++rid) {
const char* contig_name = hdr->id[BCF_DT_CTG][rid].key;
bcf_hrec_t *hrec = hdr ? bcf_hdr_get_hrec(hdr, BCF_HL_CTG, "ID", contig_name, NULL) : NULL;
int hkey = hrec ? bcf_hrec_find_key(hrec, "length") : -1;
int high = atoi(hrec->vals[hkey]);
fprintf(stderr,"%s\t%s = %d\n", contig_name , hkey<0?".":hrec->vals[hkey],high);
char tmp[1000];
sprintf(tmp,"%s:1-%d",contig_name,high);
bcf_sr_set_regions(sr,tmp, 0);
while ( bcf_sr_next_line(sr) ) {
bcf1_t *line = sr->readers[0].buffer[0];
fprintf(stderr,"%d\t%d\n", line->rid ,line->pos);
break;
}
}
if ( sr->errnum ) ERROR("Error: %s\n", bcf_sr_strerror(sr->errnum));
bcf_sr_destroy(sr);
return EXIT_SUCCESS;
}
`
the problem is that the bcf_sr_next_line
is not reset for each call of bcf_sr_set_regions
, the break
statement is useless, the iterator continues for the first chromosome .
So , is there a method in htslib to reset a bcf_sr_ iterator before calling bcf_sr_set_regions
again ?
thanks, P
tagging: John Marshall
cross-posted: https://github.com/samtools/htslib/issues/1815