I am writing a library many people are likely to use. I do not want them to have to install any additional libraries or know where their libraries are installed.
Therefore I want to include a single header for reading BAM files. Are there any single-file headers for reading BAMs that I can include in my project?
Edit: I only need the chromosome, start, (end or length) and strand.
Not sure about the language you are using for your code but it is easy to use a range of Python or Java libraries in your program. If you are using Python then you could use pysam (it uses htslib) and if you are using Java then you could use htsjdk.
Pysam requires htslib. See pysam: How do I type an AlignedSegment in Cython?
Sorry, I updated my comment accordingly. Is there a specific reason why you are not planning to use htslib?
Compile errors like the ones I showed in the linked question.
pysam can easily be installed via conda or pip. There is no need for from-scratch compilation. Writing your own code for a standard task like reading a BAM file is IMHO not only unnecessary but a poor investment of resources. htslib (or its analogues in other languages) is a long-running project that has been developed for years by experts in the field. It contains features for quality and integrity control of BAM files that you should exploit rather than dismiss. Do not reinvent the wheel. Use existing code and solutions and build your tool around them, focusing on the novelty of your tool.
But pysam is slow, as each record is a Python object which needs to be parsed.
What's wrong with my previous answer, C: Can I read chrom, strand, pos, len from bam files without htslib?
Every user who installs my software would have to know where their htslib is and update their setup.py to reflect the location.
Your users would probably be using conda anyway, so that becomes a non-issue.
No, because it's a git submodule, so the libraries would be under your main folder.
Oh please, tell me you know how to compile a C/C++ program with make.
MACS2 seems to be able to read BAM files without any special utilities: https://github.com/taoliu/MACS/blob/33187eae605081c8ddad9313a886bd01d2c654cd/MACS2/IO/Parser.pyx#L732
However, I do not know whether the code is brittle or especially fast. It uses Python's struct module, so it cannot be compiled down to pure C/C++.
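To give an idea of what such a struct-based reader involves, here is a minimal sketch using only the Python standard library. It is not the MACS2 code. It follows the record layout described in the SAM/BAM specification and relies on BGZF being a series of concatenated gzip members, which the gzip module can read. The function names iter_bam_intervals and read_bam_intervals are made up for this example. It skips unmapped reads, performs no integrity checks, ignores very long CIGARs stored in the CG tag, and I have not benchmarked it, so treat it as an illustration rather than a replacement for htslib.

```python
import gzip
import struct

# CIGAR op codes that consume reference bases: M=0, D=2, N=3, '='=7, X=8.
REF_CONSUMING_OPS = {0, 2, 3, 7, 8}


def _read_exact(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated BAM stream")
    return data


def iter_bam_intervals(stream):
    """Yield (chrom, start, end, strand) from a decompressed BAM byte stream.

    start is 0-based and end is exclusive, as in BED.
    """
    if _read_exact(stream, 4) != b"BAM\x01":
        raise ValueError("missing BAM magic bytes")
    (l_text,) = struct.unpack("<i", _read_exact(stream, 4))
    _read_exact(stream, l_text)  # skip the SAM header text
    (n_ref,) = struct.unpack("<i", _read_exact(stream, 4))
    chroms = []
    for _ in range(n_ref):
        (l_name,) = struct.unpack("<i", _read_exact(stream, 4))
        name = _read_exact(stream, l_name)  # NUL-terminated reference name
        chroms.append(name.rstrip(b"\x00").decode("ascii"))
        _read_exact(stream, 4)  # skip the reference length

    while True:
        size_bytes = stream.read(4)
        if not size_bytes:
            return  # clean end of file
        if len(size_bytes) != 4:
            raise EOFError("truncated BAM stream")
        (block_size,) = struct.unpack("<i", size_bytes)
        block = _read_exact(stream, block_size)
        # Start of the fixed part: refID, pos, l_read_name, mapq, bin,
        # n_cigar_op, flag. The whole fixed part is 32 bytes long.
        ref_id, pos, l_read_name, _mapq, _bin, n_cigar, flag = struct.unpack_from(
            "<iiBBHHH", block, 0
        )
        if ref_id < 0 or flag & 0x4:
            continue  # unmapped read
        # The CIGAR array follows the 32-byte fixed part and the read name.
        cigar = struct.unpack_from("<%dI" % n_cigar, block, 32 + l_read_name)
        # Each CIGAR entry is (length << 4) | op_code.
        ref_len = sum(op >> 4 for op in cigar if (op & 0xF) in REF_CONSUMING_OPS)
        yield chroms[ref_id], pos, pos + ref_len, "-" if flag & 0x10 else "+"


def read_bam_intervals(path):
    """Open a BAM file via gzip and yield (chrom, start, end, strand)."""
    with gzip.open(path, "rb") as stream:
        yield from iter_bam_intervals(stream)
```

Usage would be something like `for chrom, start, end, strand in read_bam_intervals("reads.bam"): ...`, where reads.bam is a placeholder path. Porting the same layout to C would additionally need zlib for the decompression step.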
MACS2 likes to break in many places with exceptionally cryptic error messages. You need to just use htslib and get on with it.