Entering edit mode
8.1 years ago
Hakan.Erol.HE
▴
10
Hello Biostars Community,
I am a newbie to the bioinformatics world and need some help with a project I am working on!
I need to use htsjdk to extract every tlen value from a BAM file and have a few questions.
- How do a read the BAM file in the first place? I have found SamReader that might do the job like so : SamReader samReader= SamReaderFactory.makeDefault().open(new File("mybam.bam"));
- How do I find the position of individual chromosomes? I want to multithread the application, reading each chromosome in parallel.
- Once I get the position to start reading from, how do I read each line, then the tlen field?
Sorry for all of the questions, I have been googling a lot and haven't found answers!
Any help is appreciated, thanks in advance!
Ah thank you so much! getInferredInsertSize is what I was looking for, much appreciated!
Might you know how to start iterating from the start of a particular chromosome?
Hi- See edit. Have a look at the htsjdk docs for the meaning of the various method. Also make sure
ValidationStringency.SILENT
is ok for you.Much appreciated. One more question, is there a function that returns the list of chromosomes/contigs so I can dynamically build that list?
Hi again - See edit, I haven't tested it but it should be about right.
The contig names do indeed work thank you. However, I am running into an issue with reading from a specific one using the query method.
I get this error: htsjdk.samtools.SAMException: No index is available for this BAM file. How do I provide the index file?
You can make the index with e.g. command line samtools
samtools index aln.bam
, the bam file needs to be sorted by position in order to be indexed (samtools sort aln.bam > sorted.bam
). You can also sort and index within Java/htsdjk (I would have to look this up though.)l'm using SamReader query method, but it doesn't work. Nothing print. l was wondering if you have this problem. My code:
hi, l use SamReader query method, it doesn't work, l don't know why. My htsjdk version is 2.8.1.
Are you sure the interval chr3:179198825-179199177 contains some reads? Make sure also that "chr3" is correct, maybe your chromosome is named "3". What is the output of
samtools view in.bam chr3:179198825-179199177
?l am pretty sure, 'chr3' is right and the interval is also right. 'samtools view in.bam chr3:179198825-179199177' could print the results.