#required:
- oracle java8
- the htsjdk library
jjs -cp path/to//htsjdk/dist/htsjdk-2.0.1.jar biostar201139.js -- "/path/to/input.bam"
I have a sam file. I want to select the reads with the best MAPQ (mapping quality).
Firstly, for example, I can have three reads with the same name but these have different MAPQ value, therefore I want to exclude the two reads with less MAPQ value and select the read with the best MAPQ.
Second, if there are three reads with the same name and the same number of MAPQ I want to exclude these three reads.
Thanks
For the first one you can more simply exclude secondary alignments (bit 256 in the flag, see -f and -F).
For the second requirement, you'll either need to know the range of possible MAPQ values where this can occur for the aligner you're using and exclude them or write a script to do this (e.g., in python with pysam).
For fun, using nashorn (the java-based javascript engine) and htsjdk:
var SAMFileWriterFactory = Java.type('htsjdk.samtools.SAMFileWriterFactory'); | |
var inputFile= new java.io.File(arguments[0]); | |
var samReader = Java.type('htsjdk.samtools.SamReaderFactory').makeDefault().open(inputFile); | |
var samWriter = new SAMFileWriterFactory().makeSAMWriter(samReader.getFileHeader(),true,java.lang.System.out); | |
var iter = samReader.iterator(); | |
var array=[]; | |
for(;;) | |
{ | |
var rec = null; | |
if(iter.hasNext()) rec=iter.next(); | |
if(rec==null || (array.length>0 && array[0].getReadName()!=rec.getReadName())) | |
{ | |
if(array.length>0) { | |
var lowest_index =-1,n_lowest=0; | |
for(var i in array) { | |
var rec2= array[i]; | |
if(lowest_index ==-1 || rec2.getMappingQuality() < array[lowest_index].getMappingQuality() ) | |
{ | |
lowest_index=i; | |
n_lowest=1; | |
} | |
else if(rec2.getMappingQuality() == array[lowest_index].getMappingQuality() ) | |
{ | |
n_lowest++; | |
} | |
} | |
if( n_lowest!=-1 && n_lowest<3) | |
{ | |
samWriter.addAlignment(array[lowest_index]); | |
} | |
} | |
if(rec==null) break; | |
array=[]; | |
} | |
array.push(rec); | |
} | |
samReader.close(); | |
samWriter.close(); |
(not tested, It needs real data... )
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Thanks Devon Can I do it in shell (awk, perl)? Are there any way to do it using samtools?
For the second requirement you can undoubtedly do that with perl. If the file is name sorted then you might be able to put something together with awk, but it'd be more trouble than it's worth. Stick to perl if that's what you know (or play with the code that Pierre posted if you're OK with javascript).