Pipe the output to /dev/null and the overall statistics (# reads mapped) will show up on stderr. You can build your own indices from whatever fasta files you make (promoters-only, etc.).
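If you drive the aligner from a script rather than a shell, the same idea (discard stdout, keep the stderr summary) can be sketched in Python. This is only an illustration: `run_and_get_stderr` is a hypothetical helper name, and the child process below is a stand-in for the real aligner command, which you would substitute yourself.

```python
import subprocess
import sys


def run_and_get_stderr(cmd):
    """Run cmd, discard its stdout, and return what it wrote to stderr."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,  # same effect as `> /dev/null` in a shell
        stderr=subprocess.PIPE,     # keep the summary statistics
        text=True,
        check=True,
    )
    return result.stderr


if __name__ == "__main__":
    # Stand-in for an aligner (an assumption for the demo): it writes
    # alignment records to stdout and a summary line to stderr.
    child_code = (
        "import sys; "
        "print('alignment records'); "
        "print('# reads mapped: 1', file=sys.stderr)"
    )
    print(run_and_get_stderr([sys.executable, "-c", child_code]), end="")
```

Only the summary line reaches your script; the alignment records are never stored.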
1. If your reference is promoter-only, you can write a simple hash-table-based mapper by hashing every k-mer in your small reference. Depending on your read lengths and genome size, this may be the fastest solution.
2. Eland version 1. It reports the number of exact hits. It does not work with reads longer than 32 bp, though.
3. BWA. It gives you the number of best hits in the X0 tag. One caveat is that contigs are concatenated into a single sequence. You may need to append, say, 1000 A's to the end of each contig if your contigs are too small. BWA will not be very efficient, but that probably does not matter unless you have a huge data set.
4. BWA fastmap. It will be faster than BWA because it only considers partial exact matches, but the speed is still not optimal, since what you want is full-length exact matches only. You still need to append a long run of A's to avoid cross-contig matches.
5. Fermi exact. With BWA and fastmap, you need to re-index the reference every time you change it. "Fermi exact" indexes your reads first and then maps the reference sequence against the read index. If your reference genome changes frequently, fermi exact may be more convenient. It is also possible to index the genome with fermi. You won't have the concatenated-contig problem, but it will be much slower than BWA fastmap.
6. Bowtie default. As I remember, bowtie by default gives you the count on at least one strand of the reference. I forget whether it gives the count for both strands. Bowtie also has a similar problem to BWA: contigs are internally concatenated. Naively, I think there is no fast solution. Its count is probably also inaccurate occasionally.
7. Bowtie -a. This asks bowtie to output all hits. It is not recommended unless you are working on small data sets. For all FM-index aligners, reporting the positions of all hits makes them much slower. Also, as I remember, an early version of bowtie reported fewer hits than Eland and BWA (these two agreed). I have not done a similar experiment with more recent versions.
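The hash-table mapper suggested above for a small, promoter-only reference could look roughly like the Python sketch below. It is a minimal illustration, not a tuned implementation: it assumes every read has the same length k, that the reference fits in memory as a dict of name to sequence, and that hits on both strands should be counted. The function names and the toy sequences are made up for the example.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to a list of (contig, 0-based start).

    `reference` is a dict of contig name -> sequence. Contigs are indexed
    separately, so no k-mer ever spans two contigs.
    """
    index = defaultdict(list)
    for name, seq in reference.items():
        seq = seq.upper()
        for start in range(len(seq) - k + 1):
            index[seq[start:start + k]].append((name, start))
    return index


def count_exact_hits(read, index, k):
    """Count full-length exact occurrences of `read` on either strand."""
    if len(read) != k:
        raise ValueError("this sketch assumes all reads have length k")
    read = read.upper()
    hits = len(index.get(read, ()))
    rc = reverse_complement(read)
    # A read equal to its own reverse complement hits the same loci on both
    # strands; count those once (a design choice of this sketch).
    if rc != read:
        hits += len(index.get(rc, ()))
    return hits


if __name__ == "__main__":
    # Toy reference, purely illustrative.
    reference = {"promA": "ACGTTGCA", "promB": "TTGCAAC"}
    index = build_kmer_index(reference, k=5)
    for read in ("ACGTT", "TTGCA", "AAAAA"):
        print(read, count_exact_hits(read, index, k=5))
```

Memory grows with the number of k-mers in the reference, which is why this only makes sense when the reference is small.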
My recommendation is 4.
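The poly-A padding suggested for BWA and BWA fastmap has to be applied to the FASTA file before indexing. A minimal Python sketch of that preprocessing step follows; `pad_fasta` is a hypothetical helper, and 1000 is simply the pad length mentioned above, not a value the tools require.

```python
def pad_fasta(lines, pad_length=1000, pad_base="A"):
    """Yield FASTA lines with a run of `pad_base` appended to every record.

    The pad is emitted as an extra sequence line at the end of each contig,
    so coordinates of real hits within a contig are unchanged.
    """
    pad = pad_base * pad_length
    in_record = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if in_record:
                yield pad  # close the previous contig with the pad
            in_record = True
            yield line
        elif line:
            yield line
    if in_record:
        yield pad  # pad the last contig as well


if __name__ == "__main__":
    # Tiny in-memory example with a short pad so the output stays readable.
    fasta = [">contig1\n", "ACGT\n", ">contig2\n", "GGCC\n"]
    for out_line in pad_fasta(fasta, pad_length=5):
        print(out_line)
```

One thing to keep in mind with this approach: a read that is itself all A's would match the padding, so such hits may need to be filtered out afterwards.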
written 11.8 years ago by lh3 (updated 5.1 years ago by Ram)
+1 Very thorough response. I would add that Vmatch can be used for exact matches and might be a fast, easy alternative to #1 above. I don't know how it compares to the other tools for alignment, but for custom matching tasks it is a great tool. I'd avoid indexing the reads with Vmatch, though: that would create very large files and would not be the most efficient approach.
Pipe the output to /dev/null and the overall statistics (# reads mapped) will show up on stderr. You can build your own indices from whatever fasta files you make (promoters-only, etc.).
Thanks for teaching me about /dev/null! Just what I needed!