If you think that none of them has been developed, don't hesitate to add an answer or a comment. I will accept an answer that explains the reason why.
Processing was never the bottleneck for Velvet (a 100M read assembly can take as little as 2 hours), so I don't think there would be much traction for that.
Memory is the killer, since that same assembly will most likely require a 256GB machine.
I am not sure whether FPGAs have ever been used to accelerate the specific assembly algorithms that you mention. However, FPGAs have been used to speed up the Burrows-Wheeler transform.
An FPGA-based parallel sorting architecture for the Burrows Wheeler transform
Burrows-Wheeler transform (BWT) has received special attention due to its effectiveness in lossless data compression algorithms. However, implementations of BWT-based algorithms have been limited due to the complexity of the suffix sorting process applied to the input string. Proposed solutions involve data structures combined with hardware architectures aimed at reducing computational complexity. However, advanced data structures are difficult to be implemented directly into hardware architectures as they require sophisticated control units. In this paper we present a novel architecture based on a parallel sorting block to implement the BWT transform. The proposed architecture has been implemented on a field programmable gate array (FPGA) device providing good performance improvements compared with other reported implementations on FPGAs. Results obtained show a reduction in the number of cycles and an increase in the maximum frequency compared with other works. FPGA implementation results are presented and discussed.
You can find the paper here if you have a subscription.
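For readers unfamiliar with why suffix sorting dominates the cost, here is a minimal software sketch of the transform. It is not the architecture from the paper: it uses a naive sort of all suffixes, and the function names `bwt` and `inverse_bwt` and the `$` end marker are my own choices for illustration.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform via suffix sorting (illustrative only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    # Sort suffix start positions by the suffix they begin. This sort is the
    # expensive step that hardware implementations try to parallelise.
    suffix_order = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT is the character preceding each sorted suffix (wrapping around).
    return "".join(s[i - 1] for i in suffix_order)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the BWT column to every row, then re-sort the rows.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original text is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

Slicing a fresh suffix for every comparison key makes this quadratic in memory and worse in time, so it is only usable on short strings. Real tools rely on dedicated suffix array construction algorithms instead.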
@LJJ: FPGAs have been used to accelerate BWT (I am aware of that), but the assembly algorithms use a modified form of BWT called the Burrows-Wheeler alignment method. It rests on the same foundations as BWT (suffix trees etc.) but is modified substantially to be useful for genome assembly. That is the part I could speed up by implementing it on an FPGA. Thanks for your response.
Convey has FPGA-accelerated versions of both Velvet and BWA. Application throughput on a Convey server is between 10x and 15x that of modern x86 servers. See http://www.conveycomputer.com/lifesciences/