Reconfigurable processing architectures for stream processing and hybrid computing

Date of Completion

January 2011


Engineering, Computer|Engineering, Electronics and Electrical




High-performance computing systems are often limited by the performance of their storage systems and the rate at which those systems can deliver data. In stream processing, Active Storage Networks (ASNs) offer an opportunity to improve both storage and computational performance by offloading some computation to the network switch. Data processing in a distributed system often requires the data to be aggregated at a single client before the operation can be performed. Implementing this processing in the interconnection network, which has a global view of the data, could speed up the application. An ASN is built around an intelligent network switch that processes data as it flows through the storage area network from storage nodes to client nodes. We propose an approach for performing transformation and reduction operations in an intelligent network switch composed of FPGAs. A low-cost, non-blocking, 2-dilated flattened butterfly interconnection network is chosen for the prototype implementation of the ASN. Common data processing applications, namely data sort, data search, k-min/max, and K-means clustering, have been implemented on the switching elements of this network. The scalability of the ASN for data processing applications is evaluated by applying functional and data-parallel techniques to the K-means clustering problem. The implementations show that in-network processing in an ASN greatly improves performance.

In the other part of our work, we focus on operating system support for dynamic reconfiguration of FPGAs, so that tasks can be offloaded to the FPGA. Operating system support for HW/SW co-design is in its infancy and faces several challenges before it can deliver practical benefits. Open issues in hybrid computing include resource management across heterogeneous multi-cores, data communication, and error recovery. We have built a prototype reconfigurable system that can offload tasks from a processor to the reconfigurable core. We also developed several scheduling algorithms for resource allocation among HW and SW computing kernels and analyzed the performance trade-offs of these algorithms.
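
The in-network reduction idea from the first part of the abstract can be illustrated with a small software sketch of the k-min operation. This is not the FPGA implementation described in the work: the function names (`k_min_local`, `k_min_switch`, `k_min_in_network`), the `fan_in` parameter, and the simple tree of merge stages are all assumptions made for illustration, and the tree only stands in for the flattened butterfly topology. The point it shows is that each switching stage forwards at most k values, so the client never has to aggregate the full data set.

```python
import heapq
from typing import Iterable, List


def k_min_local(stream: Iterable[int], k: int) -> List[int]:
    """Reduce one storage node's stream to its k smallest values, ascending."""
    return heapq.nsmallest(k, stream)


def k_min_switch(partials: List[List[int]], k: int) -> List[int]:
    """Hypothetical switching element: merge sorted partial results
    from its input ports and forward only the k smallest values."""
    merged = heapq.merge(*partials)
    return [value for _, value in zip(range(k), merged)]


def k_min_in_network(storage_nodes: List[Iterable[int]], k: int,
                     fan_in: int = 2) -> List[int]:
    """Reduce all streams through successive merge stages (assumed tree)."""
    level = [k_min_local(stream, k) for stream in storage_nodes]
    if not level:
        return []
    while len(level) > 1:
        # Each stage groups fan_in partial results per switching element.
        level = [k_min_switch(level[i:i + fan_in], k)
                 for i in range(0, len(level), fan_in)]
    return level[0]


if __name__ == "__main__":
    nodes = [[9, 3, 7], [4, 8, 1], [6, 2, 5], [10, 0, 11]]
    print(k_min_in_network(nodes, k=3))
```

With four storage nodes and k = 3, each merge stage in this sketch passes along only three values per link, whereas aggregating at the client would move all twelve.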