Excited to announce that our paper, IVE, has been accepted at HPCA 2026!
Understanding and optimizing emerging applications is a recurring theme in my research group. Recently, we dove into single-server Private Information Retrieval (PIR) based on Homomorphic Encryption (HE) to find its “killer application.” While my tech-savvy students quickly grasped complex schemes like OnionPIR, the math was initially arcane to me. This led to fierce internal discussions in which I played the “devil's advocate”—a process that ultimately strengthened our work.
We identified that while batching requests is essential to amortize the cost of database scans in PIR, current CPUs and GPUs struggle to unleash the full potential of this approach due to compute and memory bottlenecks.
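To make the amortization concrete, here is a minimal plaintext sketch of the cost structure. In single-server HE-based PIR, the server essentially computes a dot product between the database and a client's (encrypted) one-hot selection vector, which requires scanning the whole database per request; batching lets one scan serve many requests. This toy keeps the selection vectors in the clear (no encryption) purely to illustrate the amortization—`serve_one`, `serve_batch`, and the data shown are hypothetical names, not the paper's implementation.

```python
# Toy plaintext analogue of single-server PIR (illustrative sketch only):
# real schemes homomorphically encrypt the selection vector; here it is
# left in the clear to show only why batching amortizes the scan cost.

def serve_one(database, selection):
    # One full database scan per request: every record is read once.
    return sum(record * sel for record, sel in zip(database, selection))

def serve_batch(database, selections):
    # One full database scan amortized across the whole batch: each
    # record is read from memory once and applied to every pending query.
    results = [0] * len(selections)
    for i, record in enumerate(database):
        for q, selection in enumerate(selections):
            results[q] += record * selection[i]
    return results

db = [10, 20, 30, 40]
queries = [[0, 1, 0, 0], [0, 0, 0, 1]]  # clients want records 1 and 3
assert serve_batch(db, queries) == [serve_one(db, q) for q in queries]
```

Serving the batch touches the database once instead of once per client, which is exactly the scan-amortization that the paper argues CPUs and GPUs cannot fully exploit.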
To address this, we propose IVE (an accelerator for batched PIR leveraging Versatile processing Elements). IVE features:
- A large on-chip scratchpad with an operation-scheduling algorithm that maximizes reuse of client-specific data;
- sysNTTU, a versatile functional unit that improves area efficiency without sacrificing performance;
- A heterogeneous memory system that scales linearly to larger databases without throughput degradation.
I look forward to seeing many of you in Sydney next February!
IVE: An Accelerator for Single-Server Private Information Retrieval Using Versatile Processing Elements
Sangpyo Kim, Hyesung Ji, Jongmin Kim, Wonseok Choi, Jaiyoung Park, and Jung Ho Ahn
Private information retrieval (PIR) is an essential cryptographic protocol for privacy-preserving applications, enabling a client to retrieve a record from a server's database without revealing which record was requested. Single-server PIR based on homomorphic encryption has gained particular attention for its ease of deployment and reduced trust assumptions. However, single-server PIR remains impractical due to its high computational and memory bandwidth demands. Specifically, reading the entirety of large databases from storage, such as SSDs, severely limits its performance. To address this, we propose IVE, an accelerator for single-server PIR with a systematic extension that enables practical retrieval from large databases using DRAM. Recent advances in DRAM capacity allow PIR for large databases to be served entirely from DRAM, removing its dependence on storage bandwidth. Although the memory bandwidth bottleneck remains, multi-client batching effectively amortizes database access costs across concurrent requests to improve throughput. However, client-specific data remains a bottleneck, whose bandwidth requirements ultimately limit performance. IVE overcomes this by employing a large on-chip scratchpad with an operation-scheduling algorithm that maximizes data reuse, further boosting throughput. Additionally, we introduce sysNTTU, a versatile functional unit that enhances area efficiency without sacrificing performance. We also propose a heterogeneous memory system architecture, which enables linear scaling of database sizes without throughput degradation. Consequently, IVE achieves up to 1,275× higher throughput compared to prior PIR hardware solutions.