Optimized Search-and-Compute Circuits and their Application to Query Evaluation on Encrypted Data

0
712
Optimized Search-and-Compute Circuits and Their Application to Query Evaluation on Encrypted Data

Optimized Search-and-Compute Circuits and their Application to Query Evaluation on Encrypted Data

Abstract

Optimized Search-and-Compute Circuits and their Application to Query Evaluation on Encrypted Data,Private query processing on encrypted databases allows users to obtain data from encrypted databases in such a way that the users’ sensitive data will be protected from exposure. Given an encrypted database, users typically submit queries similar to the following examples: 1) How many employees in an organization make over U.S. $100000? 2) What is the average age of factory workers suffering from leukemia? Answering the questions requires one to search and then compute over the relevant encrypted data sets in sequence. In this paper, we are interested in efficiently processing queries that require both operations to be performed on fully encrypted databases. One immediate solution is to use several special-purpose encryption schemes simultaneously; however, this approach is associated with a high computational cost for maintaining multiple encryption contexts. Another solution is to use a privacy homomorphic scheme. However, no secure solutions have been developed that satisfy the efficiency requirements. In this paper, we construct a unified framework to efficiently and privately process queries with search and compute operations. For this purpose, the first part of our work involves devising several underlying circuits as primitives for queries on encrypted data. Second, we apply two optimization techniques to improve the efficiency of these circuit primitives. One technique involves exploiting single-instruction-multiple-data (SIMD) techniques to accelerate the basic circuit operations. Unlike general SIMD approaches, our SIMD implementation can be applied even to a single basic operation. The other technique is to use a large integer ring (e.g., Z2t) as a message space rather than a binary field. Even for an integer of k bits with k > t, addition can be performed using degree 1 circuits with lazy carry operations. Finally, we present various experiments performed by varying the considered parameters, such as the query type and the number of tuples.
 

Introduction

PRIVACY homomorphism is an important concept for encrypting clear data while allowing one to perform operations on encrypted data without decryption. This concept was first introduced by Rivest et al. [1] and, much later, affirmed by Feigenbaum and Merritt’s question [2]: Is there an encryption function E(·) such that both E(x+y) and E(x·y) can be easily computed from E(x) and E(y)? Since that time, very little progress was made in determining whether such efficient and secure encryption schemes exist until 2009, when Gentry constructed such an encryption scheme [3]. Although the use of Gentry’s scheme and other fully homomorphic encryption (HE) schemes (e.g., [4]–[6]) theoretically allows for the secure evaluation of any function, the evaluation cost is still far frombeing practicalfor manyfunctions.Among the various important functions, we restrict our interest to a set of functions for databases, giving rise to the following question: Given a set of fully encrypted databases, can we construct a set of efficient functions to process queries over those encrypted databases? If so,is the computational cost of these functions acceptable? This question is the starting point of this work. To facilitate a better understanding of our approach, we would like to describe the motivation for our work from a different perspective. At present, the simplest way to search for records that satisfy a particular condition over encrypted databases is likely via searchable encryption .

However, privately processing sum and avg aggregation queries under the same conditions is performed using homomorphic encryption . Thus, the private processing of a query that includes both matching conditions and aggregate operations requires the use of two distinct encryption techniques in parallel, i.e., searchable encryption and homomorphic encryption. Recently, Ada Popa et al. proposed CryptDB and its extension , which can process general types of database queries using layers of different encryption schemes: deterministic encryption for equality condition queries, orderpreserving encryption (OPE) for range queries, and HE for aggregate queries. The disadvantage of their work is that in the long run, its privacy degrades to the lowest level of data privacy provided by the weakest encryption scheme. This observation leads to a natural question: Can we construct a solution to efficiently perform such a database query without maintaining multiple contexts of encryption? At first glance, HE schemes appear to perfectly satisfy the requirement of processing queries on encrypted databases within a single encryption context. However, no solutions exist for expressing and processing various queries on fully encrypted databases in an efficient manner.

Our main results can be summarized as follows:

• A general framework for private query evaluation: We provide a common platform to allow database users to work on a single underlying cryptosystem, to represent their queries as functions in a conceptually simple manner, and to efficiently perform these functions on fully encrypted databases.

• Optimization of circuits and their applications to compact expressions of queries: The foundation of our simple framework is a set of optimized circuits for the following operations: equality, greater-than comparison and integer addition. We call these operations circuit primitives. Our optimizations of circuit primitives are developed such that they minimize the circuit depth and the number of homomorphic operations. For this purpose, we make extensive use of single-instructionmultiple-data (SIMD) techniques to move data across plaintext slots. In general, SIMD technology allows basic operations to be performed on several data elements in parallel. In contrast, our proposal operates on packed ciphertexts of several data elements and thus enables the efficiency of the basic operations of the circuit primitives to be improved. Furthermore, we find that all circuit primitives have O(logμ) depth for μ-bit data. We then express more complicated queries using a combination of the optimized circuit primitives. The resulting query functions are conceptually simpler than other representations of database queries and are compact in the sense that retrieval queries requireat mostO(logμ) depth.

• Further enhancement of the performance of query evaluation: HE schemes typically use Z2 as a message space, meaning that their encryption algorithm encrypts each bit of a message. Although our circuit primitives function efficiently on bit encryptions, we can achieve further improvements by adopting a large integer ring (e.g., Z2t), particularly in the case of computing on encrypted numerical data. Even for an integer of k bits with k > t, addition can be performed using degree 1 circuits by processing lazy carry operations. Although this rectification requires modificationof our circuit primitives, we can preserve their optimality through SIMD operations. In other words, search-and-compute queries can be processed using only O(logμ)-depth circuits.

• Experimental support: We perform comprehensive experiments to evaluate the performance of various queries expressed using our techniques from both a theoretical and a practical perspective.

Compared with the preliminary version of this paper, this journal version now includes the following new results.

1) Additional circuit compositions to support more query functionalities. Consequently, we can show how to handle an SQL query with multiple conditions, such as equality or/and greater-than conditions.

2) Improvements in the experimental studies achieved throughthe use of carefully selected parameters. We first investigate how the length of the plaintext modulus affects the variation in depth resulting from homomorphic evaluations. We then demonstrate how to devise our circuit primitives based on this observation.

3) Performance results from various perspectives. The weakest element of our conference paper was the lack of heuristic evidence to support the usefulness of our techniques. In this work, we present various types of experiments on different sets of possible DB queries by not only combining fine-tuned circuits but also by varying the parameters, such as the number of plaintext slots and the bit length of the plaintext space.

Informal Description of Our Strategy Assuming a database consisting of N blocks, i.e., R1  R2  ···  RN, to encrypt the record Ri, a DB user prepares a pair of public/private keys (pk,sk) for an HE scheme and publishes the public key to a DB server. The users store their encrypted records ¯ Ri =Epk(Ri) for 1≤i≤N in the same manner as for normal write queries (e.g., using the insert−into statement). Suppose that a user wishes to submit a retrieval query Q to the DB server. Prior to being submitted, the query Q must be properly pre-processed such that all clear messages, such as constant values, are encrypted under the public key pk. We denote this transformed query by ¯ Q. Upon receiving ¯ Q, the DB server compiles it into ¯ Q∗ by applying our techniques. The reader can consider a dedicated module for performing this task.1 Hereafter, we call this module a Private Search-and-Compute (PSnC) processor. Next, the DB server homomorphically evaluates ¯ Q∗ over the fully encrypted databases and returns the resulting ciphertexts to the user. The DB user can decrypt the output using his private key sk while receiving no additional data except for the records that satisfy the where conditions. Figure 1 graphically illustrates the high-level architecture of our approach. Organization: The remainder of this paper is structured as follows. In Section II, we briefly review the BGV-type

 

Proposed System

  • In this system, we provide a common platform to allow database users to work on a single underlying cryptosystem, to represent their queries as functions in a conceptually simple manner, and to efficiently perform these functions on fully encrypted databases.
  • The foundation of our simple framework is a set of optimized circuits for the following operations: equality, greater-than comparison and integer addition. We call these operations circuit primitives.
  • Our optimizations of circuit primitives are developed such that they minimize the circuit depth and the number of homomorphic operations. For this purpose, we make extensive use of single-instruction multiple- data (SIMD) techniques to move data across plaintext slots. In general, SIMD technology allows basic operations to be performed on several data elements in parallel. In contrast, our proposal operates on packed ciphertexts of several data elements and thus enables the efficiency of the basic operations of the circuit primitives to be improved.