Fuzzy Vault

Fuzzy Vault is an encryption scheme proposed by Juels and Sudan [1] which leverages some of the concepts of error-correcting codes, to encode information in such a way as to be difficult to obtain without the "key" used to encode it, even if the methods used for encoding are publicly known. Although this approach can work with any code, the Fuzzy Vault scheme is often used with Reed-Solomon codes, so we will focus on them for this exposition. A secret is encoded using a set of values (the "key"), and can then be unlocked with another set of values only if it has substantial overlap with the set used to lock it. This approach offers order invariance, meaning that the sets used to lock and unlock the secret are unordered. Because of this property, the Fuzzy Vault scheme has been seen application in biometric encryption [2,3], namely fingerprint authentication.

Table of Contents

  1. Overview
  2. Problem Statement
  3. LOCK Algorithm
  4. UNLOCK Algorithm
  5. Proof of $ \frac{t-k}{2} $-Completeness
  6. Proof of Soundness
  7. Fuzzy Vault in Biometrics
  8. See Also
  9. References

Reed-Solomon codes work by encoding the values in a message as the coefficients of a polynomial and then evaluating that polynomial at a set of points to obtain the codeword for that message. By using a number of evaluation points greater than the degree of the polynomial, it will be possible to obtain the polynomial (and therefore the message) by interpolation even in the presence of missing or erroneous values. However, if there are too many errors, there will not be a unique interpolating polynomial of the proper degree. These properties are leveraged in the Fuzzy Vault scheme.

Suppose that an entity Alice wishes to lock a secret $ S $ such that it will be secure even if an adversary has access to the locked secret. Alice has a set $ f $ with which to encode the secret, whose values are not know to the adversary. Alice can encode $ S $ in a polynomial $ p(x) $, and evaluate $ p $ at the values in the set $ f $. Thus, the values in the projected set $ p(f) $ lie on the polynomial. Alice then combines points of random noise, or chaff points, which do not lie on $ p(x) $, with the set of points in $ p(f) $ into an unordered set $ R $. The set $ f $ represents the points of $ R $ which lie on $ p $, and provided the cardinality of $ f $ is large enough, $ p $ (and therefore $ S $) can be determined from $ f $ by interpolation.

Now suppose that another entity Bob has a set $ f' $ of values, and wishes to unlock $ S $. If $ f' \cap f $ is large, then $ f' $ will contain many values whose corresponding points in $ R $ lie on $ p $. Even if $ f' $ does not agree with $ f $ everywhere, if they are sufficiently close, Bob can leverage the error correcting properties of Reed-Solomon codes to correctly recover $ S $. On the other hand, if $ f' $ and $ f $ have few values in common, then the points in $ R $ identified by $ f' $ will mostly be chaff points, and he will not be able to recover $ S $ through interpolation because many polynomials of the proper degree will be equally likely.

In the context of fingerprint authentication, fingerprint features, or minutiae, are detected and their locations and orientations are quantized. The set of all possible tuples of quantized location and orientation forms the alphabet of which a particular user's minutiae are a subset. However, the many variables involved in collecting fingerprint features (finger orientation, pressure on the sensor, etc.) make retrieving the exact same readings difficult, even from the same user. This is part of the motivation for using the Fuzzy Vault approach; if a significant portion of the minutiae can be retrieved from the user's fingerprint at each use, the error-correcting properties of Reed-Solomon codes and the order invariance property will enable the user to be correctly recognized even if their fingerprint is measured slightly differently each time.

This intuition for the Fuzzy Vault scheme motivates the following formalized problem statement.

Problem Statement
We have a secret $ S $, a $ k $-length vector of values from the finite field of size $ q $, $ \mathbb{F}_q^k $, and a set $ f \in {\mathbb{F}_q \choose t} $ such that $ f $ contains $ t $ distinct values from $ \mathbb{F}_q $. We desire a function LOCK: $ \mathbb{F}_q^k \times {\mathbb{F}_{q^2} \choose t} \rightarrow {\mathbb{F}_q \choose n} $ where $ t \le n $. Thus LOCK$ (S,f) \subseteq \mathbb{F}_q $ of size $ n $. We also want a function UNLOCK: $ {\mathbb{F}_q \choose t} \times {\mathbb{F}_{q^2} \choose n} \rightarrow \mathbb{F}_q^k $.

We want our LOCK and UNLOCK algorithms to have the following properties:
$ C $-completeness
Def: Let $ f,f' \in {\mathbb{F}_q \choose t} $ such that $ \mid f - f' \mid \le c < t $. Then UNLOCK$ (f', $LOCK$ (S,f)) = S $. This represents the number of errors that the scheme can tolerate; we prove later that the Fuzzy Vault scheme is $ \frac{t-k}{2} $-complete.
For any adversary with limited knowledge of $ f $, with high probability there will be exponentially many valid possibilities for $ S $ based on the adversary's $ f' $. This represents that if the set $ f' $ does not overlap with the correct $ f $ enough, the adversary will not be able to retrieve the secret $ S $.

LOCK Algorithm
The idea is to encode the secret $ S $ in polynomial $ p $, add $ t $ points from key set $ f $ to $ R $ as tuples with their projection onto the polynomial, then fill the remaining $ n-t $ points in $ R $ with points that are not on $ p $.

LOCK$ (S,f=\lbrace \alpha_1, \dots,\alpha_t \rbrace) $

  1. Initialize $ R $ and $ T $ to be $ \emptyset $
  2. Encode $ S = [s_0,s_1,\dots,s_{t-1}] $ in coefficients of polynomial $ p(x) = s_0 + s_1 x + \dots + s_{t-1} x^{t-1} $
  3. for $ i=1,\dots,t $
    $ T \leftarrow T \cup \lbrace \alpha_i \rbrace $
    $ R \leftarrow R \cup \lbrace (\alpha_i, p(\alpha_i)) \rbrace $
  4. for $ i = t+1,\dots,n $
    Pick $ \alpha_i $ randomly from $ \mathbb{F}_q \setminus T $
    Pick $ y_i $ randomly from $ \mathbb{F}_q \setminus \cup_{j=1}^i p(\alpha_j) $
    $ R \leftarrow R \cup \lbrace (\alpha_i, y_i) \rbrace $
  5. Output $ R $ after randomly permuting it

UNLOCK Algorithm
The idea is to take a locked secret, represented as a set of points $ R = \lbrace (z_i, y_i) \rbrace $ (the output of LOCK), and a set $ f ' = \lbrace \beta_1,\dots,\beta_t \rbrace $ of evaluation points. For each $ \beta_i $, retrieve the corresponding point from $ R $ (where $ \beta_i = z_i $); decode this set of points using the $ [t,k]_q $ Reed-Solomon Code. If there are fewer than $ \frac{n-k}{2} $ errors, the secret $ S $ can be correctly decoded; if not, then the Reed-Solomon decoding algorithm outputs NULL.

UNLOCK$ (f', R= \lbrace (z_i, y_i) \rbrace) $

  1. Initialize $ Q $ to be $ \emptyset $
  2. for $ i = 1,\dots,t $
    Find the $ j $ such that $ z_j = \beta_i $; set $ w_i \leftarrow y_j $
    $ Q \leftarrow Q \cup \lbrace (\beta_i, w_i) \rbrace $
  3. Output $ S' \leftarrow $ Decode Q with Reed-Solomon Decoding Algorithm

Proof that the Fuzzy Vault Scheme is $ \frac{t-k}{2} $-complete
We will show that the Fuzzy Vault Scheme is tolerant of up to $ \frac{t-k}{2} $ errors.

Proof: Given $ f' $ such that $ \mid f - f' \mid \le \frac{t-k}{2} $, then $ \mid f \cap f' \mid \ge \frac{t+k}{2} $. Also, assume $ R = \lbrace (x_i,y_i) \rbrace_{i=1}^n \leftarrow $ LOCK$ (s,f) $.
If $ \beta_i \in f \cap f' $, then $ \beta_i $ matches some $ \alpha_j \in f $ and $ z_i = p(\alpha_j) = p(\beta_i) $.
For at least $ \frac{t+k}{2} $ of $ \beta_i \in f' $, we know $ p(\beta_i) $.
Since we have less than $ \frac{t-k}{2} $ errors, the Reed-Solomon decoding algorithm will correctly output $ p $ (and therefore $ S $).

See Also
Reed-Solomon Error Correction
Fingerprint Recognition
Fuzzy Extractor

[1] A Fuzzy Vault Scheme - A. Juels and M. Sudan
[2] Fuzzy Vault for Fingerprints - U. Uludag et al.
[3] Fingerprint-based fuzzy vault: Implementation and performance - K. Nandakumar et al.