blob: 797de50a05e6177b38ea0400b4a32504a093b53a (
plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
|
---
title: "Private Contact Discovery"
date: 2019-06-22T10:30:00+08:00
---
Private Contact Discovery
=========================
Private Contact Discovery (PCD) is the formal name for the problem modern smartphone messenger applications have on
installation: Given a user's address book, find out which of their contacts also use the same messenger without the
messenger's servers learning anything about the user's address book. The widespread non-private way to do this is to
simply upload the user's address book to the app's operator's servers and do an SQL JOIN keyed on the phone number field
against the database of registered users. People have tried sprinkling some hashes over these phone numbers in an
attempt to improve privacy, but obviously running a brute-force preimage attack given a domain of maybe a few billion
valid inputs is not cryptographically hard.
Private Contact Discovery can be phrased in terms of Private Set Intersection (PSI), the cryptographic problem of having
two parties holding one set each find the intersection of their sets without disclosing any other information. PSI has
been an active field of research for a while and already yielded useful results for some use cases. Alas, none of those
results were truly practical yet for usage in PCD in a typical messenger application. They would require too much CPU
time or too much data to be transferred.
At USENIX Security 2019, Researchers from technical universities Graz and Darmstadt published a paper titled *Private
Contact Discovery at Scale*
(`eprint <https://eprint.iacr.org/2019/517>`__ | `PDF <https://eprint.iacr.org/2019/517.pdf>`__).
In this paper, they basically optimize the hell out of existing cryptographic solutions to private contact discovery,
jumping from a still-impractical state of the art right to practicality. Their scheme allows a client with 1k contacts
to run PCD against a server with 1B contacts in about 3s on a phone. The main disadvantage of their scheme is that it
requires the client to in advance download a compressed database of all users, that clocks in at about 1GB for 1B users.
I found this paper very interesting for its immediate practical applicability. As an excuse to dig into the topic some
more, I gave a short presentation at my university lab's research seminar on this paper
(slides: `PDF <mori_semi_psi_talk.pdf>`__ | `ODP <mori_semi_psi_talk.odp>`__).
Even if you're not working on secure communication systems on a day-to-day basis this paper might interest you. If
you're working with social account information of any kind I can highly recommend giving it a look. Not only might your
users benefit from improved privacy, but your company might be able to avoid a bunch of data protection and
accountability issues by simply not producing as much sensitive data in the first place.
|