Review of Marginal Release

This part is a summary of the paper:

’14 SIGMOD PriView: Practical Differentially Private Release of Marginal Contingency Tables

1. A Brief Introduce of Marginal Table

For a database table, a full contingency table is a distribution table that contains all the attributes combinations. A marginal table is a distribution table that contains part of attributes combinations. And a marginal table can be calculated from a full contingency table by summing up all the corresponding attributes. The parameter k of the k-way marginal table depends on the number of the attributes combinations.


The picture above is from the talk given by Zhikun Zhang of ’18 CCS.https://www.youtube.com/watch?v=2yG4eIdOev0

2. Problem definition

For a dataset D that contains d binary attributes and a positive integer k(k<d), we want to reconstrute D’, a synopsis of D, through differential privately mechanism. If one wants to analyze the  relationships between k attributes in D, he can construct the k-way marginal table from D’.

The method should satisfy these requirements:

  • Privacy goal. The mechanism should meet differential privacy.
  • Utility goal. The constructed k-way marginal table should be close to the original data.
  • Efficiency requirement. Low time and space complexity.
3. Existing approaches and their limitation
The Flat Method

The Flat Method is a basic approach that one utilizes the Laplace mechanism to add noise to the full contingency table of the dataset. Then he can construct a new noisy version full contingency table according to the original full contingency table. This approach is suitable for the situation when d is small. When d increases, the time complexity and the space complexity increase, the accuracy of the constructed full contingency table decreases.

The Direct Method

The Direct Method is that one utilizes the Laplace mechanism to add noise to the k-way marginal table. Then he can construct a noisy version k-way marginal table. This approach is suitable for the situation when k is small. The time complexity and the space complexity increase slower than the Flat Method when d increases.

Adding Noise in the Fourier Domain

This approach is that one adds noise to the Fourier domain of a full contingency table. The advantage of this approach is that there is a one-to-one correspondence between full contingency tables and the Fourier coefficients. However, it is unfeasible to compute a full contingency table when d is large by utilizing linear programming.

Data Cubes

This approach is that one constructs a set of marginal tables which cover all query marginal tables and then tries to make them consistent. But this approach caused large time complexity.

The Matrix Mechanism

This approach is that one utilizes the Laplace mechanism to add noise to the alternative set of queries according to the given workload. The main limitation of this approach is to find the best possible strategy queries in low time complexity.

 


This part a the summary of the paper:
CALM: Consistent Adaptive Local Marginal for Marginal Release under Local Differential Privacy

1.Marginal Table

For a dataset which has d attributes \(\mathbb{A}=\{a_1, a_2, \cdots, a_d\}\), each attribute \(a_i\) has \(c_i\) values and each user has one vaule for each attribute.

  • \(M_A\): k-way marginal. Given a set of attributes \(A\), \(A\subseteq\mathbb{A}\),we denote \(V_A=\{\langle V_1, V_2, \cdots, v_d\rangle:V_i\in[c_i] \text{ if } a_i\in A, \text{ otherwise } v_i=*\}\)

Leave a Reply

Your email address will not be published. Required fields are marked *