Browsing by Subject "Asymptotics"
Now showing 1 - 1 of 1
- Results Per Page
- Sort Options
Item Data Permutation Recovery from Noisy Data: Error Probability and Privacy(2024-05) Jeong, MinohThis doctoral thesis investigates the data permutation recovery problem from noisy observations, a fundamental challenge at the intersection of data science and privacy-preserving technologies. The main question is: Given a noisy observation of data, according to which permutation was the original data sorted? This thesis addresses both the exact and approximate permutation recovery problems under various noise conditions. It begins by formulating the permutation recovery problem within the statistical hypothesis testing framework, and characterizes the linear regime, where the true permutation can be optimally estimated by only a linear transformation of the observation and a sorting operation. In particular, under Gaussian data and noise, this thesis derives the necessary and sufficient conditions for the linear regime in terms of the noise covariance matrix. Subsequently, this thesis shifts the focus to the examination of the error probability associated with linear decoders in the presence of Gaussian noise, but arbitrary data distribution. This analysis reveals the noise-dominated nature of the permutation recovery problem, illustrating how the error probability scales in both low- and high-noise regimes, and providing insights into the behavior of the error probability under different noise and data distributions. Specifically, the error probability of the permutation recovery problem grows linearly in the noise standard deviation σ in the low-noise regime, implying the noise-dominated nature of the problem. Advancing into the realm of data privacy, the thesis explores the private rank- ing recovery problem, aiming to establish fundamental trade-offs between estimation accuracy and privacy. Leveraging differential privacy metrics, it evaluates the effectiveness of various privacy-preserving mechanisms while ensuring the fidelity of the ranking recovery. Regarding the generalized permutation recovery problem, this thesis also proposes an approximate version of it. It demonstrates that this approximate version leads to an error probability having sub-linear behavior in σ in the low-noise regime, which is a notable difference with respect to the exact recovery. This finding highlights the potential of approximate recovery in enhancing the robustness and efficiency of permutation recovery. In summary, the thesis contributes significantly to our understanding of permutation recovery in noisy and privacy-sensitive environments. The thesis not only advances theoretical foundations but also provides practical strategies for tackling the challenges inherent in recovering data permutations, thereby offering valuable perspectives for advancements in data processing and privacy-preserving techniques.