Membership inference attacks in generative AI

Vatsal Shah

Intro

Membership inference attacks threaten the privacy of machine learning training data by attempting to deduce whether a given data instance was included in a model's training set. By analyzing a model's outputs and confidence scores for a target example, especially an outlier, an adversary can estimate whether that data point was likely used during training. In sensitive applications, including generative models, revealing that an individual's data was part of the training set can itself be a privacy violation.

Federated learning helps mitigate such risks by training models on decentralized data that remains local to user devices. This approach allows useful machine learning models to be built without requiring personal data to be centralized or shared.

Federated learning methods and infrastructure are helping to make generative AI both more accessible and more privacy-preserving.

This technical blog post will provide an overview of how membership inference attacks against generative models work and discuss how federated learning defends privacy through its distributed approach. We explore remaining challenges, opportunities around optimization, and the need for standards and responsible practices. Overall, federated learning shows potential for developing generative AI if implemented ethically and with comprehensive protections.

The Privacy Threat: Membership Inference Attacks

Membership inference attacks aim to deduce whether a target data instance x was included in the training set D of a machine learning model f. By analyzing f's outputs and confidence scores for x, especially when x is an outlier, an adversary can make probabilistic inferences about x's membership in D. In sensitive applications, where individuals' participation in model training could expose them to harm, such inferences are themselves a privacy violation.
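
To make this concrete, here is a minimal confidence-threshold attack against a classifier, written as a sketch rather than a reproduction of any specific attack from the references; the model f, the threshold tau, and the toy usage at the bottom are hypothetical stand-ins.

```python
import numpy as np

def confidence_threshold_attack(f, x, y, tau=0.9):
    """Guess membership from the model's confidence on the true label.

    f   : callable returning a class-probability vector for an input x
    y   : index of the target example's true label
    tau : confidence above which we guess "member"

    Overfitted models tend to be more confident on examples they were
    trained on, which is the signal this simple attack exploits.
    """
    probs = f(x)
    return probs[y] >= tau  # True means "x was likely in the training set D"

# Hypothetical usage with a toy stand-in for a deployed model f:
toy_model = lambda x: np.array([0.02, 0.95, 0.03])
print(confidence_threshold_attack(toy_model, x=None, y=1))  # True
```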

How They Work Against Generative Models

Generative models learn representations of training data to synthesize new samples. Researchers have shown attackers can generate many samples from these models and analyze them to deduce details about the private training data.

Using Bayes' rule, the probability that a target x was not in the training set, given that the generator's output for it looks implausible among its samples G, can be expressed as:

P(target ∉ D | g(x;θg) is implausible in G) = P(g(x;θg) is implausible in G | target ∉ D) × P(target ∉ D) / P(g(x;θg) is implausible in G) [1]

The membership probability P(target ∈ D | g, G, x) is then one minus this quantity.

Here g is a generative model trained on D, G is a set of samples it produces, and g(x;θg) denotes the output obtained when the attacker steers generation toward the target x. If g(x;θg) appears anomalous relative to G, that suggests x was unlike anything g learned to represent, i.e., x was likely not in D; conversely, outputs that reproduce x plausibly point toward membership. Attackers probe this by varying the noise z or the hyperparameters θg when generating for x. If the outputs remain implausible across many trials, that is stronger evidence that x was absent from the data g learned to represent.
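
As a worked illustration of the formula, the short Python sketch below plugs assumed probabilities into Bayes' rule; the specific numbers are invented for illustration and are not measurements from [1].

```python
def posterior_not_member(p_implausible_given_nonmember,
                         p_implausible_given_member,
                         prior_nonmember):
    """Bayes' rule for P(target not in D | generator output for x is implausible)."""
    prior_member = 1.0 - prior_nonmember
    # Total probability of observing an implausible output for the target.
    p_implausible = (p_implausible_given_nonmember * prior_nonmember
                     + p_implausible_given_member * prior_member)
    return p_implausible_given_nonmember * prior_nonmember / p_implausible

# Invented numbers: implausible outputs are common for non-members (0.8),
# rare for members (0.1), and the prior on non-membership is 0.5.
p_out = posterior_not_member(0.8, 0.1, 0.5)
print(f"P(target not in D | implausible) = {p_out:.2f}")      # 0.89
print(f"P(target in D | implausible)     = {1 - p_out:.2f}")  # 0.11
```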

For example, a facial model trained only on human faces may render animals implausibly. If a particular human face elicits similarly implausible outputs, the model probably never saw that face, meaning it likely was not in the private training data, while faces the model reproduces convincingly are more likely members. Consistently unrealistic outputs across many samples provide stronger evidence than a single isolated instance, since any one sample can be poor merely due to limitations of the data or the learned representation. Adversaries therefore query diverse models and targets to reduce false inferences caused by insufficient data or unlucky sampling. Without direct access to the model or an inversion of it, certainty remains out of reach. Implausibility metrics help adversaries gauge systematically whether a target looks like unseen data, but their definitions vary with the application and the adversary's goals.
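
One simple implausibility metric, assumed here purely for illustration, is the distance from the target to the nearest sample the generator can produce: if the generator can nearly reproduce the target, that weakly suggests membership. The sketch below implements that idea with a hypothetical generator and threshold; real attacks use more refined metrics and far more careful calibration.

```python
import numpy as np

def implausibility(target, generated_samples):
    """Distance from the target to its nearest generated sample.

    A small value means the generator can (almost) reproduce the target,
    which is weak evidence that the target was in the training set D.
    """
    diffs = generated_samples - target               # shape: (n_samples, n_features)
    return float(np.min(np.linalg.norm(diffs, axis=1)))

def infer_membership(target, generate, n_samples=10_000, threshold=1.0):
    """Sample the generator many times and threshold the implausibility score."""
    samples = generate(n_samples)                    # shape: (n_samples, n_features)
    return implausibility(target, samples) <= threshold

# Hypothetical generator: a stand-in that just returns random 32-dimensional vectors.
rng = np.random.default_rng(0)
fake_generator = lambda n: rng.normal(size=(n, 32))
print(infer_membership(rng.normal(size=32), fake_generator))  # likely False here
```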

Generative AI enables customized data and services but also poses risks to privacy and trust if misused.

Examples of Vulnerable Models: Attacking MLaaS Platforms

Membership inference also threatens machine learning as a service (MLaaS) platforms, where models are trained on pooled data from many clients. Prior work has shown that a facial generative model trained on such a platform can leak private details about D through its generated samples, with attack accuracy exceeding 90% in identifying members of D [2], demonstrating the serious privacy risks of pooled training.

Sensitive Use Cases at Risk

Sensitive domains like healthcare, finance, and education face serious privacy risks if membership inference compromises their machine learning models.

Healthcare organizations using generative ML for applications like medical imaging analysis or diagnosis risk patient re-identification if models leak membership details. A neural network trained on chest X-ray data could be vulnerable to membership inference, exposing patients' conditions.

Financial firms applying generative AI to applications such as fraud detection also face risks, as malicious actors could determine whether high-value account details were likely part of the training data. A model trained to detect illegitimate transactions could leak private customer data through membership inference.

The Federated Learning Approach

Federated learning enables the development of machine learning models without requiring the centralized aggregation of sensitive data. The approach trains models on decentralized data that remains local to each user or device, with only model updates shared—not raw data. This provides privacy benefits over pooled data while allowing useful global models to be built. However, responsible development demands systematically addressing risks around access and use at each node. Success depends on governance and safeguards—not technique alone.

Decentralized Training on Local Data

In federated learning, training data remains in decentralized silos: each participant updates a local replica of the model and submits compressed updates to a central server. The server aggregates these updates into a shared global model that reflects patterns across the decentralized data. Sensitive details stay local rather than flowing into a central pool, although the data is still used for local training and updating, which leaves some residual risks around use and access.
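
A minimal sketch of one such training round, in the spirit of federated averaging: the model is represented as a NumPy parameter vector, the gradient computation is a hypothetical stand-in, and compression, secure aggregation, and real optimization are omitted.

```python
import numpy as np

def fake_gradient(weights, data):
    """Hypothetical stand-in for a real gradient: pull weights toward the local data mean."""
    return weights - data.mean(axis=0)

def local_update(global_weights, local_data, lr=0.1):
    """Stand-in for local training: returns updated weights, never the raw data."""
    return global_weights - lr * fake_gradient(global_weights, local_data)

def federated_round(global_weights, client_datasets):
    """One round: each silo trains locally, the server averages the resulting models."""
    updates = [local_update(global_weights, data) for data in client_datasets]
    sizes = np.array([len(data) for data in client_datasets], dtype=float)
    # Weighted average of client models; only parameters cross the network.
    return np.average(np.stack(updates), axis=0, weights=sizes / sizes.sum())

# Illustrative run: three "hospitals" with private 8-dimensional data.
rng = np.random.default_rng(0)
clients = [rng.normal(loc=i, size=(100, 8)) for i in range(3)]
weights = np.zeros(8)
for _ in range(20):
    weights = federated_round(weights, clients)
print(weights.round(2))  # drifts toward the mean across all silos (about 1.0 per dimension)
```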

For example, hospitals could train local X-ray anomaly-detection models on their private records and send only updates to build a global model for diagnosing patients at any facility. Because updates rather than raw data are shared, privacy risks to patients are mitigated, though protection still depends on the policies and access controls governing the local training data. Success requires systematically addressing the risks of use and management at each node.

Privacy Benefits of Local Data

Federated learning aims for useful global models built from decentralized data through sharing model updates rather than raw data. By avoiding a central pool of sensitive details, threats like membership inference or re-identification from aggregated data are mitigated.

Conclusion

In summary, membership inference poses a serious privacy risk for machine learning if left unaddressed, as adversaries can exploit models to deduce whether a target's data was likely used in training. Sensitive domains face disproportionate threats and need solutions that balance accuracy with privacy. Federated learning shows promise by training models on decentralized data and sharing updates rather than raw details. While it offers mechanisms to share insights from dispersed data, its success depends on rigor and cooperation, treating privacy as a matter of equity rather than an obstacle.

References: 

[1] R. Shokri, M. Stronati, C. Song, and V. Shmatikov, "Membership Inference Attacks Against Machine Learning Models," 2016.
[2] C. A. Choquette-Choo, F. Tramèr, N. Carlini, and N. Papernot, "Label-Only Membership Inference Attacks," 2020.
[3] B. van Breugel, H. Sun, Z. Qian, and M. van der Schaar, "Membership Inference Attacks Against Synthetic Data Through Overfitting Detection," 2023. https://arxiv.org/abs/2302.12580
[4] K. S. Liu, C. Xiao, B. Li, and J. Gao, "Performing Co-membership Attacks Against Deep Generative Models," 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 2019, pp. 459-467, doi: 10.1109/ICDM.2019.00056.
[5] C. Park, Y. Kim, J.-G. Park, D. Hong, and C. Seo, "Evaluating Differentially Private Generative Adversarial Networks Over Membership Inference Attack," IEEE Access, vol. 9, pp. 167412-167425, 2021, doi: 10.1109/ACCESS.2021.3137278.
