Synthetic Data Generation for Supply Chain

Yunbo Long | Synthetic Data, Generative AI, Benchmarking

Synthetic data generation has emerged as a foundational enabler of supply chain AI research, addressing three persistent challenges: data scarcity, confidentiality, and the difficulty of constructing reproducible benchmarks. Real-world supply chain data is typically proprietary, fragmented across trading partners, and subject to strict regulatory and contractual constraints—creating a significant barrier for academic research and the fair comparison of AI methods.

Modern synthetic data techniques, ranging from generative adversarial networks (GANs) and variational autoencoders to diffusion models, conditional generative models, and agent-based simulation, can produce artificial datasets that approximate the statistical distributions, temporal dynamics, and network topology of real supply chains. These synthetic datasets enable researchers to train and evaluate models for demand forecasting, delay prediction, supplier link prediction, and risk analysis without directly sharing sensitive operational records. Generators can also be combined with differential privacy and secure multiparty computation, making synthetic data a natural complement to privacy-preserving learning paradigms such as federated learning.
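To make the idea of "preserving statistical distributions" concrete, the following is a minimal sketch rather than any of the methods named above: it swaps in a plain bivariate Gaussian fit as a stand-in for a learned generator such as a GAN or diffusion model. The two columns (order quantity and lead time in days), the toy "real" records, and the function names `fit_gaussian` and `sample_synthetic` are all assumptions made for illustration and do not come from the lab's datasets.

```python
import math
import random
import statistics


def fit_gaussian(records):
    """Estimate means, standard deviations, and correlation of two columns."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mu_x) * (y - mu_y) for x, y in records) / (len(records) - 1)
    return {"mu": (mu_x, mu_y), "sd": (sd_x, sd_y), "rho": cov / (sd_x * sd_y)}


def sample_synthetic(params, n, seed=0):
    """Draw n synthetic rows from the fitted bivariate Gaussian."""
    rng = random.Random(seed)
    (mu_x, mu_y), (sd_x, sd_y), rho = params["mu"], params["sd"], params["rho"]
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mu_x + sd_x * z1
        # Mixing z1 into y reproduces the fitted correlation between columns.
        y = mu_y + sd_y * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


# Toy stand-in for confidential records (hypothetical, not real data):
# lead time grows with order quantity, plus noise.
_rng = random.Random(42)
real = []
for _ in range(2000):
    quantity = _rng.gauss(100, 20)
    lead_time = 5 + 0.05 * quantity + _rng.gauss(0, 1)
    real.append((quantity, lead_time))

params = fit_gaussian(real)
synthetic = sample_synthetic(params, n=5000, seed=1)
synth_params = fit_gaussian(synthetic)

print("real rho:     ", round(params["rho"], 3))
print("synthetic rho:", round(synth_params["rho"], 3))
```

Only the fitted parameters, not the original rows, are needed to draw new samples, which is the basic property that richer generators share. A Gaussian fit captures means, variances, and linear correlation only; it does not model temporal dynamics or network topology, and it offers no formal privacy guarantee on its own.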

Research from the Supply Chain AI Lab at the University of Cambridge has contributed to this field through the release of open synthetic supply chain datasets and benchmarks—including networks for link prediction, shipment records for delay forecasting, and simulated procurement environments for agent-based research. These efforts are part of a broader community movement towards open, reproducible, and privacy-respecting supply chain AI.

We invite you to explore the curated collection of key publications below, offering insights into the methods and applications of synthetic data generation for supply chains.

List of Publications

  1. Xu, L., Proselkov, Y., Brintrup, A. and Long, Y., 2024. Synthetic supply chain datasets for benchmarking AI methods. IFAC-PapersOnLine, 58(19), pp.807-812.
  2. Kosasih, E.E. and Brintrup, A., 2022. A machine learning approach for predicting hidden links in supply chain with graph neural networks. International Journal of Production Research, 60(17), pp.5380-5393.
  3. Aziz, A., Kosasih, E.E., Griffiths, R.-R. and Brintrup, A., 2021. Data considerations in graph representation learning for supply chain networks. ICML 2021 Workshop on Machine Learning for Data.
  4. Brintrup, A., Wang, Y. and Tiwari, A., 2017. Supply networks as complex systems: A network-science-based characterization. IEEE Systems Journal, 11(4), pp.2170-2181.