Delivery Delay Prediction


In this challenge, participants are invited to develop predictive models to classify deliveries as early, on-time, or delayed. A curated dataset will be provided, enabling teams to train, validate, and test their models on realistic supply chain scenarios.


Dataset

The dataset includes structured records with features relevant to delivery timing. Participants will use this data to build and refine models capable of accurately predicting delivery outcomes.

The dataset will be divided into two parts:

  • 80% for model training and validation, which is released to participants.
  • 20% reserved as the marking dataset for final scoring.

Note: Participating teams are provided with 80% of the dataset (already released) for training and development.
The remaining 20%, unseen by participants, will be withheld for final evaluation and scoring.
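
Because the marking data stays hidden, teams may want to hold out part of the released 80% to estimate their own score. The following is a minimal sketch of a stratified train/validation split, not an official procedure: the function name `stratified_split`, the 20% validation fraction, the seed, and the made-up labels are all assumptions for illustration. scikit-learn's `train_test_split` with its `stratify` argument offers similar behaviour.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx), splitting each class in roughly the same proportion."""
    rng = random.Random(seed)  # local generator, so the split is repeatable

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # At least one validation row per class (assumes each class has 2+ rows).
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Hypothetical labels standing in for the released training data.
labels = ["early"] * 10 + ["on-time"] * 30 + ["delayed"] * 10
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 40 10
```

Stratifying keeps the three categories in the same proportions in both parts, which matters for a macro-averaged metric where a rare class counts as much as a common one.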

Sample Dataset

👉 Click the link to access the sample dataset, representing approximately 10% of the final dataset.


Final Dataset for Competition

Important: You must register for this challenge before you can access the dataset.

👉 Click the link to access the FINAL dataset for competition.


Evaluation and Scoring

Submitted models will be evaluated using the Macro F1 Score, which takes the unweighted average of the per-class F1 scores (each combining precision and recall) across the three delivery categories: early, on-time, and delayed. The snippet below shows how the score is computed.

from sklearn.metrics import f1_score

# Assume y_true and y_pred are the actual and predicted labels, respectively.
f1_score(y_true, y_pred, average='macro')

Final rankings will be determined by performance on the marking dataset (20% of the full data), which is held back and unavailable during model development. This ensures a fair and unbiased assessment of each model’s generalisation ability.
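
To make the metric concrete, here is a small worked example in plain Python. It is a sketch for intuition only, not the organisers' scoring script: the helper `macro_f1` and the six made-up labels are assumptions, and a class with no true or predicted samples is scored as 0 here by choice.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # correctly predicted as c
        fp = sum(t != c and p == c for t, p in pairs)  # wrongly predicted as c
        fn = sum(t == c and p != c for t, p in pairs)  # missed instances of c
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); scored as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

classes = ["early", "on-time", "delayed"]
y_true = ["early", "on-time", "on-time", "delayed", "delayed", "on-time"]
y_pred = ["early", "on-time", "delayed", "delayed", "on-time", "on-time"]

# Per-class F1: early = 1.0, on-time = 2/3, delayed = 0.5
print(round(macro_f1(y_true, y_pred, classes), 4))  # 0.7222
```

Each class contributes equally to the average, so a model that ignores a less frequent category is penalised even if its overall accuracy looks high.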


Submission

Please compile all your solution files into a single .zip archive and submit it via email to scl.data.challenge@gmail.com. Make sure to include the word "Delay" in the subject line. The submission deadline is 5:00 PM (GMT+2) on 2 July 2025.
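
One way to build the archive is with Python's standard `zipfile` module. This is a minimal sketch under assumptions: the function name `make_submission_zip` and the folder and file names in the usage comment are hypothetical, and no particular archive layout is prescribed by the challenge.

```python
import zipfile
from pathlib import Path

def make_submission_zip(solution_dir, zip_path):
    """Pack every file under solution_dir into a single .zip archive."""
    solution_dir = Path(solution_dir)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(solution_dir.rglob("*")):
            # Skip directories and the archive itself if it sits inside solution_dir.
            if path.is_file() and path.resolve() != zip_path.resolve():
                # Store paths relative to the solution folder.
                archive.write(path, arcname=path.relative_to(solution_dir).as_posix())
    return zip_path

# Hypothetical usage:
# make_submission_zip("my_solution", "team_submission.zip")
```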

All submitted code must be reproducible to ensure the validity of reported scores. The use of any external data not provided as part of the challenge is not permitted.
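
A common step towards reproducibility is fixing random seeds at the start of every script. The sketch below is one possible approach, not a challenge requirement: the helper `set_seed` and the seed value are assumptions, and any other libraries with their own random state would need to be seeded separately.

```python
import os
import random

def set_seed(seed=42):
    """Seed the random number generators this sketch knows about."""
    random.seed(seed)
    # Only affects Python processes started after this point, not the current one.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np  # optional: seeded only if NumPy is installed
        np.random.seed(seed)
    except ImportError:
        pass

set_seed(2025)
first = [random.random() for _ in range(3)]
set_seed(2025)
second = [random.random() for _ in range(3)]
print(first == second)  # True
```

Recording package versions alongside the code (for example in a requirements file) also helps others rerun the solution and obtain the same score.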


Awards and Certificate

Two awards will be presented, and a certificate is available to participants:

  • 🏆 Best Performance Award: Awarded to the team or individual whose model achieves the highest Macro F1 score on the marking dataset.
  • 🏆 Award for Innovation: Recognises a model that demonstrates exceptional creativity and insight in its approach.
  • 🎓 Certificate of Participation: Participants can choose to opt in during registration to receive a Certificate of Participation.

Join us in solving a real-world challenge faced by supply chain professionals and showcase your expertise in applied AI and predictive modelling.