Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning

German Research Center for Artificial Intelligence
RPTU Kaiserslautern-Landau
International Joint Conference on Neural Networks – IJCNN, 2025

Abstract

Dataset distillation has gained significant interest in recent years, yet existing approaches typically distill from the entire dataset, potentially including non-beneficial samples. We introduce a novel “Prune First, Distill After” framework that systematically prunes datasets via loss-based sampling prior to distillation. By applying pruning before classical distillation techniques and generative priors, we create a representative core-set that leads to enhanced generalization to unseen architectures, a significant challenge of current distillation methods. More specifically, our proposed framework significantly boosts distilled quality, achieving an accuracy increase of up to 5.2 percentage points even with substantial dataset pruning, i.e., removing 80% of the original dataset prior to distillation. Overall, our experimental results highlight the advantages of our easy-sample prioritization and cross-architecture robustness, paving the way for more effective and high-quality dataset distillation.


Our work introduces a 'Prune First, Distill After' approach for dataset distillation with generative priors (GLaD and LD3M).


By simply sorting the images in your dataset by a loss-value score from your classifier, you can prune non-beneficial samples before distillation and achieve improved performance.
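The sketch below illustrates this pruning step in PyTorch under simple assumptions: score every sample with a pretrained classifier's per-sample cross-entropy loss, sort the scores, and keep only the easiest (lowest-loss) fraction. The helper name loss_based_prune and its default parameters are illustrative choices, not taken from the released code.

```python
# Minimal sketch of loss-value-based pruning (illustrative, not the released code).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

@torch.no_grad()
def loss_based_prune(dataset, classifier, keep_fraction=0.6, batch_size=256,
                     device="cuda" if torch.cuda.is_available() else "cpu"):
    """Score every sample by its cross-entropy loss under `classifier`,
    then keep the `keep_fraction` easiest (lowest-loss) samples."""
    classifier = classifier.eval().to(device)
    losses = []
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    for images, labels in loader:
        logits = classifier(images.to(device))
        # Per-sample loss (no reduction), so individual images can be ranked.
        losses.append(F.cross_entropy(logits, labels.to(device), reduction="none").cpu())
    losses = torch.cat(losses)

    # Sort ascending: easy (low-loss) samples first, hard (high-loss) samples last.
    order = torch.argsort(losses)
    keep = order[: int(keep_fraction * len(dataset))]
    return Subset(dataset, keep.tolist())
```

With keep_fraction=0.2, for example, 80% of the original dataset is pruned and only the remaining core-set is passed on to a distillation method such as GLaD or LD3M.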


In our paper, we show that removing at least 40% of the hardest (highest-loss) samples before distillation leads to dramatic improvements in distillation quality across multiple architectures (AlexNet, VGG-11, ResNet-18, and ViT).

BibTeX

@article{moser2024distill,
  title={Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning},
  author={Moser, Brian B and Raue, Federico and Nauen, Tobias C and Frolov, Stanislav and Dengel, Andreas},
  journal={arXiv preprint arXiv:2411.12115},
  year={2024}
}