ElectroSphere: Electrical Electronics Engineering Bulletin
Characterization and Classification of AI Workloads in Modern Internet Data Centers
Hari Prasad Sampatirao
Abstract
Modern Internet Data Centers (IDCs) face a critical and previously unaddressed challenge in AI workload classification that threatens to undermine massive infrastructure investments. Despite explosive growth in AI applications (generative AI, deep learning training, and agentic AI), no comprehensive framework exists for intelligent AI workload classification in production IDC environments. This research gap is a fundamental bottleneck in the efficient use of trillion-dollar AI infrastructure investments, because current binary classification systems fail when applied to complex, multi-stage AI computational pipelines.
This study introduces a novel multi-level classification framework specifically designed to address AI workload heterogeneity through hierarchical machine learning. The framework employs a three-tier architecture: rule-based foundational classification (Batch, AI, AI-long), unsupervised clustering for AI lifecycle phase identification (Training, Preprocessing, Inference, Tuning, Deployment), and supervised technology-specific categorization (Generative AI, Deep Learning, Traditional ML, Agentic AI, Non-AI). This hierarchical methodology represents the first systematic approach to capturing AI workload complexity, using ensemble methods that process 47 engineered features across CPU, GPU, memory, disk, and RDMA utilization patterns.
Analysis of 23,871 production job instances from the Alibaba Cluster Trace (GPU v2025) dataset achieves 95.8% classification accuracy with XGBoost while maintaining robust cross-validation performance (95.9% ± 0.59%). The framework reveals critical temporal and spatial patterns: 67% off-peak clustering for training operations, 78% business-hour correlation for inference workloads, and resource concentration where 20% of nodes handle 73% of high-memory AI operations.
Implementation delivers transformative operational improvements: an 18% increase in resource utilization efficiency, a 42% reduction in application latency, a 12% decrease in energy consumption, and a 57% reduction in SLA violations. The framework also enables a 28% improvement in capacity planning accuracy and reduces manual intervention requirements by 85% through automated workload identification.
These quantified benefits represent the first systematic solution to AI infrastructure optimization challenges, translating capital investments into measurable operational value for cloud service providers and enterprise IDC operators.
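To make the three-tier architecture concrete, the following is a minimal sketch of how the tiers could be chained. It is not the implementation evaluated in this study, and several pieces are assumptions made for illustration: the tier-1 rule and its `LONG_JOB_SECONDS` threshold, the three-value feature vector standing in for the 47 engineered features, the field names on each job record, the mapping of Batch jobs to Non-AI, and the nearest-centroid classifier that stands in for XGBoost in tier 3. The toy job records are invented and are not drawn from the Alibaba trace.

```python
from math import dist

# Hypothetical threshold: the abstract does not state the actual tier-1 rules.
LONG_JOB_SECONDS = 6 * 3600

def tier1_rule(job):
    """Tier 1: rule-based split into Batch, AI, or AI-long (assumed rule)."""
    if job["gpu_util"] <= 0.0:
        return "Batch"
    return "AI-long" if job["duration_s"] >= LONG_JOB_SECONDS else "AI"

def features(job):
    """Three-value stand-in for the 47 engineered features."""
    return (job["cpu_util"], job["gpu_util"], job["mem_util"])

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Tier 2: plain k-means, seeded with the first k points for determinism."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def fit_centroids(labelled):
    """Tier 3 stand-in: nearest-centroid classifier (the study reports XGBoost)."""
    by_label = {}
    for x, y in labelled:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs))
            for y, xs in by_label.items()}

def predict(model, x):
    """Label whose centroid is closest to x."""
    return min(model, key=lambda y: dist(x, model[y]))

def classify(job, phase_centroids, tech_model):
    """Run one job through all three tiers."""
    base = tier1_rule(job)
    if base == "Batch":
        # Assumption: jobs filtered out by tier 1 are reported as Non-AI.
        return {"base": base, "phase_cluster": None, "technology": "Non-AI"}
    x = features(job)
    return {"base": base,
            "phase_cluster": nearest(x, phase_centroids),
            "technology": predict(tech_model, x)}

# Invented toy records, for illustration only.
jobs = [
    {"cpu_util": 0.90, "gpu_util": 0.00, "mem_util": 0.30, "duration_s": 600},
    {"cpu_util": 0.20, "gpu_util": 0.95, "mem_util": 0.80, "duration_s": 30000},
    {"cpu_util": 0.30, "gpu_util": 0.30, "mem_util": 0.20, "duration_s": 120},
    {"cpu_util": 0.25, "gpu_util": 0.90, "mem_util": 0.85, "duration_s": 40000},
    {"cpu_util": 0.35, "gpu_util": 0.25, "mem_util": 0.25, "duration_s": 90},
]

ai_points = [features(j) for j in jobs if tier1_rule(j) != "Batch"]
phase_centroids = kmeans(ai_points, k=2)
tech_model = fit_centroids([
    ((0.20, 0.95, 0.80), "Deep Learning"),
    ((0.30, 0.30, 0.20), "Traditional ML"),
])
results = [classify(j, phase_centroids, tech_model) for j in jobs]
for r in results:
    print(r)
```

In this sketch the tier-2 clusters are anonymous indices; attaching lifecycle names such as Training or Inference to them would require a separate interpretation step that the abstract does not describe.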

