Ayad M. Dalloo, Amjad J. Humaidi (2025). Group-Based Sample Partitioning kNN: A Computationally Efficient kNN Algorithm for Resource-Constrained Environments. Iraqi Journal of Computers, Communications, Control and Systems Engineering, 25(2), 16-32. doi: https://doi.org/10.33103/uot.ijccce.25.2.2
Group-Based Sample Partitioning kNN: A Computationally Efficient kNN Algorithm for Resource-Constrained Environments
IRAQI JOURNAL OF COMPUTERS, COMMUNICATIONS, CONTROL AND SYSTEMS ENGINEERING
1 University of Technology, Department of Communication Engineering
2 Control and Systems Engineering Department, University of Technology, Baghdad 10001, Iraq
Abstract
The k-Nearest Neighbors (kNN) algorithm is widely used for classification due to its simplicity and effectiveness. However, its computational cost remains a significant challenge, particularly for embedded systems with limited processing power and memory. To address this issue, we propose the Group-Based Sample Partitioning kNN (k²NN) algorithm, a two-phase approach that reduces computational complexity while maintaining classification accuracy. In the first phase, the algorithm pre-groups the training samples by iteratively selecting anchor points and partitioning their k nearest neighbors, thereby reducing redundancy in the dataset. In the second phase, the test sample dynamically selects local anchor points, constructing a smaller, more relevant neighborhood for efficient classification. Experimental results on the Breast Cancer Dataset from Kaggle (KGBC) show that k²NN significantly reduces training and testing iterations while preserving high classification accuracy (95.78%), with a recall of 100%. Compared to exhaustive kNN, our approach achieves a substantial reduction in distance computations (21.79% of exhaustive kNN) without requiring additional storage. Although tested on a relatively small dataset, k²NN shows promise for scalable implementation in embedded systems. In further tests on datasets ranging from 100 to 30,000 samples, the reduction in computation cost reached as much as 75.5% for larger datasets. Future work will explore an extended kⁿNN framework that introduces multiple k-parameters for adaptive scaling to high-dimensional datasets while maintaining computational efficiency. Repository: https://github.com/AyadMDalloo/K2NN
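The two phases described in the abstract can be illustrated with a minimal Python sketch. This is one possible reading of the abstract, not the authors' implementation, which may differ in its anchor selection rule and other details. The function names, the parameters `k1`, `k2`, and `n_anchors`, the rule for choosing the next anchor, and the toy data are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def build_groups(X, k1):
    """Phase 1 (assumed reading): repeatedly pick an ungrouped sample as an
    anchor and assign its k1 nearest ungrouped samples to its group."""
    remaining = list(range(len(X)))
    groups = []  # list of (anchor_index, member_indices including the anchor)
    while remaining:
        # Assumption: the next anchor is the first sample in the current ordering.
        anchor = remaining.pop(0)
        remaining.sort(key=lambda i: math.dist(X[anchor], X[i]))
        members, remaining = remaining[:k1], remaining[k1:]
        groups.append((anchor, [anchor] + members))
    return groups


def classify(x, X, y, groups, n_anchors, k2):
    """Phase 2 (assumed reading): find the nearest anchors, then run plain
    kNN with majority voting over the members of their groups only."""
    nearest = sorted(groups, key=lambda g: math.dist(x, X[g[0]]))[:n_anchors]
    candidates = [i for _, members in nearest for i in members]
    candidates.sort(key=lambda i: math.dist(x, X[i]))
    votes = Counter(y[i] for i in candidates[:k2])
    # Distances evaluated here: one per anchor, then one per candidate.
    n_distances = len(groups) + len(candidates)
    return votes.most_common(1)[0][0], n_distances


# Toy two-cluster data, made up for this sketch (not the KGBC dataset).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.3, 5.3)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

groups = build_groups(X, k1=3)
label, n_dist = classify((5.05, 5.1), X, y, groups, n_anchors=1, k2=3)
print(label, n_dist, "distances vs", len(X), "for exhaustive kNN")
```

In this sketch the saving at query time comes from comparing the test sample only with the anchors and with the members of the selected groups, rather than with every training sample. On a dataset this small the gain is modest; the sketch shows the mechanism, not the reported figures.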