To develop models for predicting uniaxial compressive strength (UCS), an experimental database was created by subjecting limestone samples to laboratory tests. In the first phase, UCS was determined in accordance with the ISRM suggested methods. Subsequently, the samples were subjected to ball mill grinding tests to generate the grinding characteristics of the ball mill.
Uniaxial compressive strength
For the laboratory determination of UCS, limestone samples were collected from mines located in different parts of Southern India. The limestone samples examined in this study consist primarily of calcium carbonate (CaCO₃) in the form of calcite, with varying proportions of accessory minerals such as quartz, feldspar, clay minerals, pyrite, and siderite. These mineralogical variations lead to variations in microstructure, which are in turn reflected in the strength properties. The textural diversity of the limestone is also significant, ranging from fine-grained formations to coarsely crystalline structures, reflecting diverse depositional environments and subsequent diagenetic processes.
The collected samples were prepared and tested in the laboratory to determine compressive strength as per the ISRM suggested methods (2007). Core samples of standard NX size were tested, giving UCS values for 82 samples. The samples had a diameter of 54 mm and a length-to-diameter ratio of 2.5. Each prepared sample was aligned centrally on the loading platen, and a constant loading rate was applied while the load was recorded until failure occurred. A view of the laboratory set-up for the determination of UCS is shown in Fig. 2. The UCS was then determined from the load at failure (P) and the cross-sectional area of the sample (A), as given by Eq. (12):
$$\text{UCS} = \frac{P}{A}$$
(12)
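As a minimal sketch of Eq. (12), the snippet below computes UCS for a cylindrical core. The function name `ucs_mpa`, the choice of units (kN for load, mm for diameter) and the 150 kN failure load are illustrative assumptions, not values from this study; only the 54 mm NX core diameter comes from the text.

```python
import math


def ucs_mpa(failure_load_kn: float, diameter_mm: float) -> float:
    """Return UCS in MPa from the failure load P (kN) and core diameter (mm)."""
    # Cross-sectional area A of the cylindrical core in mm^2
    area_mm2 = math.pi * diameter_mm ** 2 / 4.0
    # Convert kN to N; N/mm^2 is numerically equal to MPa
    return failure_load_kn * 1000.0 / area_mm2


# Hypothetical failure load of 150 kN on a 54 mm NX core
example_ucs = ucs_mpa(150.0, 54.0)
print(f"UCS = {example_ucs:.1f} MPa")
```

Working in N and mm² keeps the result directly in MPa, which avoids a separate unit conversion step.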

Laboratory determination of uniaxial compressive strength (a) Laboratory set up (b) Illustration of compressive strength test.
Descriptive statistics of the UCS values measured in the laboratory are presented in Table 1.
Grinding tests
The grinding tests on the limestone samples were performed using a conventional laboratory-scale ball mill with a total volume of 0.0865 m³. The mill operated at a speed of 55 rpm, which corresponds to 70% of its critical speed. The samples were first hammered to a size of approximately 50–60 mm. The crushed material was then sieved to obtain a size range of −10 +6.3 mm, and the resulting sieved material served as the feed to the ball mill. An adequate amount of grinding media (high carbon chrome steel balls) was added to the mill drum to facilitate the grinding process. For the dry grinding experiments, the volume of the test sample was selected such that the combined volume of the sample and grinding media was less than 40% of the total mill volume.
The selection of operating parameters for ball mill grinding requires a systematic and iterative approach to balance grinding performance, product quality, energy efficiency, and equipment durability [36,37]. Ball milling is governed by multiple parameters that significantly influence particle size reduction and grinding efficiency, so identifying the most influential ones is critical for achieving the desired outcomes. Key operating parameters in industrial tumbling mills include mill speed, feed size, ball size distribution, and grinding duration, which are selected based on ore properties (e.g., hardness, density, strength) and operational constraints such as mill capacity and grinding media type. Secondary parameters, namely the mill volume fraction occupied by the ore or sample charge, the mill volume fraction occupied by the ball charge, the interstitial filling ratio, the charge ratio, and the mill filling, are derived empirically from these primary parameters to ensure consistent process control.
In this study, dry grinding experiments were conducted by systematically varying key parameters to ensure repeatability and reproducibility. The feed input was adjusted from 1000 g to 1700 g in 250 g increments, while the number of grinding balls ranged from 125 to 135, with increments of 10 balls. The grinding media weight was varied according to the ball size distribution, and the grinding duration was adjusted between 5 and 12 min in 2.5-minute increments. Dependent parameters, including the mill volume fraction occupied by the sample charge, the interstitial filling ratio, and the mill filling, were calculated from the rock sample density and the mill volume to maintain experimental consistency. The mill volume fraction occupied by the ball charge was determined from the density of the grinding media and the available mill volume, providing a consistent basis for evaluating grinding performance across different conditions. Table 2 summarizes the variations in ball mill operating parameters during the grinding experiments, while Table 3 details the ball size distributions used across the different experimental phases. Certain operating parameters of the ball mill in Table 2 were determined using the expressions in Eqs. (13) to (17).
$$J_r = \frac{m_r/\rho_r}{V_{\text{mill}}} \times \frac{100}{1-\varepsilon}$$
(13)
$$J_b = \frac{m_b/\rho_b}{V_{\text{mill}}} \times \frac{100}{1-\varepsilon}$$
(14)
$$\omega = \frac{J_r}{J_b} \times \frac{1}{\varepsilon}$$
(15)
$$\nu = \frac{m_b}{m_r}$$
(16)
$$\psi = \frac{\left(\frac{m_r}{\rho_r} + \frac{m_b}{\rho_b}\right)}{V_{\text{mill}}} \times \frac{100}{1-\varepsilon}$$
(17)
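Where J_r is the mill volume fraction occupied by the sample charge, J_b is the mill volume fraction occupied by the ball charge, ω is the interstitial filling ratio, ν is the charge ratio, ψ is the mill filling, m_r is the mass of the rock charge, m_b is the mass of the ball charge, ρ_r is the density of the rock charge, ρ_b is the density of the ball charge (ρ_b = 7.65 g/cm³), V_mill is the mill volume, and ε is the bed porosity of the ball mill (30–40%).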
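Eqs. (13) to (17) can be evaluated together, as in the following sketch. The function name `charge_parameters`, the rock density of 2.65 g/cm³ and the bed porosity of 0.375 (within the stated 30–40% range) are assumptions made for illustration. The feed mass and grinding media weight are the median values quoted in the boxplot discussion, with the media weight assumed to be in grams, and the mill volume of 0.0865 m³ is converted to cm³.

```python
def charge_parameters(m_r, m_b, rho_r, rho_b, v_mill, eps):
    """Evaluate Eqs. (13)-(17).

    Masses in g, densities in g/cm^3, mill volume in cm^3,
    bed porosity eps as a fraction (e.g. 0.375 for 37.5%).
    """
    j_r = (m_r / rho_r) / v_mill * 100.0 / (1.0 - eps)   # Eq. (13)
    j_b = (m_b / rho_b) / v_mill * 100.0 / (1.0 - eps)   # Eq. (14)
    omega = (j_r / j_b) / eps                            # Eq. (15)
    nu = m_b / m_r                                       # Eq. (16)
    psi = (m_r / rho_r + m_b / rho_b) / v_mill * 100.0 / (1.0 - eps)  # Eq. (17)
    return {"J_r": j_r, "J_b": j_b, "omega": omega, "nu": nu, "psi": psi}


params = charge_parameters(
    m_r=1350.0,        # g, median feed input
    m_b=26510.0,       # g, median grinding media weight (unit assumed)
    rho_r=2.65,        # g/cm^3, assumed limestone density
    rho_b=7.65,        # g/cm^3, ball density given in the text
    v_mill=86500.0,    # cm^3, i.e. 0.0865 m^3
    eps=0.375,         # assumed bed porosity within the 30-40% range
)
for name, value in params.items():
    print(f"{name} = {value:.3f}")
```

Because Eqs. (13), (14) and (17) share the same denominator, ψ equals J_r + J_b, which serves as a quick consistency check on any tabulated values.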
Figure 3 shows the sequence of steps involved in ball mill grinding to obtain the particle sizes. The ground samples were subjected to sieve analysis for a duration of 10 min to determine their particle size distribution. From this distribution, the representative particle sizes D10, D50 and D90 were obtained; these are the particle diameters at which 10%, 50% and 90% of the particles by weight, respectively, are finer. The particle size distribution of the ground limestone samples, along with the descriptive statistics, is presented in Table 4.

Steps in ball mill grinding to determine particle size distribution.
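One common way to read D10, D50 and D90 from sieve data is to interpolate on the cumulative percent-passing curve. The sketch below assumes linear interpolation between adjacent sieves; the sieve sizes and cumulative percentages are hypothetical, and the function name `percentile_size` is illustrative. The interpolation scheme actually used in this study is not stated.

```python
def percentile_size(sizes_um, cum_passing_pct, target_pct):
    """Interpolate the particle size at which target_pct of the mass is finer.

    sizes_um must be ascending and cum_passing_pct non-decreasing.
    """
    points = list(zip(sizes_um, cum_passing_pct))
    for (s_lo, p_lo), (s_hi, p_hi) in zip(points, points[1:]):
        # Find the pair of sieves that brackets the target percentage
        if p_lo <= target_pct <= p_hi and p_hi > p_lo:
            frac = (target_pct - p_lo) / (p_hi - p_lo)
            return s_lo + frac * (s_hi - s_lo)
    raise ValueError("target percentage lies outside the measured curve")


# Hypothetical sieve analysis: aperture (um) and cumulative % passing by weight
sizes = [45, 75, 150, 300, 1000, 4750, 6300]
passing = [8, 15, 40, 60, 75, 92, 100]

d10 = percentile_size(sizes, passing, 10)
d50 = percentile_size(sizes, passing, 50)
d90 = percentile_size(sizes, passing, 90)
print(f"D10 = {d10:.1f} um, D50 = {d50:.1f} um, D90 = {d90:.1f} um")
```

Interpolating on a logarithmic size axis is a frequent alternative when the sieve series spans several orders of magnitude.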
The variations in the grinding characteristics are analyzed using the boxplots shown in Fig. 4. The line inside each box represents the median value of the variable. The box spans from the first quartile (Q1) to the third quartile (Q3), covering the middle 50% of the data. The whiskers extending from the top and bottom of each box reach the most extreme data points lying within 1.5 times the interquartile range (IQR = Q3 − Q1) below Q1 and above Q3.

Boxplot of grinding characteristics of ball mill.
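The quantities drawn in such a boxplot can be computed directly, as sketched below. The helper name `boxplot_stats` and the grind-time values are hypothetical; the quartile convention (Python's "inclusive" method) is also an assumption, since plotting libraries differ slightly in how they interpolate quartiles.

```python
import statistics


def boxplot_stats(values):
    """Return the quartiles, whisker ends and outliers used to draw a boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical grind times in minutes, including one extreme value
grind_times = [5, 7.5, 7.5, 10, 10, 10, 12.5, 12.5, 30]
stats = boxplot_stats(grind_times)
print(stats)
```

Points beyond the fences are reported separately as outliers rather than stretching the whiskers.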
For feed input, most of the data lie above the median value of 1350 g, with little difference between Q3 and the maximum value, indicating a concentration of data points toward the higher range. For the number of balls, the median value is 143 and the minimum value is 125. Notably, Q1 coincides with the minimum value, suggesting that the lower values are concentrated at this level. The grinding media weight displays relatively long whiskers, indicating a wider spread of data, with most points dispersed above the median value of 26,510. The mill volume fraction occupied by the sample charge, the mill volume fraction occupied by the ball charge and the interstitial filling ratio exhibit symmetric distributions with medians of 0.988, 6.42 and 0.414, respectively, suggesting a balanced spread around the central values. For grind time, charge ratio and mill filling, the distributions vary considerably, with medians of 10 min, 20.6 and 7.33, respectively. A long whisker is observed for mill filling, indicating substantial variation, with values ranging from a minimum of 4.718 to a maximum of 10.885. For the representative particle sizes D10, D50 and D90, the medians are 51.75 μm, 220.1 μm and 4490.8 μm, respectively. D50 shows wider variation within the box but relatively short whiskers, indicating that values outside the interquartile range are tightly clustered. Among the three sizes, D10 has the smallest spread and D90 the largest, reflecting greater variability in the coarser particles. However, it is essential to note that the grinding characteristics of the ball mill also depend on additional factors such as the physico-mechanical properties of the material being ground, its mineralogical and textural characteristics, and various other operating parameters of the mill.
Correlation analysis between grinding characteristics and uniaxial compressive strength
The performance of ball mills in various industrial processes relies on a combination of the strength properties of the materials and the operating parameters. Understanding the relationship between these factors is crucial for optimizing mill performance, enhancing production output and achieving the desired product quality [38]. The correlation analysis between uniaxial compressive strength and the grinding characteristics of the ball mill, based on the Pearson correlation coefficient, is shown in Fig. 5. The operating parameters (feed input, number of balls, grinding media weight, grind duration, mill volume fraction occupied by sample charge, mill volume fraction occupied by ball charge, interstitial filling ratio, charge ratio and mill filling) exhibit moderate to weak correlations among each other and with UCS. In contrast, the representative particle sizes D10, D50 and D90 display stronger negative correlations with UCS. This may be attributed to the fact that as particle sizes decrease during ball mill grinding, the surface area of the particles increases significantly. Smaller particles tend to pack more densely, allowing more efficient bonding between mineral grains. The reduction in voids and the better interlocking between particles create a more compact structure, which can enhance the strength of the rock under compressive loads. This results in a higher UCS, as the rock resists fracture more effectively.

Correlation matrix of grinding characteristics of ball mill and uniaxial compressive strength.
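Each entry of such a matrix is a Pearson coefficient between two variables. A minimal sketch of the calculation follows; the function name `pearson_r` and the paired D50 and UCS values are hypothetical and chosen only to show a negative relationship.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of centred cross-products and squares
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations: D50 (um) and UCS (MPa)
d50_um = [180.0, 200.0, 220.0, 240.0, 260.0]
ucs = [78.0, 70.0, 66.0, 57.0, 52.0]

r = pearson_r(d50_um, ucs)
print(f"r = {r:.3f}")
```

The coefficient only measures linear association, so a value near zero does not rule out a non-linear dependence between two variables.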
As evident from the correlation matrix, only a limited subset of the grinding characteristics significantly influences the uniaxial compressive strength, as indicated by the higher values of the Pearson correlation coefficient (|r| > 0.5). Identifying these critical variables is essential for reducing model complexity, mitigating overfitting, and improving computational efficiency in predictive modelling. Additionally, the grinding characteristics span multiple orders of magnitude, which can introduce bias during model training due to the disproportionate influence of features with larger scales. To address this, data pre-processing techniques such as normalization (min-max scaling, Z-score standardization) or log transformation are applied to ensure uniform feature scaling. Once the data transformation is complete, the next step is the elimination of features to further refine the model. Various feature selection methods have been proposed in the literature, including filter methods (e.g., correlation-based feature selection), wrapper methods (e.g., recursive feature elimination with cross-validation) and embedded methods (e.g., LASSO regularization). Hybrid approaches, such as support vector machines-recursive feature elimination (SVM-RFE), have gained significant attention in rock engineering applications due to their ability to combine the strengths of filter and wrapper methods. SVM-RFE in particular leverages the margin-maximization property of SVMs to iteratively eliminate less important features, thereby enhancing model interpretability and performance.
Data pre-processing
To diminish the impact of the differing orders of magnitude and dimensions of the grinding characteristics of the ball mill and the compressive strength, the experimental dataset was subjected to min-max normalization. This method brings all input and output features onto a uniform scale by mapping them to the common range of 0 to 1 through the linear transformation shown in Eq. (18).
$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$
(18)
Where X is one of the parameters, X′ is the normalized value of X, and Xmax and Xmin are the maximum and minimum values of that parameter, respectively.
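Eq. (18) can be applied column by column, as in the sketch below. The function name `min_max_normalize` is illustrative; the three input values are the minimum, median and maximum of mill filling quoted in the boxplot discussion, used here only as sample numbers.

```python
def min_max_normalize(values):
    """Map a list of numbers linearly onto [0, 1] using Eq. (18)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant column has no range, so Eq. (18) is undefined
        raise ValueError("cannot normalize a constant feature")
    return [(x - x_min) / (x_max - x_min) for x in values]


# Mill filling: minimum, median and maximum reported in the text
mill_filling = [4.718, 7.33, 10.885]
scaled = min_max_normalize(mill_filling)
print([round(v, 4) for v in scaled])
```

When a model is later evaluated on unseen data, the minimum and maximum from the training set are normally reused so that both sets share the same scale.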
Feature selection using hybrid support vector machines-recursive feature elimination method
Certain challenges arise in high-dimensional, small-sample datasets, especially when the number of features (variables) is large relative to the number of observations (samples). In the present study, there are twelve input variables (grinding characteristics) and one target variable (uniaxial compressive strength) for a total of 82 samples. The problems that stem from such datasets include overfitting, increased model complexity and reduced interpretability. Many features might not contribute to the prediction of the target variable, and their presence can introduce noise or redundancy. To address these challenges, it becomes essential to eliminate features that do not significantly influence the target variable.
The hybrid support vector machines-recursive feature elimination (SVM-RFE) method was applied for feature selection in this study due to its ability to handle complex non-linear relationships between the features and the target variable. By integrating the discriminative strength of SVM with the iterative elimination approach of RFE, SVM-RFE ranks features based on their contribution to model performance [39]. Compared to Pearson correlation filtering, SVM-RFE offers distinct advantages. While Pearson correlation filtering, as examined in the correlation matrix analysis, effectively detects linear relationships among variables, it inherently assumes linearity and feature independence. However, in the context of ball mill grinding, a few operating parameters and the particle size distribution metrics (D10, D50, D90) often exhibit non-linear interactions and multicollinearity. This is evident from the weak correlations observed between D10, D50 and D90 and the operating parameters, whereas their correlation with UCS is significantly stronger, with coefficients ranging from 0.91 to 0.95, as illustrated in Fig. 6.
The SVM-RFE approach offers a more capable mechanism for feature selection by using the ability of SVMs to model non-linear relationships through kernel functions, which makes it effective for capturing the intricate dependencies governing the grinding process. Unlike traditional correlation-based methods, SVM-RFE evaluates features within the context of the predictive model, iteratively eliminating those with minimal contribution based on their weights in the SVM. This ensures that the retained features are not only individually relevant but also collectively optimized for predicting outcomes like particle size distribution and grinding efficiency. In contrast, Pearson correlation filtering primarily assesses pairwise linear relationships with the target variable, potentially disregarding complex interactions that significantly influence model performance. For instance, although the correlation matrix showed a strong negative correlation between uniaxial compressive strength and the particle sizes (−0.89 to −0.93), SVM-RFE assigns a lower rank to these features. This suggests that other factors, such as the interstitial filling ratio, contribute more to the predictive model when considered holistically.
Despite the relatively limited dataset size of 82 samples, SVM-RFE remains a robust choice due to the intrinsic resilience of SVMs against overfitting, particularly when complemented by appropriate regularization strategies and kernel selection (e.g., radial basis function kernel). While Pearson correlation filtering offers computational efficiency for larger datasets characterized by predominantly linear dependencies, its assumptions may lead to oversimplification in scenarios where complex interactions dictate model behavior. To address potential overfitting associated with the small dataset, cross-validation was employed during SVM-RFE execution to support the reliability and generalizability of the feature rankings. The pseudo-code below outlines the SVM-RFE procedure used to rank variables based on their importance in predicting a continuous target variable.

Initially, to establish the relative importance of the input variables for SVM-RFE, a support vector regression (SVR) model with a linear kernel is applied. This SVR model is combined with RFE to rank the features. The process begins by training the SVR model on the entire set of input variables and obtaining the coefficient associated with each variable from the linear kernel. The RFE process then works by recursively eliminating the least important features based on these coefficients. In each iteration, the SVR model is retrained after removing the weakest feature, gradually refining the set. This recursive elimination continues until only the most relevant variables remain, and a ranking of all the variables is established based on their contribution to the performance of the model. The feature ranking of all the grinding characteristics is shown in Fig. 7.
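The recursive procedure described above can be sketched in plain Python as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the linear SVR is fitted in the primal by subgradient descent on the epsilon-insensitive loss, the function names `fit_linear_svr` and `svm_rfe_ranking`, the hyperparameters and the synthetic data are all hypothetical, and cross-validation is omitted for brevity.

```python
from itertools import product


def fit_linear_svr(X, y, eps=0.01, lam=1e-3, lr0=0.5, epochs=3000):
    """Fit a linear epsilon-insensitive SVR by full-batch subgradient descent.

    Minimises lam/2 * ||w||^2 + mean(max(0, |y - (w.x + b)| - eps)).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for t in range(epochs):
        lr = lr0 / (1 + t) ** 0.5            # diminishing step size
        grad_w = [lam * wj for wj in w]      # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            residual = yi - (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if abs(residual) > eps:          # only points outside the tube count
                sign = 1.0 if residual > 0 else -1.0
                for j in range(d):
                    grad_w[j] -= sign * xi[j] / n
                grad_b -= sign / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe_ranking(X, y, names):
    """Rank features by recursive elimination; rank 1 is eliminated last."""
    remaining = list(range(len(names)))
    ranking = {}
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = fit_linear_svr(X_sub, y)
        # The feature with the smallest absolute weight is the weakest
        weakest = min(range(len(remaining)), key=lambda k: abs(w[k]))
        ranking[names[remaining[weakest]]] = len(remaining)
        del remaining[weakest]
    return ranking


# Synthetic, already normalised data: the target depends strongly on the
# first feature, weakly on the second and not at all on the third
X = [list(p) for p in product([0.0, 0.5, 1.0], repeat=3)]
y = [2.0 * x0 + 1.0 * x1 for x0, x1, _ in X]
names = ["f_strong", "f_weak", "f_irrelevant"]

ranking = svm_rfe_ranking(X, y, names)
print(ranking)
```

Ranking by absolute weight is only meaningful when the features share a common scale, which is why normalization precedes this step. In practice, a library implementation such as scikit-learn's `RFE` wrapped around `SVR(kernel="linear")` would typically be used instead of a hand-written optimiser.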

Importance measure of grinding characteristics based on Pearson correlation filtering.

Assessment of feature ranking using SVM-RFE method.
Once the feature ranking is complete, the next step is to assess the importance of the ranked features. This involves examining the strength of each variable's contribution (usually its coefficient) in predicting the uniaxial compressive strength. The feature importance is derived from the iterations of the SVM-RFE process, where variables with higher rankings are identified as more significant in influencing UCS; the variable importance is shown in Fig. 8. The critical parameters identified are the mill volume fraction occupied by sample charge, charge ratio, mill filling and the representative particle sizes D10, D50 and D90. Only the most significant variables, those that substantially affect UCS, are retained for further model building. These variables are then used as inputs for developing machine learning models, ensuring that the models are trained on the most influential data and leading to more accurate and efficient predictions.

Impact of selected grinding characteristics based on importance scores on uniaxial compressive strength.
