This study adopts a three-stage pipeline: dataset preparation, ceramic classification, and price prediction, as shown in Fig. 1. The stages are seamlessly integrated to support robust ceramic analysis and valuation.

This figure illustrates the full pipeline adopted in this study, comprising three sequential phases: data preparation, YOLOv11-based classification, and regression-based price prediction.
The first stage of our pipeline involved the construction and annotation of a high-quality ceramic image dataset, combining automated acquisition with expert-informed labeling strategies. Chinese ceramics have a rich history, with significant differences in craftsmanship, forms, and decorative patterns across historical periods [18]. Building on the ceramic classification frameworks proposed by Mu et al. [27] and Yi et al. [29], this study categorized ceramics based on shape, decorative patterns, and production techniques. A dataset of 8213 high-resolution images was constructed, representing 20 distinct craftsmanship styles and decorative patterns selected for their historical significance and visual distinctiveness. These 20 styles include both kiln-specific categories and decorative techniques, such as Blue and White Porcelain, Doucai, Wucai, Fencai, Ru Kiln, Guan Kiln, Ge Kiln, Jun Kiln, Ding Kiln, Longquan Kiln, Yingqing Porcelain, White Porcelain, Sacrificial Blue/Red Porcelain, Langyao Red, Tea-dust Glaze, Reticulated Porcelain (Linglong), Cizhou Kiln Porcelain, Falangcai, and Fahua. The sample distribution among the 20 craft styles is moderately imbalanced. For instance, well-represented styles such as Blue and White Porcelain and Longquan Kiln Porcelain each contain over 600 samples, or approximately 8% of the dataset each. In contrast, rarer types such as Ru Kiln Porcelain and Langyao Red Porcelain each contain fewer than 100 images, contributing less than 1.5% of the total apiece. To mitigate class imbalance that may affect classification and regression stability, data augmentation strategies, including image rotation, flipping, and brightness variation, were applied to underrepresented categories. This brought the class distribution closer to uniform during model training. A detailed overview of both raw and augmented class distributions is provided in Supplementary Table 2.
Images were sourced from three channels. First, auction houses (42.6%, 3500 images), including Christie’s, Sotheby’s, Bonhams, China Guardian, Poly Auction, and Beijing Rongbaozhai. Second, museums and cultural heritage databases (24.3%, 2000 images), such as the Palace Museum, the British Museum, the Metropolitan Museum of Art, the National Museum of China, and the ICOM database. Third, ceramic art stores and field photography (33.1%, 2713 images), obtained from platforms such as Taobao, Xianyu, Amazon, Pixabay, Wikimedia Commons, and private collectors. A hybrid data acquisition approach was adopted: automated web scraping was performed using the Scrapy framework to extract structured ceramic image data from public databases, while manual photography was conducted in collaboration with ceramic experts and photographers to capture high-definition, high-value, and rare ceramic artifacts. All images were acquired in strict compliance with copyright regulations and are intended solely for academic research purposes.
Challenges encountered in data collection and corresponding solutions:
(1) Issues with light reflection and shadow → Applied bilateral filtering for noise reduction, effectively preserving edge details while minimizing noise.
(2) Interference from complex backgrounds → Utilized background segmentation algorithms, such as GrabCut, to remove distracting elements.
(3) Inconsistent image resolution → Standardized all images to 1024 × 1024 pixels to ensure uniform model input and maintain data consistency.
For data annotation, this study adopted a hybrid approach combining AI pre-annotation and expert correction:
(1) AI pre-annotation: a YOLO pre-trained model was utilized for initial object detection, automatically generating bounding boxes for ceramic contours.
(2) Manual annotation and verification: a team of ceramic appraisal experts and data scientists refined the annotations using the LabelImg tool, following a structured, expert-informed guideline. Annotation was conducted across three hierarchical levels.
First-level classification: craftsmanship styles (e.g., blue-and-white porcelain, famille rose, doucai) were labeled based on characteristic features such as overglaze techniques and historical production periods. Experts referred to standard typologies drawn from authoritative museum collections (e.g., the Palace Museum) and academic literature to ensure consistent classification. As shown in Fig. 2, each craftsmanship style is contextualized within its corresponding dynastic period, from the Tang Dynasty to the modern era, highlighting the evolution of ceramic esthetics and kiln-specific innovations over time.

This figure visualizes 20 representative Chinese ceramic craftsmanship styles across seven historical periods, ranging from the Tang Dynasty (618–907 AD) to the Modern era (1913–2025). This figure includes elements that were redrawn or adapted from copyright-free sources such as Wikimedia Commons and Pixabay, ensuring no copyrighted content is used.
Second-level classification: vessel shapes (e.g., bottles, jars, plates, bowls, cups, pots) were defined according to neck-body proportion, base structure, and handle or spout presence. In cases of borderline shape types, consensus was reached through group review. Reference images were compiled into an internal labeling handbook to guide decisions, as shown in Fig. 3.

This figure illustrates the typological classification system used for ceramic vessel shapes, based on modular morphological parts. This figure includes elements that were redrawn or adapted from copyright-free sources such as Wikimedia Commons and Pixabay, ensuring no copyrighted content is used.
Third-level classification: decorative patterns (e.g., plants, animals, landscapes, portraits, geometric designs), as shown in Fig. 4. Rather than labeling individual motifs (e.g., lotus, peony, dragon), each image was annotated at the category level, based on the most visually dominant pattern types present in the overall design. For example, if a vessel featured both floral and tiger elements, the image was annotated as “plant” and “animal.” This multi-label, category-level annotation strategy balances annotation efficiency with classification relevance, enabling the model to learn from the dominant stylistic features without requiring exhaustive fine-grained motif annotation.
(4) Supplementary attributes: additional features such as color complexity (monochrome vs. polychrome), structural intricacy (simple, moderate, intricate), and estimated price range (low-end collectibles vs. high-value antiques) were also annotated to enhance dataset richness. Structural intricacy was defined based on a combination of part count, curvature complexity, and decorative layering.
(5) Annotation consistency and quality control: to ensure annotation consistency, all team members underwent a calibration phase using 300 sample images. Inter-Annotator Agreement (IAA) was assessed throughout the process, achieving a final Cohen’s Kappa coefficient of 0.91, indicating strong agreement and reliable label quality.
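As an illustration of the agreement statistic reported above, the following minimal sketch computes Cohen's kappa for two annotators. The function name and the ten example labels are hypothetical and are not taken from the study's annotation data; the paper does not state how agreement was aggregated across more than two annotators.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for ten images (illustrative only).
annotator_1 = ["doucai", "wucai", "fencai", "doucai", "wucai",
               "fencai", "doucai", "wucai", "fencai", "doucai"]
annotator_2 = ["doucai", "wucai", "fencai", "doucai", "wucai",
               "fencai", "doucai", "wucai", "doucai", "doucai"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

In this toy case the annotators disagree on one image out of ten, so the observed agreement is 0.9 while kappa is lower because part of that agreement is expected by chance.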

This figure presents a taxonomy of decorative patterns commonly found on Chinese ceramic artifacts, organized into six major categories: plant patterns, animal motifs, landscapes, human figures, crackled glaze patterns, and geometric designs. Each column shows representative visual motifs and subtypes based on iconographic content, glaze texture, or symbolic form. This figure includes elements that were redrawn or adapted from copyright-free sources such as Wikimedia Commons and Pixabay, ensuring no copyrighted content is used.
To ensure data quality and enhance model robustness, this study implemented a series of preprocessing techniques, including image preprocessing, bounding box optimization, price normalization, outlier detection, and feature extraction. The complete preprocessing formulas, parameter configurations, and empirical evaluation details are provided in the Supplementary Information.
First, the full dataset of 8213 images was partitioned into training (70%, 5749 images), validation (20%, 1642 images), and test (10%, 822 images) sets, ensuring balanced representation across the 20 defined craftsmanship styles. To further improve model robustness and evaluate generalization capability, 5-fold cross-validation was employed: in each iteration, four folds (80%, 6571 images) were used for training and one fold (20%, 1642 images) for validation. A cyclic learning rate scheduler was applied to stabilize gradient updates, while early stopping with model checkpointing was implemented to prevent overfitting; training was terminated if validation loss failed to improve over 10 consecutive epochs. Following data partitioning, images were normalized to the [0,1] range, filtered to reduce surface noise, and augmented using geometric (rotation, flipping, cropping) and color transformations (brightness adjustment, HSV conversion). To enhance spatial generalization, the Mosaic, GridMask, and MixUp augmentation techniques were also applied and evaluated, as shown in Supplementary Table 1. Anchor box dimensions were optimized using K-means clustering to adapt to ceramic objects of varying shapes, thereby enhancing detection accuracy, and non-maximum suppression (NMS) was employed to reduce redundant detections.
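The partition sizes and the early-stopping rule described above can be sketched as follows. This is a minimal illustration, not the authors' training code: the helper names are hypothetical, the rounding convention (floor the training and validation shares, assign the remainder to the test set) is an assumption that happens to reproduce the reported counts, and the loss values are invented.

```python
def split_sizes(n_images, train_frac=0.7, val_frac=0.2):
    """Floor the train/val shares; the remainder goes to the test set (assumed)."""
    n_train = int(n_images * train_frac)
    n_val = int(n_images * val_frac)
    return n_train, n_val, n_images - n_train - n_val


def kfold_sizes(n_images, k=5):
    """Sizes of k near-equal folds; the first n % k folds get one extra image."""
    base, extra = divmod(n_images, k)
    return [base + 1 if i < extra else base for i in range(k)]


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0  # improvement: a checkpoint would be saved here
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


print(split_sizes(8213))   # train / validation / test counts
print(kfold_sizes(8213))   # fold sizes for 5-fold cross-validation

# Invented validation losses: two improving epochs, then a plateau.
losses = [1.0, 0.9] + [0.95] * 12
stopper = EarlyStopping(patience=10)
stopped_at = next(i for i, loss in enumerate(losses) if stopper.step(loss))
print("stopped at epoch index", stopped_at)
```

Because 8213 is not divisible by 5, fold sizes differ by one image; a validation fold of 1642 images leaves 6571 for training, matching the figures quoted in the text.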
Second, for price data, outliers were detected and removed using the interquartile range (IQR) method, followed by log transformation to reduce skewness and Z-score normalization to account for cross-auction house variability. Key features, including color (GLCM), shape (Hu Moments), and decorative pattern encodings (one-hot), were extracted and reduced via Principal Component Analysis (PCA) to improve computational efficiency and maintain prediction accuracy.
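The price-cleaning chain (IQR filtering, log transformation, per-auction-house Z-scores) might look like the sketch below. Several details are assumptions rather than statements from the paper: the IQR fences are computed on raw prices over the pooled data, the natural logarithm is used, and the Z-score uses the population standard deviation within each house. The records are invented.

```python
import math
import statistics
from collections import defaultdict


def iqr_bounds(values, k=1.5):
    """Lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def normalize_prices(records):
    """records: list of (auction_house, price_usd) -> list of (house, z_score)."""
    low, high = iqr_bounds([price for _, price in records])
    # Drop outliers, then log-transform the surviving prices.
    kept = [(house, math.log(price)) for house, price in records
            if low <= price <= high]
    by_house = defaultdict(list)
    for house, log_price in kept:
        by_house[house].append(log_price)
    stats = {house: (statistics.mean(v), statistics.pstdev(v))
             for house, v in by_house.items()}
    normalized = []
    for house, log_price in kept:
        mu, sigma = stats[house]
        z = (log_price - mu) / sigma if sigma > 0 else 0.0
        normalized.append((house, z))
    return normalized


# Invented records: two houses, one extreme sale that the IQR rule removes.
records = [("A", 1000), ("A", 2000), ("A", 4000),
           ("B", 1500), ("B", 3000), ("B", 6000), ("B", 5_000_000)]
normalized = normalize_prices(records)
for house, z in normalized:
    print(house, round(z, 3))
```

After normalization both houses have zero-mean, unit-variance log prices, which is exactly why house-specific price levels are no longer visible to a downstream model.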
Although these normalization techniques improved the numerical stability and convergence of the training process, they may introduce trade-offs. For instance, the log transformation compresses the scale of high-value artifacts, potentially reducing the model’s sensitivity in distinguishing between upper-tier price categories. Similarly, Z-score normalization across auction houses may obscure house-specific pricing nuances, such as branding premiums or regional valuation patterns. These decisions were made to ensure model robustness and minimize the influence of extreme values and scale inconsistencies.
Building upon the curated and preprocessed dataset, the second stage of our pipeline deployed an improved YOLOv11 model to classify ceramic images based on their decorative patterns, structural forms, and craftsmanship features.
The classification of ceramic artifacts involves a comprehensive analysis of decorative patterns, shapes, and craftsmanship techniques. However, the complexity of surface textures, diversity of artistic styles, and intricacy of manufacturing details pose significant challenges to traditional classification methods. Conventional techniques based on handcrafted feature extraction and rule-driven algorithms often rely on low-level features (e.g., edges, color histograms, and shape descriptors) and fail to effectively capture the subtle decorative differences inherent in ceramics, thereby limiting classification accuracy and effectiveness. Additionally, studies utilizing machine learning methods such as Support Vector Machines (SVM), Random Forests (RF), or K-Nearest Neighbors (KNN) have achieved partial improvements in classification performance. However, these methods still face limitations when addressing the fine structures of high-resolution ceramic images.
Recent breakthroughs in deep learning, particularly Convolutional Neural Networks (CNN) and their extensions in object detection, have made automated ceramic classification feasible. YOLOv11 (You Only Look Once, version 11) is one of the most advanced real-time object detection models and is capable of simultaneously detecting multiple ceramic attributes, such as decorative patterns, object shapes, and production techniques, within a single image. Compared to traditional CNN-based classifiers, YOLO streamlines the object detection process by integrating object localization and classification into a single forward pass, thereby reducing inference time while maintaining high accuracy. The core advantages of YOLO in ceramic classification tasks include: (1) Real-time detection. YOLO possesses end-to-end object detection capabilities, making it well-suited for large-scale ceramic classification applications, such as museum digitization, online antique authentication, and automated valuation systems. (2) Multi-object recognition. Ceramic artifacts often feature multiple decorative elements; for example, a single artifact may include floral motifs and geometric engravings. YOLO can simultaneously detect multiple categories, enhancing classification interpretability and robustness. (3) Efficient inference. YOLOv11 performs detection and classification in a single pass over each image, offering greater computational efficiency than two-stage detection models such as Faster R-CNN and Mask R-CNN. This makes YOLO suitable for deployment on edge devices and mobile platforms.
The improved YOLOv11 architecture consists of three primary components, the Backbone, Neck, and Head, each playing a critical role in the multi-object recognition process, as shown in Fig. 5.

This figure illustrates the structural design of the improved YOLOv11 framework used for ceramic classification. The architecture integrates a ResNet backbone, multiple feature enhancement modules, and optimized detection heads tailored for fine-grained ceramic attributes such as patterns, shapes, and craftsmanship styles. The entire figure was originally created by the authors.
To further optimize feature extraction efficiency and detection accuracy, this study integrates C3k2-EIEM (CSP with k2 convolution and Edge-Information Enhanced Module) into the YOLOv11 backbone; the module is detailed in Supplementary Note 1. This enhancement improves edge detection, spatial feature retention, and overall classification performance.
The improved YOLOv11 employs ResNet50 as the backbone network to extract both low-level and high-level features from ceramic images. To improve feature representation and computational efficiency, this study introduces the following architectural enhancements:
(1) C3k2-EIEM module: this module replaces conventional CSP (Cross-Stage Partial) blocks at the P3, P4, and P5 feature levels, enhancing the detection of fine-grained decorative details, such as engravings and inscriptions, by explicitly capturing edge information and preserving spatial details. This module consists of three key components: Edge Information Learning (SobelConv Branch), which integrates Sobel filters to refine edge detection, improving the recognition of object contours and decorative engravings; Spatial Feature Preservation (Conv Branch), which maintains high-resolution spatial details to ensure robust classification of intricate ceramic patterns; and a Feature Fusion Strategy, which combines edge-based and spatial-based features, resulting in a more comprehensive and discriminative object representation.
(2) SPPF (Spatial Pyramid Pooling Fast): by performing multi-scale pooling, this module extracts ceramic object features at different scales, improving the detection of ceramics with varying sizes, such as plates, jars, and bowls.
(3) C2PSA (Cross-Stage Partial Attention Mechanism): this module integrates channel attention and spatial attention to enhance the model’s sensitivity to intricate decorative patterns, such as underglaze painting and hollow carvings. By adaptively adjusting weights, the model focuses more effectively on critical decorative regions, reducing background interference.
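To make the edge-information idea in item (1) concrete, the sketch below applies the two fixed 3 × 3 Sobel kernels to a small grayscale grid and returns the gradient magnitude. It illustrates only the Sobel operator itself, in plain Python; it is not the C3k2-EIEM module, whose layer configuration and fusion rule are given in Supplementary Note 1, and the function names and test image are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def correlate3x3(image, kernel, r, c):
    """Weighted sum of the 3 x 3 neighbourhood centred on pixel (r, c)."""
    return sum(kernel[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = correlate3x3(image, SOBEL_X, r, c)
            gy = correlate3x3(image, SOBEL_Y, r, c)
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# Toy 5 x 5 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(image)
for row in edges:
    print(row)
```

The response is large along the step and zero in the flat region, which is the kind of contour signal an edge branch can contribute alongside a learned convolutional branch.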
The Neck component is responsible for aggregating feature information from different levels and further optimizing the fusion of deep and shallow features. The improved YOLOv11 integrates a Feature Pyramid Network (FPN) and Path Aggregation Network (PAN) in the Neck module. By combining the top-down feature propagation of FPN with the bottom-up feature enhancement of PAN, the model improves its ability to detect multi-scale ceramic objects. This ensures that the model maintains high precision when simultaneously recognizing large-scale objects (e.g., overall ceramic shapes) and small-scale decorative details (e.g., patterns and inscriptions).
The detection head (Head) of the improved YOLOv11 consists of three parallel output branches, each of which has been optimized to simultaneously detect large ceramic objects (e.g., vases, bowls, plates) as well as fine-grained features (e.g., floral patterns, geometric engravings, calligraphic inscriptions) while suppressing background noise (reducing false positives and improving bounding box localization). Optimization strategies include improved convolutional layers (Conv, k = 3, s = 2) to enhance bounding box prediction accuracy and reduce detection errors, as well as the C3k2 detail enhancement module to strengthen the model’s classification capability for decorative elements and improve its performance in complex backgrounds.
For the training of the improved YOLOv11 model, this study adopted its framework with architectural modifications tailored to the structural characteristics of ceramic artifacts. Specifically, the conventional Cross Stage Partial (CSP) blocks at feature pyramid levels P3, P4, and P5 were replaced with the C3k2-EIEM module. This module integrates edge enhancement, inter-scale feature fusion, and efficient spatial encoding, thereby improving the model’s ability to capture fine-grained decorative patterns and subtle craftsmanship traits. The loss function followed the standard YOLO composition, incorporating cross-entropy loss for multi-class classification, generalized IoU loss for bounding box regression, and quality focal loss (QFL) to address class imbalance and emphasize difficult samples. These components were combined using fixed weighting, ensuring a balanced optimization of both classification and localization objectives.
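The bounding-box term of the loss can be illustrated with a short generalized IoU computation. This is a sketch under stated assumptions: boxes are axis-aligned `(x1, y1, x2, y2)` tuples with positive area, and the weights in `total_loss` are placeholders, since the paper states that fixed weighting was used but does not report the values.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; empty boxes have area 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def giou(a, b):
    """Generalized IoU of two positive-area boxes, in the range (-1, 1]."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = box_area((min(a[0], b[0]), min(a[1], b[1]),
                     max(a[2], b[2]), max(a[3], b[3])))
    if union <= 0.0 or hull <= 0.0:
        raise ValueError("boxes must have positive area")
    return inter / union - (hull - union) / hull


def giou_loss(pred, target):
    return 1.0 - giou(pred, target)


def total_loss(cls_loss, box_loss, qfl_loss, w_cls=1.0, w_box=1.0, w_qfl=1.0):
    """Fixed-weight sum of the three terms; the weights here are placeholders."""
    return w_cls * cls_loss + w_box * box_loss + w_qfl * qfl_loss


print(giou((0, 0, 2, 2), (0, 0, 2, 2)))   # identical boxes
print(giou((0, 0, 2, 2), (1, 1, 3, 3)))   # partial overlap
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))   # disjoint boxes give a negative value
```

Unlike plain IoU, the generalized form still changes when boxes do not overlap, so it provides a training signal for predictions that miss the target entirely.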
To enhance generalization and model robustness, advanced data augmentation strategies were employed during training. These included Mosaic augmentation (merging four images to increase contextual diversity), MixUp augmentation (blending two images to produce soft-labeled samples), and GridMask augmentation (applying structured occlusions to encourage feature robustness under partial visibility). The optimal training configuration consisted of a batch size of 16 and an initial learning rate of 1e−4 with a scheduled decay. A total of 200 training epochs were conducted using the Adam optimizer, with momentum set at 0.9 to stabilize gradient updates and reduce training oscillations.
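Of the three augmentations, MixUp is the simplest to show in a few lines. The sketch below blends two equally sized images and their one-hot labels with a mixing weight `lam`. Drawing `lam` from a Beta(α, α) distribution is the usual MixUp convention; the value `alpha=0.2` is an assumption, as the paper does not report it, and the nested-list image format is purely illustrative.

```python
import random


def sample_lambda(alpha=0.2, rng=random):
    """Mixing weight drawn from Beta(alpha, alpha); alpha is an assumed value."""
    return rng.betavariate(alpha, alpha)


def mixup(image_a, label_a, image_b, label_b, lam):
    """Blend two equally sized images and their one-hot labels with weight lam."""
    mixed_image = [[lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
                   for row_a, row_b in zip(image_a, image_b)]
    mixed_label = [lam * ya + (1.0 - lam) * yb
                   for ya, yb in zip(label_a, label_b)]
    return mixed_image, mixed_label


# Toy 1 x 2 "images" with one-hot labels for two classes.
image, label = mixup([[1.0, 0.0]], [1, 0], [[0.0, 1.0]], [0, 1], lam=0.25)
print(image, label)
```

The resulting soft label keeps its entries summing to one, which is what lets the classification loss treat the blended sample as partly belonging to each source class.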
The evaluation focused on measuring classification accuracy, detection precision, and model robustness across ceramic craftsmanship styles, shapes, and decorative patterns. Additionally, ablation studies were conducted to analyze the interpretability of the model and its alignment with price prediction, particularly by examining how attention-enhanced modules contribute to the identification of high-value features in ceramic artifacts.
(1) Classification metrics (pattern, shape, craftsmanship style): to evaluate the categorization of ceramic attributes, the following metrics were used:
Accuracy measures the overall classification performance, reflecting the proportion of correctly classified ceramic attributes.
$$\mathrm{Accuracy}=\frac{\text{True Positives}\,(TP)+\text{True Negatives}\,(TN)}{\text{Total Number of Samples}} \tag{1}$$
Precision measures the proportion of correctly classified ceramic features among all predicted instances.
$$\mathrm{Precision}=\frac{\text{True Positives}\,(TP)}{\text{True Positives}\,(TP)+\text{False Positives}\,(FP)} \tag{2}$$
Recall evaluates the model’s ability to correctly retrieve all relevant ceramic attributes.
$$\mathrm{Recall}=\frac{\text{True Positives}\,(TP)}{\text{True Positives}\,(TP)+\text{False Negatives}\,(FN)} \tag{3}$$
The F1-Score is used as a balanced metric combining precision and recall, particularly suited for imbalanced ceramic categories. In this study, the Best-F1 score refers to the highest F1 value achieved across varying confidence thresholds (e.g., 0.0 to 1.0), with the optimal performance observed at a threshold of approximately 0.34.
$$\mathrm{F1\text{-}Score}=2\cdot \frac{\mathrm{Precision}\times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}} \tag{4}$$
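The definitions in Eqs. (2) to (4) and the Best-F1 threshold sweep can be sketched as follows. This is a simplified binary illustration with invented scores and labels; the study evaluates multi-class detections, and the helper names are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_f1(scores, labels, thresholds):
    """Sweep confidence thresholds and return (best F1, first threshold reaching it)."""
    best = (0.0, None)
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = precision_recall_f1(tp, fp, fn)[2]
        if f1 > best[0]:
            best = (f1, t)
    return best


# Invented confidence scores and ground-truth labels (1 = positive).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 0]
thresholds = [i / 100 for i in range(101)]
best = best_f1(scores, labels, thresholds)
print(precision_recall_f1(tp=8, fp=2, fn=4))
print(best)
```

Low thresholds admit false positives and high thresholds drop true positives, so F1 typically peaks at an intermediate confidence value, as with the threshold of about 0.34 reported in the text.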
(2) Object detection metrics (bounding box evaluation): since YOLOv11 performs both classification and object localization, bounding box precision is evaluated using Mean Average Precision at IoU 0.5 (mAP@50). This metric measures the average precision when predicted bounding boxes have at least 50% Intersection over Union (IoU) with the ground truth, reflecting the model’s ability to correctly detect and classify ceramic elements under a moderate localization threshold. Higher mAP@50 scores indicate better accuracy in identifying and localizing decorative patterns such as floral, geometric, or calligraphic motifs.
(3) Ablation studies and robustness analysis: we conducted a series of ablation studies to assess the individual and combined effects of attention mechanisms on the model’s interpretability and price prediction accuracy. The model’s classification performance was measured using accuracy, precision, recall, F1-score, and AUC, while a feature importance analysis was conducted to understand the key factors that influence price estimation.
To support the interpretability goal of this research, particularly in bridging visual craftsmanship features and market valuation logic, this study integrates Gradient-weighted Class Activation Mapping (Grad-CAM) into the YOLOv11-based ceramic classification process. As a post-hoc visualization technique, Grad-CAM generates heatmaps that highlight the regions within ceramic images that most strongly activate the model’s attention during classification. In this study, Grad-CAM serves two specific purposes. The first is to verify the effectiveness of the enhanced attention modules (C2PSA, C3k2-EIEM, and SobelConv) introduced into the YOLOv11 framework. By visualizing which image regions contribute most to the detection of decorative patterns, structural elements, or glaze details, the method checks whether the model learns semantically meaningful patterns. The second is to establish an interpretability bridge between deep learning outputs and traditional expert valuation logic. The attention regions identified by Grad-CAM are cross-referenced with key features (e.g., manufacturing complexity, shape structure) used in Random Forest price prediction, to assess whether the visual focus of the model aligns with empirically important price determinants.
In the third phase, we implemented an RF-based regression framework to predict market price categories based on extracted visual features. This component aimed to bridge the visual characteristics of ceramic artifacts with their appraised monetary value.
Valuing ceramic artifacts is a complex task influenced by multiple factors, including artistic craftsmanship, historical significance, and market trends. Traditional valuation methods rely heavily on expert assessments and historical auction data, which can introduce subjectivity and inconsistencies. To address these challenges, this study employs a regression-based machine learning approach to systematically predict the collectible value of ceramics.
Although ceramic prices are inherently continuous, this study adopts a classification-based prediction approach for practical and methodological reasons. First, auction price distributions are highly skewed and heavy-tailed, with a small number of exceptionally high-value items distorting regression outputs, as shown in Supplementary Table 3. Treating price as a continuous variable under these conditions often leads to poor generalization and unstable predictions, particularly for rare samples. Second, in real-world appraisal and auction settings, ceramic values are typically communicated in discrete price brackets (e.g., “less than $10,000”, “$10,001–100,000”, “$100,001–500,000”, “$500,001–1,000,000″, and “more than $1,000,000”), rather than as precise numerical values. The selection of these five price brackets was informed by both empirical auction practice and exploratory data analysis. We surveyed historical ceramic auction catalogs from major auction houses (e.g., Sotheby’s, Christie’s, Poly Auction), where such price groupings are routinely used to segment market levels. To validate this segmentation, we conducted a quantile analysis on the training data distribution, which revealed natural inflection points aligning with these ranges. Alternative schemes, such as equal-width bins or quartile-based grouping, were also tested during pilot runs, but resulted in lower classification accuracy and higher misclassification between adjacent categories. The final five-tier segmentation thus balances real-world interpretability with statistical alignment to price distribution characteristics, supporting both communication clarity and model performance.
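The five-tier segmentation amounts to a lookup against four upper bounds, as in the sketch below. The bracket labels are copied from the text; the handling of exact boundary values is an assumption (each upper bound is treated as inclusive, so a price of exactly $10,000 falls in the lowest bracket), because the quoted ranges do not say where that value belongs.

```python
from bisect import bisect_left

# Upper bounds (USD) of the four lower brackets, treated as inclusive (assumed).
UPPER_BOUNDS = [10_000, 100_000, 500_000, 1_000_000]
BRACKET_LABELS = [
    "less than $10,000",
    "$10,001-100,000",
    "$100,001-500,000",
    "$500,001-1,000,000",
    "more than $1,000,000",
]


def price_bracket(price_usd):
    """Return the class index (0 to 4) of the bracket containing price_usd."""
    if price_usd < 0:
        raise ValueError("price must be non-negative")
    return bisect_left(UPPER_BOUNDS, price_usd)


for price in (9_999, 10_001, 250_000, 750_000, 2_000_000):
    print(price, "->", BRACKET_LABELS[price_bracket(price)])
```

The returned index can serve directly as the class label for the price classifier, while the label strings remain available for reporting.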
While this classification-based approach aligns with industry practices, it is important to acknowledge that discretizing continuous price data involves trade-offs. Specifically, converting continuous prices into discrete brackets reduces the granularity of prediction and can obscure subtle value differences between adjacent price levels. This choice was made to improve model stability and interpretability in real-world applications such as auction valuation. In addition, classification models allow for clear evaluation using confusion matrices and AUC scores, which offer intuitive insights into misclassification patterns across value levels. Future work could explore alternatives, such as ordinal regression or hybrid models that combine categorical classification with probabilistic regression, to preserve finer-grained price information while maintaining classification robustness.
To build a reliable and interpretable ceramic price prediction model, it is essential to identify and structure features that capture both the artistic attributes of artifacts and the dynamics of the auction market. This section summarizes the feature engineering process into the following four points.
(1) Categorical features: to construct an effective price prediction model for ceramic artifacts, key features influencing valuation were identified and structured into categorical and numerical factors, ensuring a comprehensive and data-driven approach to ceramic price estimation. The categorical attributes of decorative patterns, craftsmanship style, and kiln origin are crucial for capturing the artistic and historical significance of ceramics. Specifically, decorative patterns include plant motifs, geometric patterns, mythical creature designs, human figures, landscape motifs, animal patterns, linear stripes, and glaze surface decorations. Craftsmanship styles refer to the processes that impart color and decorative effects to ceramics through different glaze formulations, glazing methods, and firing techniques, including monochrome-glazed ceramics and multicolored-glazed ceramics. Additionally, kiln origins such as the Ru, Guan, Ge, Ding, and Jun kilns, as well as the Jingdezhen, Longquan, Yaozhou, and Cizhou kilns, are incorporated. These categorical features are processed using One-Hot encoding to convert them into machine-readable numerical representations.
(2) Auction price normalization and outlier handling: the auction data used in this study were collected from six major auction houses (Christie’s, Sotheby’s, Bonhams, China Guardian, Poly Auction, and Rongbaozhai) and span a temporal range from 2000 to 2024, covering over two decades of ceramic artifact transactions. All auction prices were converted and normalized to 2024 US dollars (USD) using historical exchange rates and Consumer Price Index (CPI) data published by international financial databases such as the World Bank and OECD. To ensure stable model training and reduce the effect of extreme values, auction price data were cleaned using the Interquartile Range (IQR) method, with any values lying below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR flagged as outliers and excluded. Additional duplicate removal and price format normalization were performed. After cleaning, the remaining dataset included 7812 valid price-labeled samples. Supplementary Table 3 summarizes the auction price ranges, showing a right-skewed distribution with most items valued between USD 10,001 and USD 100,000, and fewer high-end pieces exceeding USD 1 million.
(3) Quantitative features: in addition to categorical attributes, quantitative factors were incorporated to capture the physical and market-driven influences on valuation. These include three independent variables (shape, decorative motifs, and manufacturing complexity) and one dependent variable, price range. Furthermore, historical auction data from Sotheby’s, Christie’s, Poly Auction, and China Guardian were analyzed to integrate market trends, ensuring that price estimations reflect real-world demand fluctuations. To maintain numerical consistency and enhance model performance, all quantitative features were normalized using Min-Max Scaling. By systematically combining both artistic characteristics and empirical market data, the proposed model establishes a robust and interpretable framework for the valuation of ceramic artifacts.
To evaluate the contribution of different features to price prediction, we conducted an initial correlation analysis and feature importance assessment using RF’s built-in Gini importance ranking. Features with near-zero variance or strong collinearity (Pearson r > 0.9) were removed to reduce redundancy and mitigate overfitting risks. Furthermore, Principal Component Analysis (PCA) was applied to the normalized numerical feature space to improve computational efficiency. The first 10 principal components were retained, accounting for 92.7% of the total variance. This dimensionality reduction step ensured that the most informative aspects of shape complexity, glaze richness, and structural integrity were preserved, while reducing noise and irrelevant variations. Feature selection and encoding strategies were guided by both domain knowledge (e.g., auction expert feedback) and empirical analysis of model performance under different combinations of features.
(4) Data encoding: to ensure effective integration of categorical and numerical attributes into the regression-based price prediction model, a structured data encoding strategy was implemented. Categorical features, including decorative patterns, craftsmanship style, and shapes, were processed using One-Hot encoding (OHE) to transform discrete, non-numeric values into binary feature representations. This approach prevents the model from imposing ordinal relationships on inherently non-ordered attributes, ensuring that categories such as floral patterns, geometric patterns, dragon motifs, and glaze-based decorations are treated as independent variables, and allows the model to capture stylistic and historical variations without introducing artificial numerical relationships. Meanwhile, numerical attributes such as physical dimensions, integrity scores, and market-based valuation factors were normalized using Min-Max Scaling, ensuring that all numerical values were rescaled to a standardized range of [0,1]. This preprocessing step prevents scale imbalances, stabilizes model convergence, and preserves the relative influence of different valuation factors, resulting in a robust and interpretable regression model for ceramic price prediction.
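The two encoding steps in item (4) can be illustrated in a few lines of plain Python. The sketch assumes a fixed, alphabetically sorted category order and maps a constant column to 0.0; the sample values are invented and the function names are hypothetical.

```python
def one_hot(values, categories=None):
    """Return (categories, rows) where each row is a binary indicator vector."""
    categories = sorted(set(values)) if categories is None else list(categories)
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


# Invented sample: dominant pattern category and vessel height in cm.
patterns = ["plant", "animal", "plant", "geometric"]
heights_cm = [10, 20, 30, 25]

categories, encoded = one_hot(patterns)
scaled = min_max_scale(heights_cm)
print(categories)
print(encoded)
print(scaled)
```

Each category receives its own binary column, so no ordering between "plant" and "animal" is implied, while the numeric column keeps its relative spacing after rescaling.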
For price prediction, an RF classification model was employed, as shown in Fig. 6, which illustrates the architecture and workflow used for ceramic price prediction. The pipeline is structured into four key stages: data preprocessing, feature extraction, ensemble training, and prediction. The RF algorithm constructs multiple decision trees during training and, for classification tasks, outputs the mode of the classes predicted by the individual trees, which reduces overfitting and improves generalization. RF achieves this by combining bagging and random feature selection: each decision tree is trained on a bootstrap sample of the data, and at each split, only a random subset of features is considered. This ensemble strategy increases model diversity and reduces variance, which is useful for high-dimensional, mixed-type datasets such as ours, where features span both categorical and continuous domains. Moreover, RF inherently supports multiclass classification tasks and does not require feature scaling, which simplifies integration with one-hot encoded decorative attributes and numerical complexity indicators.
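The combination of bagging, random feature selection, and majority voting can be illustrated with a deliberately small sketch. It is not the model used in this study: each "tree" here is a one-split decision stump, so drawing the feature subset once per tree coincides with drawing it per split, and all names and data values are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label, i.e. the mode used for the ensemble vote."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, feature_ids):
    """Find the single-threshold split with the fewest errors among candidate features."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority class
        m = majority(y)
        return (None, None, m, m)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    if f is None:
        return left_label
    return left_label if row[f] <= t else right_label

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # bootstrap sample (with replacement)
        feats = rng.sample(range(len(X[0])), n_features)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Majority vote over the predictions of all stumps."""
    return majority([predict_stump(s, row) for s in forest])

# Hypothetical toy data: two numeric features and a price category.
X = [[1, 10], [2, 20], [3, 30], [7, 70], [8, 80], [9, 90]]
y = ["low", "low", "low", "high", "high", "high"]
forest = fit_forest(X, y)
```

Because every stump sees a different bootstrap sample and feature subset, individual stumps can disagree, and the vote averages out their individual errors.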

This figure visualizes the full workflow of the Random Forest classification model used to predict ceramic price categories. The model integrates visual-semantic features extracted from the improved YOLOv11 with historical market data, structured into a supervised ensemble learning framework. The entire figure was originally created by the authors.
It should be noted, however, that while RF offers a favorable balance between predictive performance and interpretability, it is not the only viable option for price prediction tasks. Gradient-boosted ensemble models such as XGBoost and LightGBM can provide enhanced accuracy, better handling of class imbalance, and finer control over overfitting. In this study, RF was deliberately chosen to prioritize transparency and explainability, key considerations in the context of cultural heritage valuation, where trust and interpretability are critical. Nevertheless, future research could conduct a systematic comparison of ensemble methods to determine whether performance gains from more complex models justify trade-offs in interpretability and computational cost.
The dataset was first divided into training and testing sets using an 80/20 hold-out split, with 80% of the data used for training and the remaining 20% reserved for testing. The training set consisted of both categorical features (e.g., decorative patterns, kiln origins) and quantitative variables (e.g., physical dimensions, integrity scores). Categorical features were converted into numerical representations using OHE, and quantitative features were normalized via Min-Max Scaling to maintain consistency and improve model performance.
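An 80/20 hold-out split of this kind can be sketched in a few lines. The function name, the fixed seed, and the use of integer record IDs are assumptions for illustration; the study does not state how the split was randomized.

```python
import random

def holdout_split(items, test_fraction=0.2, seed=42):
    """Randomly partition items into disjoint training and testing lists."""
    rng = random.Random(seed)
    idx = list(range(len(items)))
    rng.shuffle(idx)
    n_test = round(len(items) * test_fraction)
    test_idx = set(idx[:n_test])
    train = [items[i] for i in range(len(items)) if i not in test_idx]
    test = [items[i] for i in range(len(items)) if i in test_idx]
    return train, test

# Hypothetical record IDs standing in for the encoded ceramic samples.
records = list(range(100))
train, test = holdout_split(records)
```

Fixing the seed makes the partition reproducible, so different hyperparameter settings are compared on the same testing set.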
The RF model was trained using MATLAB’s TreeBagger function, which allows for flexible parameter tuning and efficient handling of large datasets. The training process involved iteratively adjusting the following key hyperparameters to optimize model performance:
Number of Trees (numTrees): The number of trees in the forest was varied among 10, 50, 100, and 150. Increasing the number of trees generally improves performance but also increases computational cost.
Maximum Depth (max depth): The maximum depth of each decision tree was set to 2, 5, 10, and 20, controlling the complexity of the model and preventing overfitting.
Maximum Features (max features): The number of features considered for splitting at each node was tested using two strategies: one-third of the total features and the square root of the total features, aligning with standard practices for regression and classification tasks, respectively.
Minimum Samples per Leaf (min samples leaf): This parameter was varied among 1, 5, 10, 20, 50, and 100 to control the minimum number of samples required at a leaf node. Smaller values tend to capture more intricate patterns, while larger values promote generalization.
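To make the size of this search space concrete, the grid of candidate values listed above can be enumerated as follows. This is an illustrative Python sketch rather than the MATLAB code used in the study; the dictionary keys, the helper names, and the rounding convention for the two feature-count strategies are assumptions.

```python
import itertools
import math

# Candidate values as listed in the text; the key names are hypothetical.
grid = {
    "num_trees": [10, 50, 100, 150],
    "max_depth": [2, 5, 10, 20],
    "max_features": ["third", "sqrt"],
    "min_samples_leaf": [1, 5, 10, 20, 50, 100],
}

def resolve_max_features(strategy, n_features):
    """Turn a strategy name into a feature count (rounding rule is an assumption)."""
    if strategy == "third":
        return max(1, n_features // 3)
    if strategy == "sqrt":
        return max(1, round(math.sqrt(n_features)))
    raise ValueError(f"unknown strategy: {strategy}")

def iter_configs(grid):
    """Yield every combination of hyperparameter values as a dict."""
    keys = list(grid)
    for combo in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))

configs = list(iter_configs(grid))
```

With four, four, two, and six candidate values respectively, an exhaustive search over this grid evaluates 4 × 4 × 2 × 6 = 192 configurations.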
After hyperparameter tuning via grid search, the final RF model, configured with 50 trees, a maximum depth of 50, two features per split, and a minimum of one sample per leaf, achieved a classification accuracy of 75.47% on the held-out testing set. This result represents the standalone predictive performance of the optimized model when applied to unseen ceramic artifact data.
Following model training, feature importance was assessed using the out-of-bag (OOB) permuted predictor importance provided by the TreeBagger function. This analysis quantified the contribution of each feature to the model’s predictive performance. The ranked feature importances were visualized using a bar plot, facilitating a clear interpretation of the factors influencing price predictions. By systematically tuning hyperparameters and analyzing feature contributions, the Random Forest model provided a robust and interpretable framework for ceramic price category prediction, integrating both artistic characteristics and market-driven data.
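The idea behind permuted predictor importance can be sketched as follows: shuffle one feature at a time and measure how much accuracy drops. This is a simplified illustration, not TreeBagger's procedure, which evaluates each tree on its own out-of-bag observations; here a single fixed evaluation set and a hypothetical stand-in model are used instead.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for f in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[f] for row in X]
            rng.shuffle(column)  # break the link between feature f and the labels
            X_perm = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in model that relies on feature 0 only.
def model(row):
    return "high" if row[0] > 5 else "low"

X = [[1, 3], [2, 9], [3, 1], [7, 8], [8, 2], [9, 5]]
y = ["low", "low", "low", "high", "high", "high"]
importances = permutation_importance(model, X, y)
```

Shuffling the feature the model ignores leaves accuracy unchanged, so its importance is zero, whereas shuffling the feature the model depends on lowers accuracy and yields a positive score.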
Although this study focuses on RF due to its balance of interpretability and performance, we conducted preliminary comparisons with Support Vector Machines (SVM) and logistic regression. These models yielded lower classification accuracies (SVM: 63.2%, Logistic Regression: 58.7%) and showed higher variance across folds.