Clustering protocol for blinking trajectories
The UML-PSD method is primarily applied to segregate and analyse PL blinking profiles generated by wide-field microscopy. A heterogeneous test dataset covering the major blinking characteristics (PEB, TSB, PBB) is constructed from 204 randomly chosen diffraction-limited spots of cesium lead bromide (CsPbBr3) NCs (Fig. 1a–b), drawn from several regions of interest (ROIs); each trajectory comprises 10,000 datapoints (its dimensionality). The time resolution of data acquisition is set to 30 ms for each ROI, so that each blinking trace spans a total of 300 s. We consider photobleaching and photo-brightening trajectories to originate from single nanocrystals, as it is highly unlikely that multiple nanocrystals within the same diffraction-limited spot would undergo bleaching or brightening at identical rates. The same applies to two-state blinking, which is widely recognized as the fingerprint of a single nanocrystal or molecule. Based on our experience with charge carrier dynamics and blinking analysis of NCs (and MCs)59,60, we are confident that our methodology predominantly reflects single-nanocrystal blinking phenomena. The synthesis procedure of these dodecahedron NCs, the material characterization, the preparation of thin films, and the data acquisition techniques are discussed in the Methods section. The transmission electron microscopy (TEM) data, the size distribution analysis, the X-ray diffraction (XRD) patterns, and the optical characterization of CsPbBr3 NCs through bulk absorbance, PL spectroscopy, and the time-averaged wide-field image are presented in Supplementary Fig. 9a–e.

a Schematic illustration of wide-field epifluorescence microscopy, and (b) time-averaged region of interest (ROI) from a wide-field PL movie (.tiff file) of a CsPbBr3 NC thin film, acquired with an EM-CCD (electron-multiplying charge-coupled device) camera. c MATLAB-based automated module that extracts PL blinking data from spatially isolated diffraction-limited spots on the ROI, carefully selected using the ImageJ software. d Illustration of the K-means clustering process, where the accumulated PL trajectories are clustered in ‘N’-dimensional space, based on minimum Euclidean distance. e The labeled datasets of the PL blinking trajectories, enabling comprehensive subpopulation statistics, followed by (f) cluster-wise power spectral density (PSD) analyses of the blinking data.
The segregation of unlabeled blinking patterns (Fig. 1c) follows three sequential steps: (i) estimation of the optimum number of clusters (Kopt), (ii) classification of the data, and (iii) cluster-wise PSD analysis of the labeled PL traces. For the first two objectives, we deploy the unsupervised K-means algorithm49,50,51,52,53,54 (Fig. 1d, Supplementary Note 1), using customized MATLAB (R2022a) code with inbuilt functions. K-means clustering relies on the minimum (point-to-point) Euclidean distance (Supplementary Note 2) and is deliberately chosen because the spectroscopy always generates pixel-wise blinking trajectories of consistent length, without missing data. To estimate Kopt and cluster the blinking patterns, we construct the Euclidean distance matrix from normalized PL trajectories. By scaling all blinking traces to a fixed range (0 to 1), normalization enables shape-based comparisons that would not be achievable with raw intensity data. Next, Kopt and the normalized blinking signals are fed into the K-means module, and the individual clusters are segregated. During the K-means process, each blinking trace with 10,000 observations is treated as a single point in a 10,000-dimensional (N) space (Fig. 1d, Supplementary Fig. 10). The labeled trajectories (Fig. 1e) undergo PSD analysis (Supplementary Note 3, Fig. 1f), in which the PSD distributions are fitted with a power-law function and the corresponding exponents (β) are extracted. Subsequently, cluster-wise β values are compared to infer different kinetics of PL fluctuations42,43,44,45. A schematic representation of the UML-PSD methodology for PL microscopy and blinking analysis is depicted in Fig. 1.
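The normalization and clustering steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names (`minmax_normalize`, `assign`, `kmeans`), the random initialization, and the synthetic traces are assumptions made purely for illustration. Each trace is scaled to the range 0 to 1 and then treated as one point in N-dimensional space, so that a bright and a dim emitter are compared by the shape of their trajectories and not by their raw counts.

```python
import numpy as np

def minmax_normalize(traces):
    """Scale every trace (row) to the range [0, 1]."""
    traces = np.asarray(traces, dtype=float)
    lo = traces.min(axis=1, keepdims=True)
    hi = traces.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against flat traces
    return (traces - lo) / span

def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per trace."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd iteration; the random initialization is an assumption."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(points, centroids), centroids

# Synthetic stand-ins for blinking traces: bright decaying vs. dim rising.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
decaying = 100 + 500 * (1 - t) + rng.normal(0, 10, (5, t.size))
rising = 20 + 50 * t + rng.normal(0, 1, (5, t.size))
traces = minmax_normalize(np.vstack([decaying, rising]))
labels, centroids = kmeans(traces, k=2)
```

In this toy case the two groups differ tenfold in raw intensity, yet after normalization they are separated by the direction of their trend alone, which is the shape-based comparison the text refers to.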
The optimum number of clusters (Kopt)
For this trial dataset, we execute the ‘Visual & Logical’ (V&L) module (see Methods) to estimate the Kopt value. It calculates statistical metrics, like mean Silhouette score (
Segregation of PL trajectories
Based on the
Further, we calculate the percentage of NC emitters (%EM) at different SS levels to evaluate the extent of overlap between blinking datasets. Average %EM (<%EM>) profiles are constructed for SS < 0 up to SS < 0.6, over 100 iterations, for trial clusters N = 2 to 15 (Supplementary Fig. 16). We find a sharp change in the slope of the <%EM> profiles at cluster 3 for SS < 0.3 and above, which is absent for the profiles corresponding to SS < 0.2 and SS < 0.1. Therefore, the span of intersection may roughly be estimated to lie between 0.2 and 0.3 in the Silhouette space, and elements outside the intersection (SS > 0.3) are likely well segregated among the three blinking categories. Although we refrain at this point from concluding an exact extent of intersection, a sharp change in the slope of the <%EM> profiles can serve as an indicator, providing scope to automate the decision in the future. Altogether, our observations indicate that the <%Mscl> and K-means clustering are often influenced by similarities between high-dimensional PL trajectories. The V&L module took approximately 550 s, 970 s, and 190 s to perform the calculations related to the SS, CH index, and SSD profiles, respectively, for the raw blinking dataset, considering 14 trial clusters and 100 iterations of the code.
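As a rough illustration of the %EM calculation, the following Python sketch computes per-emitter Silhouette scores from Euclidean distances and then the percentage of emitters falling below a set of SS thresholds. Reading %EM as "percentage of emitters with SS below the threshold" is our interpretation of the text; the function names and the toy two-dimensional points are hypothetical and stand in for the high-dimensional trajectories.

```python
import numpy as np

def silhouette_scores(points, labels):
    """Per-point Silhouette score s = (b - a) / max(a, b), Euclidean distance."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise distances via the Gram matrix (avoids an N x N x D array).
    sq = (points ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    dist = np.sqrt(np.maximum(d2, 0.0))
    np.fill_diagonal(dist, 0.0)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # convention: a singleton cluster scores 0
        a = dist[i, same].sum() / (same.sum() - 1)  # mean intra-cluster distance
        b = min(dist[i, labels == c].mean()
                for c in np.unique(labels) if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores

def percent_below(scores, thresholds):
    """Percentage of emitters whose score lies below each threshold."""
    return {th: 100.0 * float(np.mean(scores < th)) for th in thresholds}

# Two tight groups plus one ambiguous point placed between them.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10],
                   [5.5, 5.5]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1, 0])
scores = silhouette_scores(points, labels)
pct = percent_below(scores, [0.0, 0.3, 0.6])
```

Here only the in-between point receives a low (negative) score, so it is the only contributor to every threshold bin; in a real ensemble, the way these percentages grow with the threshold is what would trace out the intersection region discussed above.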
Effect of data processing on clustering efficiency
To mitigate the ‘curse of dimensionality’ for long-time PL blinking data and to shorten the estimation time of Kopt, we preprocess (smooth) the blinking traces. We perform noise-flattening using the discrete wavelet transformation (DWT)71,72,73,74 (Supplementary Note 7) and simple data-binning75,76 techniques. While DWT-based smoothing retains the actual length of the blinking profiles, the binning method reduces both the dimensionality and the randomness of the data. An optimum level of data processing is estimated (vide infra) without sacrificing essential blinking features, thereby accentuating the underlying trend component. The processed PL trajectories are then normalized to compare the patterns, creating a secondary input dataset. Next, Kopt is estimated for this secondary dataset, and K-means clustering is subsequently executed. The resulting cluster indexes guide the retrieval and segregation of the original blinking trajectories, followed by cluster-wise PSD analysis of the raw data.
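The binning step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the helper name `bin_traces` is hypothetical, and averaging within each window (as opposed to summing) is our choice, since the text does not specify which is used.

```python
import numpy as np

def bin_traces(traces, window):
    """Average consecutive frames in non-overlapping windows of `window` frames.

    Works on a single trace (1D) or a stack of traces (2D, one per row);
    trailing frames that do not fill a complete window are dropped.
    """
    traces = np.asarray(traces, dtype=float)
    n_bins = traces.shape[-1] // window
    trimmed = traces[..., : n_bins * window]
    return trimmed.reshape(*traces.shape[:-1], n_bins, window).mean(axis=-1)

# A 12-point ramp binned in windows of 4 frames.
small = bin_traces(np.arange(12), 4)

# A 10,000-frame trace (30 ms per frame) binned with W = 100 frames (3 s bins).
rng = np.random.default_rng(0)
raw = rng.poisson(200, size=10_000)
binned = bin_traces(raw, 100)
```

With W = 100, a 10,000-point trace is reduced to 100 points, so both the distance calculations and the storage shrink by the same factor of 100, while the averaging suppresses frame-to-frame noise.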
As technically the DWT at wavelet-level 1 (WL1) returns the pristine signals, we use WL3 to WL7 to smooth the blinking data and recalculate the
To further decrease the estimation time of Kopt, which is crucial to segregate the microscopy signals in real-time, we reduce the dimensionality of PL traces by data-binning without compromising the essential blinking signatures. Binning windows of W = 10, 20, 25, 40, 50, 80, 100, 125, 200, 250, 400 and 500 frames (equivalent to W×30 ms time-bins) are applied to the original blinking profiles and the ideal scenario is estimated relying on the

a–d The
Cluster-wise power spectral density analysis
After clustering the binned PL trajectories, the algorithm retrieves the original (raw) blinking traces (10,000 dimensionality, 300 s) based on their cluster index and performs PSD analysis cluster by cluster. For each cluster of trajectories (Supplementary Fig. 26a–c), we find that the distribution of PSDs exhibits a power-law dependence over three orders of magnitude in both the frequency and PSD domains. The power-law dependence of the PSDs can be explained within the framework of the MRC model42,43, suggesting that the rate of the switching cycle between active and passive modes of NR traps spans at least three orders of magnitude. This is likely because different types of NR traps are transiently operational in individual nanocrystals.
The exponent (β) of the power law carries information on the frequencies of PL intensity oscillation and their contribution to the blinking trajectory, which can be attributed to the crystal defects and the nature of the traps44,45. For instance, blinking traces with major contributions from high-amplitude intensity oscillations have a higher β value, presumably due to long-lived deep NR traps associated with the MRCs. In contrast, PL fluctuations resembling a ‘shot noise’ characteristic can significantly reduce the β value, which can be attributed to fast switching between short-lived, presumably shallow, trap states45. The complete diagram of the UML-PSD workflow is illustrated in Fig. 3a–d. A considerable number of PEB and TSB traces exhibit an exponent value greater than 1, and their means (<β>) are equivalent (Supplementary Fig. 26d), which indicates similarity in their PL fluctuations (Supplementary Fig. 12). This aligns with the conjecture of intersecting PEB and TSB datasets, as derived from the SS-based calculations. Active deep traps can be anticipated in both cases, which is most evident in PEB trajectories after prolonged photoexcitation. A considerable number of PBB trajectories show β values much less than 1, implying fast fluctuation of MRCs, likely associated with shallow traps, below the temporal resolution (30 ms) of blinking data acquisition.
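To make the meaning of β concrete, the following Python sketch estimates the exponent of a single trace by fitting a straight line to its periodogram on log-log axes, i.e. PSD(f) ∝ 1/f^β. It is an illustrative simplification and not the authors' procedure (which is given in Supplementary Note 3): the function name `psd_exponent`, the use of a raw, unaveraged periodogram, and the unweighted least-squares fit are all assumptions.

```python
import numpy as np

def psd_exponent(trace, dt):
    """Estimate beta in PSD(f) ~ 1 / f**beta from a log-log linear fit."""
    trace = np.asarray(trace, dtype=float)
    trace = trace - trace.mean()
    freqs = np.fft.rfftfreq(trace.size, d=dt)[1:]  # skip the zero frequency
    power = np.abs(np.fft.rfft(trace))[1:] ** 2 * dt / trace.size
    keep = power > 0  # avoid log10(0)
    slope, _ = np.polyfit(np.log10(freqs[keep]), np.log10(power[keep]), 1)
    return -slope

# Two synthetic limits sampled like the blinking data (10,000 frames, 30 ms).
rng = np.random.default_rng(0)
steps = rng.normal(size=10_000)
beta_white = psd_exponent(steps, dt=0.03)            # uncorrelated noise
beta_walk = psd_exponent(np.cumsum(steps), dt=0.03)  # slow, large excursions
```

Uncorrelated, shot-noise-like fluctuations should give β near 0, whereas a random walk, whose power is concentrated in slow, high-amplitude excursions, should give a clearly larger exponent of order 2. This mirrors the qualitative reading of low and high β values given in the text.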

a The wide-field PL images of the ROI, and automated (or manual) extraction and accumulation of blinking trajectories from spatially localized, diffraction-limited emitters. b The subsequent V&L interface optimizes the level of data pre-processing (here, binning) based on selected statistical metrics and determines the Kopt for the ensemble. c Depiction of the UML (K-means) module, into which the processed (binned) data and the estimated Kopt are fed and where the individual clusters are constructed. d The subsequent PSD module retrieves the original blinking trajectories according to the cluster indexes and performs cluster-wise PSD analysis on the raw data. The PSD distributions (with standard errors) are fitted with the power-law model to extract the exponent (β) values. The PSD distributions of three representative blinking traces from each cluster, along with the corresponding β values, are shown in the subpanels.
Challenges with multi-state blinking trajectories
Besides typical blinking signatures, multi-state blinking profiles (MSB, Supplementary Fig. 9e) are often evident for nanocrystals, even though we carefully discarded the clusters (or aggregates) from the analysis (see Methods, Data extraction). Importantly, it is difficult to determine whether MSB characteristics originate from a single nanocrystal or from multiple nanocrystals present within the diffraction limit. Resolving this warrants additional experiments, such as combined fluorescence and scanning electron microscopy (Fluo-SEM) or antibunching experiments, which are time-consuming to perform for a large ensemble of nanocrystals. It is therefore necessary to segregate MSB trajectories, as these are complex to analyze and interpret in terms of charge carrier dynamics. We find that clustering of the blinking features becomes challenging when multi-state blinking profiles of identical dimensionality (10,000 datapoints) are deliberately introduced into the existing ensemble of PBB, TSB, and PEB trajectories. For this new dataset of 235 PL traces (unprocessed), the
Here, we attempt to distinguish MSB patterns from the mainstream PL traces based on the β values of the power-law-fitted PSD distributions. As MSB often entails multiple intensity levels with defined amplitudes, it is likely that a few deep trap states contribute to these fluctuations. Consequently, a higher β value is anticipated according to the MRC model42,43,44,45. We find β = 1.24 for the representative MSB trajectory (Supplementary Fig. 37a,b), and the <β> for this sub-category is usually higher than for the others. This provides an opportunity to differentiate the outlier MSB trajectories to some extent, which warrants further investigation. In this context, supervised machine-learning-based classification of blinking signals, instead of clustering, could be a potential alternative. However, it requires a substantial number of training datasets comprising various blinking patterns, which are yet to be constructed using the existing domain knowledge.
Assessment of synchronously blinking crystal grains
The unsupervised clustering algorithm can further be applied to identify crystal grains within the assembly (thin film) that exhibit spatially synchronous PL blinking, which serves as a technique to measure the extent of communication between photoexcited distal charge carriers16,17,18,55,56,57. This communication relates to the mobility and diffusion length of the carriers, which can extend up to several microns for MHP systems18,60. Therefore, spatial synchronicity of PL blinking (or the lack of it) can unravel the (dis)similarity between grains in terms of (in)homogeneous crystal orientation, energy landscape (traps), and the nature of grain boundaries. Typically, the spatial synchronicity of blinking is quantified by Pearson’s correlation coefficient (PCC, Supplementary Note 8) between the PL trajectories of individual pixels as a function of distance. The analysis generates a correlation map of the assembly in which each pixel displays the corresponding PCC value18,77,78. Therefore, higher pixel values throughout distal nanodomains indicate a longer diffusion length of carriers, either in the presence of fewer NR traps or of permeable grain boundaries, suggesting excellent optoelectronic quality of the material (thin film). Recently, spatially correlated blinking has been utilized as a tool to investigate the carrier mobility within an assembly of Cl-treated MAPbI3 crystals and thin films (Supplementary Fig. 38a)16. The corresponding correlation maps (Supplementary Fig. 38b) revealed nanodomains with comparable (or contrasting) pixel-wise PCC values, implying spatially diverse photophysics.
Nonetheless, PCC depends on two factors: (i) the trend (gradient) and (ii) the noise (fluctuation) of the signal. Therefore, two crystals undergoing different PL fluctuations but sharing a similar intensity trend (PEB or PBB) can still show a significant correlation (Supplementary Fig. 39). Moreover, temporal similarity in PL trajectories can also lead to high intercorrelation, as observed for the representative crystals 1 and 2 (PCC = 0.83) in Supplementary Fig. 38c,d. Hence, the pixel-wise PCCs become comparable, and the grains become challenging to distinguish in the correlation map (Supplementary Fig. 38b). In such cases, two adjacent crystal grains can be misclassified as a single grain. This could bias the information on spatially (in)homogeneous charge carrier dynamics, indicating a limitation and a lack of resolution in correlation (covariance)-based deterministic analyses.
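The dependence of PCC on the trend can be demonstrated with a small synthetic example in Python. The traces below are invented for illustration (an exponentially decaying trend with independent Gaussian fluctuations) and do not reproduce the data of Supplementary Fig. 39; subtracting the known trend is possible only because the example is synthetic.

```python
import numpy as np

def pearson(x, y):
    """Pearson's correlation coefficient between two traces."""
    return float(np.corrcoef(x, y)[0, 1])

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 3000)
trend = 1000.0 * np.exp(-3.0 * t)  # shared decaying intensity trend

# Two emitters with the same trend but statistically independent fluctuations.
trace_a = trend + rng.normal(0, 30, t.size)
trace_b = trend + rng.normal(0, 30, t.size)

pcc_raw = pearson(trace_a, trace_b)                        # trend dominates
pcc_detrended = pearson(trace_a - trend, trace_b - trend)  # fluctuations only
```

The raw traces are strongly correlated even though their fluctuations are unrelated, whereas the detrended traces show almost no correlation. A high pixel-wise PCC therefore does not, by itself, establish that two pixels share the same fluctuation dynamics.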
Here, the K-means algorithm can be useful to spatially cluster the pixels depending on minimum Euclidean (point-to-point) distance between corresponding PL blinking trajectories. The UML is applied on the crystal assembly shown in the Supplementary Fig. 38a16, where a time-averaged image (Fig. 4a) of the ROI is initially constructed. Next, the relevant pixels are selected above an intensity threshold (noise count, 200 cts) and background pixels are set to zero. The PL trajectories (6000 dimensionality) of selected pixels are extracted from the blinking movie (.tiff file) and the input dataset of PL trajectories is structured. This treatment reduces the participation of unwanted pixels and background noise during the clustering process. The V&L interface bins the input trajectories by W = 10, 20, 25, 30, 40, 48, 50, 60, 75, 80 and 100 frames (equivalent to W×100 ms time-bins), and determines the optimum binning window as well as the Kopt for the ensemble. This estimation is performed relying on the

a The time-averaged image of a PL movie (.tiff), recorded for three self-assembled MAPbI3 crystals, with background pixels set to zero. b Illustration of the V&L interface that extracts and normalizes the unlabeled blinking traces corresponding to relevant pixels. Calculation of the
Next, the original blinking trajectories are retrieved based on the cluster indexes of the pixels (Fig. 4d), and the PSD distributions of the pixel-wise original PL trajectories are constructed. The power-law fitting parameters for each pixel are used to generate a β-map of the entire assembly. Subsequently, the <β> value for each clustered domain is calculated. Together, the cluster-map and β-map of the crystal assembly offer valuable insights into charge carrier communication, grain boundaries, and the spatially varying nature of the NR traps. A decrease in pixel-wise β values for the C3 domain suggests a relatively lower contribution of high-amplitude PL fluctuations (Fig. 4d), implying a lack of active deep NR traps in C3, in contrast to the C1 and C2 domains. The entire UML-PSD process, including data extraction, optimization of the binning window, determination of Kopt, K-means clustering, PSD analysis, and reconstruction of the cluster-map and β-map, typically takes a few minutes for a wide-field PL movie of dimension ~3 × 3 µm2, comprising 400 pixels and 6000 dimensionality of blinking data per pixel.
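The bookkeeping between the movie, the selected pixels, and the reconstructed maps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the toy 4 × 4 movie, and the per-pixel labels are hypothetical, and the labels are written by hand here where the actual workflow would take them from K-means (the same helper would equally place per-pixel β values to form a β-map). Unselected pixels are filled with NaN in this sketch so that they cannot be confused with a cluster index of 0.

```python
import numpy as np

def extract_pixel_traces(movie, threshold):
    """Select pixels whose time-averaged intensity exceeds `threshold`.

    movie: array of shape (frames, rows, cols).
    Returns (traces, mask), with traces of shape (n_selected_pixels, frames).
    """
    movie = np.asarray(movie, dtype=float)
    mask = movie.mean(axis=0) > threshold
    traces = movie[:, mask].T  # pixels in row-major order of the mask
    return traces, mask

def values_to_map(values, mask, fill=np.nan):
    """Place one value per selected pixel back onto the image grid."""
    out = np.full(mask.shape, fill, dtype=float)
    out[mask] = values
    return out

# Toy movie: a bright 2 x 2 emitter region on a dim background.
rng = np.random.default_rng(0)
movie = rng.normal(50.0, 5.0, size=(60, 4, 4))
movie[:, 1:3, 1:3] += 950.0

traces, mask = extract_pixel_traces(movie, threshold=200.0)
pixel_labels = np.array([0, 0, 1, 1])  # hypothetical cluster index per pixel
cluster_map = values_to_map(pixel_labels, mask)
```

Because the boolean mask is used both to extract the traces and to write the values back, the ordering of pixels is preserved automatically, and any per-pixel quantity computed from `traces` can be mapped onto the image in the same way.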
