Machine learning for microscopy data analytics targeting real-time optical characterization of semiconductor nanocrystals

Clustering protocol for blinking trajectories

The UML-PSD method is primarily applied to segregate and analyse PL blinking profiles generated by wide-field microscopy. A mixed test dataset covering the major blinking characteristics (PEB, TSB, PBB) is constructed from traces of 10,000 datapoints each (dimensionality), acquired from 204 randomly chosen diffraction-limited spots of cesium lead bromide (CsPbBr3) NCs (Fig. 1a–b) across various regions of interest (ROIs). The time resolution of the data acquisition is set to 30 ms for each ROI, giving blinking data that span 300 s in total (10,000 frames × 30 ms). We consider photobleaching and photo-brightening trajectories to originate from single nanocrystals, as it is highly unlikely that multiple nanocrystals within the same diffraction-limited spot would undergo bleaching or brightening at identical rates. The same applies to two-state blinking, which is widely recognized as the fingerprint of a single nanocrystal or molecule. Based on our expertise in charge carrier dynamics and blinking analysis of NCs (and MCs)59,60, we are confident that our methodology predominantly reflects single-nanocrystal blinking phenomena. The synthesis procedure of these dodecahedron NCs, the material characterization, the preparation of thin-films, and the data acquisition techniques are discussed in the Methods section. The transmission electron microscopy (TEM) data, the size distribution analysis, the X-ray diffraction (XRD) patterns, and the optical characterization of CsPbBr3 NCs through bulk absorbance, PL spectroscopy, and the time-averaged wide-field image are presented in Supplementary Fig. 9a–e.

Fig. 1: Schematic of the clustering strategy envisioned for PL blinking analyses.

a Schematic illustration of wide-field epifluorescence microscopy, and (b) time-averaged region of interest (ROI) from a wide-field PL movie (.tiff file) of a CsPbBr3 NC thin-film, acquired by an EM-CCD (electron-multiplying charge-coupled device) camera. c MATLAB-based automated module that extracts PL blinking data from spatially isolated diffraction-limited spots on the ROI, carefully chosen using the ImageJ software. d Illustration of the K-means clustering process, where the accumulated PL trajectories are clustered in ‘N’-dimensional space, based on minimum Euclidean distance. e The labeled datasets of the PL blinking trajectories enable comprehensive subpopulation statistics, followed by (f) cluster-wise power spectral density (PSD) analyses of the blinking data.

The segregation of unlabeled blinking patterns (Fig. 1c) follows three sequential steps: (i) estimation of the optimum number of clusters (Kopt), (ii) classification of the data, and (iii) cluster-wise PSD analysis of the labeled PL traces. For the first two objectives, we deploy the unsupervised K-means algorithm49,50,51,52,53,54 (Fig. 1d, Supplementary Note 1), using customized MATLAB (R2022a) code with inbuilt functions. K-means clustering relies on minimum (point-to-point) Euclidean distance (Supplementary Note 2) and is deliberately chosen because spectroscopy always generates pixel-wise blinking trajectories of consistent length without missing data. To estimate the Kopt and cluster the blinking patterns, we construct the Euclidean distance matrix from normalized PL trajectories. By scaling all blinking traces to a fixed range (0 to 1), normalization enables shape-based comparisons that would not be achievable using raw intensity data. Next, the Kopt and the normalized blinking signals are fed into the K-means module and the individual clusters are segregated. During the K-means process, each blinking trace with 10,000 observations is treated as a single point of 10,000 (N) dimensionality (Fig. 1d, Supplementary Fig. 10). The labeled trajectories (Fig. 1e) undergo PSD analysis (Supplementary Note 3, Fig. 1f), where the distributions of PSDs are fitted with a power-law function and the corresponding exponents (β) are extracted. Subsequently, cluster-wise β values are compared to infer different kinetics of PL fluctuations42,43,44,45. A schematic representation of the UML-PSD methodology for PL microscopy and blinking analysis is depicted in Fig. 1.
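The normalization and clustering steps described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the MATLAB code used in this work: the function names, the random centroid initialization, and the synthetic ramp-shaped traces are all hypothetical.

```python
import numpy as np

def minmax_normalize(traces):
    """Scale each trace (row) to the range 0 to 1 so that only its shape is compared."""
    traces = np.asarray(traces, dtype=float)
    lo = traces.min(axis=1, keepdims=True)
    hi = traces.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for flat traces
    return (traces - lo) / span

def nearest_centroid(points, centroids):
    """Index of the closest centroid (squared Euclidean distance) for every point."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means; each row of `points` is one N-dimensional point."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_centroid(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return nearest_centroid(points, centroids), centroids

# Synthetic example (assumption): rising and falling traces with very different intensities.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 500)
rising = [a * t + rng.normal(0.0, 0.02 * a, t.size) for a in (100, 1000, 5000)]
falling = [a * (1 - t) + rng.normal(0.0, 0.02 * a, t.size) for a in (200, 800, 3000)]
X = minmax_normalize(np.array(rising + falling))
labels, _ = kmeans(X, k=2)
print(labels)
```

Because every trace is rescaled to the same range before clustering, traces with the same shape end up close together even when their raw intensities differ by more than an order of magnitude.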

The optimum number of clusters (Kopt)

For this trial dataset, we execute the ‘Visual & Logical’ (V&L) module (see Methods) to estimate the Kopt value. It calculates statistical metrics, namely the mean Silhouette score (<SS>)61,62,63,64, the mean Calinski-Harabasz index (<CH>)65,66,67, and the mean sum of squared deviations/errors (<SSD>, with respect to the centroid)68,69,70, over 100 iterations (epochs) of the code (Supplementary Notes 4–6), for trial clusters N = 2 to 15. We find that the <SS> and <SSD> profiles consistently decrease as the number of trial clusters increases, with an elbow break emerging at cluster 3 (Supplementary Fig. 11a,c,d,f). This characteristic of the <SS> profile likely originates from the temporal similarities between some of the PEB and TSB blinking trajectories (Supplementary Note 4, Supplementary Fig. 12). The elbow break at cluster 3 reflects the Kopt, accurately indicating the existence of three blinking categories in the ensemble, while <CH> declines continuously (Supplementary Fig. 11b,e) with increasing N, with no elbow evident.
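For reference, the three metrics named above (Silhouette score, Calinski-Harabasz index, and sum of squared deviations from the cluster centroids) can be computed for a single labeling as in the sketch below. It is a minimal NumPy illustration with hypothetical function names and a toy four-point dataset; the averaging over 100 iterations and over trial clusters N = 2 to 15 is omitted, and at least two clusters are assumed.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix built from the Gram matrix (memory-friendly for long traces)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, 0.0)
    return np.sqrt(np.clip(d2, 0.0, None))

def silhouette_values(X, labels):
    """Per-point Silhouette score s = (b - a) / max(a, b); assumes at least two clusters."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = pairwise_distances(X)
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # common convention: a singleton cluster scores 0
        a = dist[i, own].sum() / (n_own - 1)  # mean distance within own cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

def ssd_and_ch(X, labels):
    """Within-cluster sum of squared deviations and the Calinski-Harabasz index."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall = X.mean(axis=0)
    clusters = np.unique(labels)
    within, between = 0.0, 0.0
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        within += ((members - centroid) ** 2).sum()
        between += len(members) * ((centroid - overall) ** 2).sum()
    k, n = len(clusters), len(X)
    ch = (between / (k - 1)) / (within / (n - k))
    return within, ch

# Toy data (assumption): two tight pairs of points that are far apart.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
good = np.array([0, 0, 1, 1])
bad = np.array([0, 1, 0, 1])
mean_ss_good = silhouette_values(X, good).mean()
mean_ss_bad = silhouette_values(X, bad).mean()
ssd_good, ch_good = ssd_and_ch(X, good)
print(mean_ss_good, mean_ss_bad, ssd_good, ch_good)
```

A sensible labeling gives a mean Silhouette score close to 1, whereas a labeling that splits the natural groups gives a negative value, which is the behaviour the V&L module exploits.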

Segregation of PL trajectories

Based on the <SS> and <SSD> metric profiles, the K-means algorithm is initiated for the normalized PL blinking dataset with Kopt = 3. However, we observe inefficient clustering with 34% misclassification (Supplementary Table 1). The misclassification is prominent for the PEB and TSB categories (Supplementary Fig. 13, clusters 2–3), implying that they are difficult to segregate because some trajectories show partially analogous PL progression. To quantify this resemblance, we incorporate another metric into the V&L module, which measures the percentage of misclassification (%Mscl) by counting the blinking traces that exhibit a negative SS value (SS < 0) within the ensemble. SS < 0 signifies that a trace has been assigned to an inappropriate cluster61. The mean %Mscl (<%Mscl>), averaged over 100 iterations, gradually rises with the number of trial clusters (Supplementary Fig. 14), with a minimum value at cluster 2 instead of cluster 3 as anticipated by the <SS> profile. The discrepancy implies that some PEB and TSB traces are apparently similar, lying in the intersection between the datasets in 10,000-dimensional space (Supplementary Fig. 15). These points likely exhibit a reduced SS value, as an increase in SS implies a larger Euclidean distance between the elements of two adjacent (nearest-neighbor) clusters (Supplementary Note 4).

Further, we calculate the percentage of NC emitters (%EM) at different SS levels to evaluate the extent of overlap between the blinking datasets. Average %EM (<%EM>) profiles are constructed for SS < 0 up to SS < 0.6, over 100 iterations, for trial clusters N = 2 to 15 (Supplementary Fig. 16). We find a sharp change in the slope of the <%EM> profiles at cluster 3 for SS < 0.3 and onwards, which is absent for the profiles corresponding to SS < 0.2 and SS < 0.1. Therefore, the span of the intersection may roughly be estimated to lie between 0.2 and 0.3 in the Silhouette space, and elements outside the intersection (SS > 0.3) are likely well segregated among the three blinking categories. At this point, although we refrain from concluding an exact extent of the intersection, a sharp change in the slope of the <%EM> profiles can serve as an indicator that provides scope to automate the decision in the future. Altogether, our observations indicate that the <%Mscl> and K-means clustering are often influenced by similarities between high-dimensional PL trajectories. The V&L module took approximately 550 s, 970 s, and 190 s to perform the calculations related to the SS, CH index, and SSD profiles (respectively) for the raw blinking dataset, considering 14 trial clusters and 100 iterations of the code.
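Both threshold-based quantities follow directly from the per-trace Silhouette scores. The sketch below is a minimal illustration; the function names and the eight example SS values are hypothetical, and the averaging over iterations and trial clusters is left out.

```python
import numpy as np

def percent_below(ss_values, threshold):
    """Percentage of traces whose Silhouette score lies below `threshold`."""
    ss_values = np.asarray(ss_values, dtype=float)
    return 100.0 * np.mean(ss_values < threshold)

def percent_misclassified(ss_values):
    """%Mscl: percentage of traces with a negative Silhouette score (SS < 0)."""
    return percent_below(ss_values, 0.0)

def emitter_profile(ss_values, thresholds=(0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6)):
    """%EM at each SS level, returned as {threshold: percentage}."""
    return {thr: percent_below(ss_values, thr) for thr in thresholds}

# Hypothetical Silhouette scores for eight traces.
ss = np.array([-0.20, 0.05, 0.25, 0.35, 0.70, 0.80, 0.90, 0.95])
mscl = percent_misclassified(ss)
profile = emitter_profile(ss)
print(mscl, profile)
```

Repeating this for every trial cluster number and averaging over iterations would give profiles analogous to <%Mscl> and <%EM>.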

Effect of data processing on clustering efficiency

To mitigate the ‘curse of dimensionality’ for long-time PL blinking data and to shorten the estimation time of the Kopt, we preprocess (smooth) the blinking traces. We perform noise-flattening using discrete wavelet transformation (DWT)71,72,73,74 (Supplementary Note 7) and simple data-binning75,76 techniques. While DWT-based smoothing retains the actual length of the blinking profiles, the binning method reduces both the dimensionality and the randomness of the data. An optimum level of data processing is estimated (vide infra) without sacrificing essential blinking features, thereby accentuating the underlying trend component. The processed PL trajectories are then normalized to compare the patterns, creating a secondary input dataset. Next, the Kopt is estimated for this secondary dataset, and K-means clustering is subsequently executed. The resulting cluster indexes guide the retrieval and segregation of the original blinking trajectories, followed by cluster-wise PSD analysis of the raw data.

As the DWT at wavelet level 1 (WL1) technically returns the pristine signals, we use WL3 to WL7 to smooth the blinking data and recalculate the <SS>, <CH>, and <SSD> metric profiles. Three representative PL intermittency trajectories (PEB, TSB, PBB), processed at different WLs, are shown in Supplementary Fig. 17. The V&L module demonstrates that the <SS> profiles decline continuously (for all WLs) with an elbow break at cluster 3 (Supplementary Fig. 18a), representing the Kopt, which disappears for WL7, possibly due to over-smoothing of the trajectories. The <SS> values gradually increase from WL3 to WL6, as compared to the raw data, indicating improved distinguishability of the clusters. The maximum <SS> at the elbow is achieved for WL6, which simultaneously produces the minimum <%Mscl> (0.32%, out of 100 iterations; Supplementary Fig. 18b). It is important to note that the minimum misclassification at cluster 3 is evident specifically for the WL-smoothed traces, indicating the importance of preprocessing for their distinguishability in the ensemble. The <CH> profiles further exhibit a prominent elbow (Supplementary Fig. 18c) at cluster 3, which is not evident for the unprocessed dataset. Once again, the maximum value of <CH> at the elbow is observed for WL6. This is in tune with the normalized <SSD> profiles, which yield the elbow at cluster 3 with a minimum for WL6 (Supplementary Fig. 18d). Hence, the V&L module suggests Kopt = 3 and WL6 as the optimum scale of data smoothing. Next, the WL6-smoothed blinking traces are subjected to the K-means clustering module, and a significant improvement in clustering efficiency (Supplementary Fig. 19) is noted, with an accuracy of 88.7% (Supplementary Table 2). The analysis times for the SS-related calculations, CH index, and SSD profiles reduce to approximately 460 s, 580 s, and 157 s (respectively), for 100 iterations and 14 trial clusters.
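To illustrate length-preserving wavelet smoothing, the sketch below keeps only the approximation part of a Haar decomposition, which for the Haar wavelet reduces to replacing each block of 2^level samples by its mean. The choice of the Haar wavelet, the edge padding, and the level convention (level 0 leaves the trace unchanged) are assumptions made for this example; the wavelet and WL numbering used in this work are described in Supplementary Note 7 and may differ.

```python
import numpy as np

def haar_smooth(trace, level):
    """Haar approximation at `level`: block means over 2**level samples, same length as input."""
    trace = np.asarray(trace, dtype=float)
    block = 2 ** level
    pad = (-len(trace)) % block  # samples needed to reach a multiple of the block size
    padded = np.pad(trace, (0, pad), mode="edge")
    means = padded.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)[: len(trace)]

# Example (assumption): a noisy two-level trace of 10,000 frames.
rng = np.random.default_rng(0)
raw = np.where(np.arange(10_000) < 5_000, 1.0, 0.2) + rng.normal(0.0, 0.1, 10_000)
smooth = haar_smooth(raw, level=6)
print(len(smooth), raw.std(), smooth.std())
```

The output has the same number of points as the input, so the dimensionality seen by K-means is unchanged while frame-to-frame noise is suppressed.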

To further decrease the estimation time of Kopt, which is crucial for segregating the microscopy signals in real time, we reduce the dimensionality of the PL traces by data-binning without compromising the essential blinking signatures. Binning windows of W = 10, 20, 25, 40, 50, 80, 100, 125, 200, 250, 400 and 500 frames (equivalent to W × 30 ms time-bins) are applied to the original blinking profiles, and the ideal scenario is estimated relying on the <SS>, <CH>, and <SSD> metric profiles. Three representative PL trajectories, binned by different frame windows, are shown in Supplementary Fig. 20. For each binned dataset, the calculated SS values are much higher than those of the pristine blinking signals. The <SS> profiles decline with increasing number of trial clusters after an elbow at cluster 3 (Fig. 2a). It is intriguing that the blinking data binned by W = 40, 50, 80, and 100 frames display the maximum <SS> value at the elbow point (Fig. 2a), which is highest for the binning window of 80 frames. This condition produces the lowest <%Mscl> (0.044 out of 100 iterations) at cluster 3 (Fig. 2b), which is even lower than that obtained by smoothing with WL6 (Supplementary Fig. 18b). Again, the minimum misclassification at cluster 3 is achieved for the binned trajectories rather than the raw traces, which directly implies the relevance of data smoothing to the distinguishability of patterns in the ensemble. In agreement, the <CH> profiles comprise an elbow with a maximum value (Fig. 2c) at cluster 3 for the datasets binned by 25, 40, 50, 80, and 100 frames, which is absent for the raw PL blinking trajectories. The <CH> value at the elbow is highest for the binning window of 80 frames. The elbow also appears in the declining <SSD> profiles (normalized) at cluster 3 (Fig. 2d), with a minimum value evident for the binning window of 80 frames. Unnormalized profiles are shown in Supplementary Fig. 21. Thereby, the metrics estimate Kopt = 3 and an optimum binning window of 80 frames. The representative blinking traces processed by WL6 and a bin window of 80 frames are shown in Supplementary Fig. 22. The V&L module records analysis times of approximately 16 s, 28 s, and 8 s for the SS, CH index, and SSD related calculations, considering 100 iterations and 14 trial clusters. This 20-to-30-fold reduction in the estimation time of Kopt, as compared to the pristine and DWT (WL6) processed PL trajectories, is particularly notable. Next, the binned (by 80 frames, 2.4 s) blinking data are introduced to the K-means module, which shows an increase in clustering efficiency (Supplementary Fig. 23) with 89.2% accuracy (Supplementary Table 3). Altogether, our results establish the advantage of data preprocessing, particularly the (time) binning approach, for efficient and rapid clustering of blinking trajectories. We emphasize that normalization of the blinking patterns is an essential step for shape-dependent clustering. Accurate clustering of unnormalized raw blinking data is challenging, as the wide variation in intensity values leads to inevitable misclassifications even after standard data pre-processing (Supplementary Figs. 24–25).
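Time binning itself is a short operation. The sketch below averages consecutive, non-overlapping windows of W frames; whether the counts are averaged or summed within a bin is an assumption here (the two differ only by a constant factor, which the subsequent 0-to-1 normalization removes), and the function name is hypothetical.

```python
import numpy as np

def bin_trace(trace, window):
    """Average consecutive, non-overlapping windows of `window` frames."""
    trace = np.asarray(trace, dtype=float)
    n_bins = len(trace) // window  # an incomplete trailing window is dropped
    return trace[: n_bins * window].reshape(n_bins, window).mean(axis=1)

frame_time = 0.030                       # 30 ms per frame, as in the acquisition above
trace = np.arange(10_000, dtype=float)   # placeholder trace of 10,000 frames
binned = bin_trace(trace, 80)
bin_time = 80 * frame_time               # duration of one bin in seconds
print(binned.shape, bin_time)
```

With W = 80, each 10,000-point trace becomes a 125-point trace, so every distance evaluation in the clustering and metric calculations handles 80 times fewer coordinates.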

Fig. 2: Estimation of the optimum number of clusters (Kopt).

a–d The <SS>, <%Mscl>, <CH>, and <SSD> metric profiles, considering 100 iterations of the code, are calculated for both the unprocessed and the binned (by frames) PL traces following data normalization. Metric profiles are calculated for the data binned by 10, 20, 25, 40, 50, 80, 100, 125, 200, 250, 400, and 500 frames and compared with those for the pristine (raw) blinking trajectories (in black). The ideal binning window is estimated at 80 frames, with the corresponding average metric profiles represented in cyan, as highlighted in each panel. The insets display a zoomed view of the average metric profiles with the Kopt indicated therein. Average metric profiles for the raw traces (in black) and the optimally binned traces (binned by 80 frames, in cyan) are shown in bold.

Cluster-wise power spectral density analysis

After clustering the binned PL trajectories, the algorithm retrieves original (raw) blinking traces (10,000 dimensionality, of 300 s) based on their cluster index and categorically performs PSD analysis. For each cluster of trajectories (Supplementary Fig. 26a–c), we find that the distribution of PSDs exhibits a power-law dependence over three orders of magnitude in both the frequency and PSD domains. The power-law dependency of the PSDs can be explained within the framework of the MRC model42,43, suggesting that the rate of switching cycle between active and passive modes of NR traps spans at least three orders of magnitude. This is likely because different types of NR traps are transiently operational in individual nanocrystals.

The exponent (β) of the power-law characteristic essentially carries information on the frequency of the PL intensity oscillations and their contribution to the blinking trajectory, which can be attributed to the crystal defects and the nature of the traps44,45. For instance, blinking traces with major contributions from high-amplitude intensity oscillations have an increased β value, presumably due to long-lived deep NR traps associated with the MRCs. In contrast, PL fluctuations resembling a ‘shot noise’ characteristic can significantly reduce the β value, which can be attributed to fast switching between short-lived, presumably shallow, trap states45. The complete diagram of the UML-PSD workflow is illustrated in Fig. 3a–d. A considerable number of PEB and TSB traces exhibit an exponent value greater than 1, and their means (<β>) are equivalent (Supplementary Fig. 26d), which indicates similarity in their PL fluctuations (Supplementary Fig. 12). This aligns with the conjecture of intersecting PEB and TSB datasets, as derived from the SS-based calculations. Active deep traps can be anticipated in both cases, which is mostly evident in PEB trajectories after prolonged photoexcitation. A considerable number of PBB trajectories show β values much less than 1, implying fast fluctuation of MRCs, likely associated with shallow traps, below the temporal resolution (30 ms) of the blinking data acquisition.
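The extraction of β from a single trace can be sketched as below: a periodogram is computed with the FFT and a straight line is fitted to log10(PSD) against log10(frequency), so that PSD ∝ 1/f^β gives β as the negative slope. This is a simplified illustration with hypothetical function names; the unweighted least-squares fit and the synthetic test trace are assumptions, and the fitting procedure actually used here is given in Supplementary Note 3.

```python
import numpy as np

def psd_exponent(trace, dt):
    """Estimate beta in PSD ~ 1/f**beta from a log-log linear fit of the periodogram."""
    trace = np.asarray(trace, dtype=float)
    trace = trace - trace.mean()                  # remove the zero-frequency offset
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    psd = np.abs(np.fft.rfft(trace)) ** 2 * dt / len(trace)
    keep = (freqs > 0) & (psd > 0)                # the log is undefined at zero
    slope, _ = np.polyfit(np.log10(freqs[keep]), np.log10(psd[keep]), 1)
    return -slope

def synth_power_law_trace(n, dt, beta, seed=0):
    """Synthetic trace (assumption) whose spectrum follows 1/f**beta with random phases."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n, d=dt)
    amplitude = np.zeros_like(freqs)
    amplitude[1:] = freqs[1:] ** (-beta / 2.0)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    phases[-1] = 0.0                              # keep the last (Nyquist) bin real for even n
    return np.fft.irfft(amplitude * np.exp(1j * phases), n=n)

trace = synth_power_law_trace(n=10_000, dt=0.030, beta=1.2)
beta_est = psd_exponent(trace, dt=0.030)
print(beta_est)  # expected to be close to 1.2 for this synthetic trace
```

For measured blinking traces the periodogram is noisy, so averaging or binning the PSD in log-frequency before fitting is a common refinement.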

Fig. 3: Sequential schematic of the UML-PSD process.

a The wide-field PL images of the ROI, and automated (or manual) extraction and accumulation of blinking trajectories from spatially localized, diffraction-limited emitters. b The subsequent V&L interface optimizes the level of data pre-processing (here binning) based on selected statistical metrics and determines the Kopt for the ensemble. c Depiction of the UML (K-means) module, into which the processed (binned) data and the estimated Kopt are fed and where the individual clusters are constructed. d The subsequent PSD module retrieves the original blinking trajectories according to the cluster indexes and performs cluster-wise PSD analysis on the raw data. The PSD distributions (with standard errors) are fitted with the power-law model to extract the exponent (β) values. The PSD distributions of three representative blinking traces from each cluster, along with the corresponding β values, are shown in the subpanels.

Challenges with multi-state blinking trajectories

Besides typical blinking signatures, multi-state blinking profiles (MSB, Supplementary Fig. 9e) are often evident for nanocrystals, even though we carefully discarded the clusters (or aggregates) from the analysis (see Methods, Data extraction). Importantly, it is difficult to determine whether MSB characteristics originate from a single nanocrystal or from multiple nanocrystals present within the diffraction limit. Resolving this requires additional experiments, such as combined Fluorescence and Scanning Electron Microscopy (Fluo-SEM) or antibunching measurements, which are time consuming to perform for a large ensemble of nanocrystals. It is therefore necessary to segregate MSB trajectories, as these are complex to analyze and interpret in terms of charge carrier dynamics. We find that clustering of the blinking features becomes challenging when multi-state blinking profiles of identical dimensionality (10,000 datapoints) are deliberately introduced into the existing ensemble of PBB, TSB, and PEB trajectories. For this new dataset of 235 PL traces (unprocessed), one of the metric profiles reveals an elbow at cluster 3 (Kopt), which is however not clear from the remaining two metric profiles or from the <%Mscl> profile (Supplementary Fig. 27). The <%EM> exhibits an abrupt rise at cluster 3 within the range SS > 0.2 to SS < 0.6 (Supplementary Fig. 28), implying the existence of three intersecting datasets. Considering Kopt = 3, the clusters of raw trajectories contain significantly misclassified blinking traces (Supplementary Fig. 29). Even upon preprocessing of the PL trajectories with WL6 and binning by 80 frames, as optimized in the earlier sections, the existence of four sub-categories of blinking remains elusive. Rather, the V&L module suggests Kopt = 3 in each case (Supplementary Figs. 30–31), and K-means yields misclassified MSB trajectories throughout the clusters (Supplementary Figs. 32–33). In fact, manually setting Kopt = 4 in the K-means algorithm fails to segregate the MSB behavior (Supplementary Figs. 34–36) for both the raw and the processed datasets. This indicates that the MSB profiles are distinct, resembling neither one another nor traditional blinking patterns, and can be considered as potent outliers.

Here, we attempt to distinguish MSB patterns from the mainstream PL traces based on the β values of the power-law fitted PSD distributions. As MSB often entails multiple intensity levels with defined amplitudes, it is likely that a few deep trap states contribute to these fluctuations. Consequently, a higher β value is anticipated according to the MRC model42,43,44,45. We find β = 1.24 for the representative MSB trajectory (Supplementary Fig. 37a,b), and the <β> for this sub-category is usually higher than that of the others. This provides an opportunity to differentiate the outlier MSB trajectories to some extent, which warrants further investigation. In this context, supervised machine-learning-based classification of blinking signals, instead of clustering, could be a potential alternative. However, it requires a substantial training dataset comprising various blinking patterns, which has yet to be constructed using the existing domain knowledge.

Assessment of synchronously blinking crystal grains

The unsupervised clustering algorithm can further be applied to identify the crystal grains within an assembly (thin-film) that exhibit spatially synchronous PL blinking, a technique to measure the extent of communication between photoexcited distal charge carriers16,17,18,55,56,57. This communication relates to the mobility and diffusion length of the carriers, which can extend up to several microns for MHP systems18,60. Therefore, spatial synchronicity of PL blinking (or the lack thereof) can unravel the (dis)similarity between grains in terms of (in)homogeneous crystal orientation, energy landscape (traps), and the nature of the grain boundaries. Typically, the spatial synchronicity of blinking is quantified by Pearson’s correlation coefficient (PCC, Supplementary Note 8) between the PL trajectories of individual pixels as a function of distance. The analysis generates a correlation map of the assembly in which each pixel displays the corresponding PCC value18,77,78. Therefore, higher pixel values throughout distal nanodomains indicate a longer diffusion length of carriers, owing either to fewer NR traps or to permeable grain boundaries, suggesting excellent optoelectronic quality of the material (thin film). Recently, spatially correlated blinking has been utilized as a tool to investigate the carrier mobility within an assembly of Cl-treated MAPbI3 crystals and thin-films (Supplementary Fig. 38a)16. The corresponding correlation maps (Supplementary Fig. 38b) revealed nanodomains with comparable (or contrasting) pixel-wise PCC values, inferring spatially diverse photo-physics.

Nonetheless, PCC depends on two factors: (i) the trend (gradient) and (ii) the noise (fluctuation) of the signal. Therefore, two crystals undergoing different PL fluctuations but a similar intensity trend (PEB or PBB) can still show a significant correlation (Supplementary Fig. 39). Moreover, temporal similarity in PL trajectories can also lead to high intercorrelation, as observed for the representative crystals 1 and 2 (PCC = 0.83) in Supplementary Fig. 38c,d. Hence, the pixel-wise PCCs become comparable, and the grains become challenging to distinguish from the correlation map (Supplementary Fig. 38b). In such cases, two adjacent crystal grains can be misclassified as a single grain. This could bias the information on spatially (in)homogeneous charge carrier dynamics, indicating a limitation and lack of resolution in correlation (covariance)-based deterministic analyses.
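The influence of a shared trend on the PCC can be reproduced with a small numerical example. In the sketch below, two synthetic traces (an assumption, not measured data) share the same linear brightening trend but carry completely independent noise; the PCC of the full traces is high, whereas the PCC of the fluctuations alone is close to zero.

```python
import numpy as np

def pearson(x, y):
    """Pearson's correlation coefficient between two equally long traces."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 6000)
trend = 1000.0 * t                                # common brightening trend
trace_a = trend + rng.normal(0.0, 50.0, t.size)   # independent fluctuations
trace_b = trend + rng.normal(0.0, 50.0, t.size)

pcc_full = pearson(trace_a, trace_b)                        # dominated by the shared trend
pcc_fluct = pearson(trace_a - trend, trace_b - trend)       # fluctuations only
print(pcc_full, pcc_fluct)
```

A high PCC between two pixels therefore does not by itself establish that their fluctuations are synchronous.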

Here, the K-means algorithm can be useful to spatially cluster the pixels according to the minimum Euclidean (point-to-point) distance between the corresponding PL blinking trajectories. The UML is applied to the crystal assembly shown in Supplementary Fig. 38a16, for which a time-averaged image (Fig. 4a) of the ROI is initially constructed. Next, the relevant pixels are selected above an intensity threshold (noise count, 200 cts) and the background pixels are set to zero. The PL trajectories (6000 dimensionality) of the selected pixels are extracted from the blinking movie (.tiff file) and structured into the input dataset of PL trajectories. This treatment reduces the participation of unwanted pixels and background noise in the clustering process. The V&L interface bins the input trajectories by W = 10, 20, 25, 30, 40, 48, 50, 60, 75, 80 and 100 frames (equivalent to W × 100 ms time-bins), and determines the optimum binning window as well as the Kopt for the ensemble. This estimation relies on the <SS> and <%Mscl> metrics (Fig. 4b), which proved beneficial in the earlier sections. The <SS> profiles show a declining trend and an elbow break at cluster 3 (Kopt) for both the original and the binned PL trajectories. The declining trend is due to the similar blinking propensity of crystals 1 and 2, as already evident from the correlation maps (Fig. S28c). The <%Mscl> profiles suggest an optimum binning window of 40 frames, the only window that shows minimum misclassification at cluster 3. Afterward, the blinking data binned by 40 frames are subjected to the K-means module and the pixel-wise blinking characteristics are clustered. The cluster-map (Fig. 4c) of pixels represents three crystal grains at their respective spatial locations. It is worth noting that clustering of the unprocessed blinking traces, considering Kopt = 3, incorporates misclassified pixels, as observed in Supplementary Fig. 40. This further highlights the relevance of data smoothing and dimensionality reduction for achieving a precise cluster-map.
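The pixel selection and the reconstruction of a cluster-map can be sketched as follows. The movie is assumed to be an array of shape (frames, height, width), the threshold is assumed to be applied to the time-averaged image, and the labels are assumed to come from any clustering of the selected trajectories; all names are hypothetical.

```python
import numpy as np

def select_pixels(movie, threshold):
    """Return the mask of pixels whose time-averaged intensity exceeds `threshold`
    and their trajectories as an array of shape (n_pixels, n_frames)."""
    mean_image = movie.mean(axis=0)
    mask = mean_image > threshold
    trajectories = movie[:, mask].T
    return mask, trajectories

def build_cluster_map(mask, labels):
    """Place cluster labels back at their pixel positions; 0 marks the background."""
    cluster_map = np.zeros(mask.shape, dtype=int)
    cluster_map[mask] = np.asarray(labels) + 1  # clusters are numbered from 1
    return cluster_map

# Toy movie (assumption): 10 frames of 4 x 4 pixels with two bright regions.
movie = np.full((10, 4, 4), 100.0)
movie[:, 0, :2] = 500.0
movie[:, 3, 2:] = 800.0

mask, trajectories = select_pixels(movie, threshold=200.0)
labels = np.array([0, 0, 1, 1])  # placeholder labels, e.g. from K-means on `trajectories`
cluster_map = build_cluster_map(mask, labels)
print(trajectories.shape)
print(cluster_map)
```

Boolean indexing returns the selected pixels in row-major order, and assigning through the same mask restores that order, so each label lands on the pixel its trajectory came from.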

Fig. 4: The UML-PSD workflow for crystal assemblies.

a The time-averaged image of a PL movie (.tiff), recorded for three self-assembled MAPbI3 crystals, with background pixels set to zero. b Illustration of the V&L interface that extracts and normalizes the unlabeled blinking traces corresponding to the relevant pixels. Calculation of the <SS> and <%Mscl> profiles for the normalized ensemble of PL trajectories determines the Kopt at cluster 3 and estimates the optimum binning window at 40 frames. Metric profiles for the raw traces (in green) and the optimally binned trajectories (in violet) are shown in bold. c Schematic of the segregation (UML-PSD) module, where the UML section applies the K-means algorithm to cluster the pixels exhibiting analogous blinking behavior, generating a cluster map of the crystal assembly based on the ensemble of binned PL trajectories. d The initial stage of the PSD module retrieves the raw PL trajectories according to the cluster indexes. e The subsequent part computes the PSDs for the labeled blinking traces, extracting the pixel-wise power-law exponents (β) to construct the corresponding β-maps for the clusters.

Next, the original blinking trajectories are retrieved based on the cluster indexes of the pixels (Fig. 4d), and the PSD distributions of the pixel-wise original PL trajectories are constructed. The power-law fitting parameters for each pixel are used to generate a β-map of the entire assembly. Subsequently, the <β> value for each clustered domain is calculated. The cluster-map and β-map of the crystal assembly collectively offer valuable insights into the charge carrier communication, the grain boundaries, and the spatially varied nature of the NR traps. A decrease in pixel-wise β values for the C3 domain suggests a relatively lower contribution of high-amplitude PL fluctuations (Fig. 4d), implying a lack of active deep NR traps in C3, in contrast to the C1 and C2 domains. The entire UML-PSD process, including data extraction, optimization of the binning window, determination of the Kopt and K-means clustering, PSD analysis, and reconstruction of the cluster-map and β-map, typically takes a few minutes for a wide-field PL movie file of dimension ~3 × 3 µm2, comprising 400 pixels and 6000 dimensionality of blinking data per pixel.
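Once a β-map and a cluster-map of the same shape are available, the cluster-wise mean exponent follows from simple masking. The sketch below uses a hypothetical 2 × 2 example in which label 0 denotes background; the function name and the numbers are illustrative only.

```python
import numpy as np

def mean_beta_per_cluster(beta_map, cluster_map):
    """Mean power-law exponent for every cluster label; label 0 (background) is skipped."""
    beta_map = np.asarray(beta_map, dtype=float)
    cluster_map = np.asarray(cluster_map)
    return {
        int(c): float(beta_map[cluster_map == c].mean())
        for c in np.unique(cluster_map)
        if c != 0
    }

beta_map = np.array([[1.2, 1.4], [0.6, 0.0]])   # illustrative pixel-wise exponents
cluster_map = np.array([[1, 1], [2, 0]])        # 0 = background pixel
mean_beta = mean_beta_per_cluster(beta_map, cluster_map)
print(mean_beta)
```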


