Impact of solvent forces and broken symmetry on the assembly of designed proteins at a liquid-solid interface


Assembly of DHR protein on mica

To investigate the role of interfacial structure on protein assembly, we used a rectangular, rod-shaped, de novo designed helical repeat (DHR) protein1 with an aspect ratio of 1:5.6, referred to as DHR10-mica18. Using the Rosetta de novo protein design platform, we previously designed this protein to interact with the (001) cleavage plane of mica by creating a protein scaffold having a flat surface and a regularly repeating backbone with a repeat spacing equal to an integer multiple of the 5.2 Å lattice spacing between nearest-neighbor K+ sites, which form a hexagonal sublattice (Fig. 1a, b)1. As indicated by the suffix “18”, this protein consists of 18 repeating subunits, each with three glutamate residues. Together they form an array of 54 carboxylate groups positioned to exhibit a structural match to the K+ sublattice (Supplementary Fig. 1). Due to this match, the proteins are expected to bind electrostatically to the K+ sites along the three equivalent directions (Fig. 1c, g). For details of the protein sequence, synthesis, and characterization, see ref. 1. While DHR10-mica binding proteins with different numbers of repeats were explored in our previous research1,24, we chose the 18-repeat version for this study because its aspect ratio is large enough to ensure a significant entropic force for co-alignment yet small enough for the proteins to remain relatively rigid, thus approximating hard rods.
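The numbers quoted for the design can be cross-checked with a few lines of arithmetic. The sketch below uses only values stated in the text and in the Fig. 1 caption. Reading the "2-to-1" lattice match as one protein repeat spanning two K+ spacings is an assumption made here for illustration, and all variable names are hypothetical.

```python
# Values quoted in the text and in the Fig. 1 caption.
K_SPACING_NM = 0.52        # 5.2 Å nearest-neighbor K+ spacing
N_REPEATS = 18             # repeat units in DHR10-mica18
GLU_PER_REPEAT = 3         # glutamate residues per repeat
ROD_WIDTH_NM = 3.6         # design-model width (Fig. 1a)
ROD_LENGTH_NM = 20.0       # design-model length (Fig. 1a)

# Assumption: the "2-to-1" match means one repeat spans two K+ spacings.
MATCH_MULTIPLE = 2

n_carboxylates = N_REPEATS * GLU_PER_REPEAT            # size of the carboxylate array
repeat_spacing_nm = MATCH_MULTIPLE * K_SPACING_NM      # spacing between repeats
repeat_array_length_nm = N_REPEATS * repeat_spacing_nm # length spanned by the repeats
aspect_ratio = ROD_LENGTH_NM / ROD_WIDTH_NM            # length-to-width ratio

print(n_carboxylates, repeat_spacing_nm, repeat_array_length_nm, round(aspect_ratio, 1))
```

Under this assumption the repeat array spans slightly less than the quoted 20 nm rod length, which would be consistent with the design model including terminal regions beyond the repeats, although the text does not state this explicitly.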

Fig. 1: Protein monomer design lattice matched to the mica surface and assembly outcome.

a The Rosetta design model of a DHR10-mica18 protein nanorod with dimensions 3.6 nm × 20 nm adsorbed on the mica surface. The protein consists of 18 tandem repeat units shown in green with alpha-helices rendered as cylinders. An aluminosilicate layer of the mica substrate is shown with the K+ sublattice (purple spheres). Scale bar is 5 nm. b Side view of the protein-mica interface showing negatively charged glutamate side chains (green and red sticks) extending from the protein with a periodicity that forms a 2-to-1 lattice match with the mica surface. c–f AFM images of the final assembly states of DHR10-mica18 on f-mica and m-mica, respectively, in c, d 100 mM KCl and e, f 3 M KCl. Scale bars are 100 nm. The FFT of each image is shown in the inset; FFT scale bars are 0.5 nm-1. Note that the same phases are observed both during in situ imaging and after extraction from solution. In addition, the same orientation of the rods in (f) is observed everywhere across the surface. g–i Illustrations of 2D phases of hard rods possible at high concentrations in a three-fold potential: g 2D high-density disordered (HDD) phase, h nematic phase, and i smectic phase. Note that the smectic phase is not predicted when the rods are non-interacting21.

We chose two types of mica for the study, muscovite (m-) and fluorophlogopite (f-) mica, because previous results showed that, under conditions of high KCl concentration, both DHR10-mica proteins and others exhibit differences in their organization on the two mica types, despite having identical K+ sublattices1,3. In addition, the structural differences between m- and f-mica25,26,27, which are unrelated to the K+ sublattice, were shown to create distinct differences in the structure of the overlying hydration layers (Fig. 2 and Supplementary Figs. 2, 3)15: both the first and second hydration layers above f-mica are predicted and observed to exhibit hexagonal symmetry, whereas the broken symmetry of the m-mica lattice leads to the emergence of a striped pattern in the second hydration layer (Supplementary Fig. 3)15. For experimental protocols, see Methods. Briefly, the protein stock solution was diluted to the desired concentration with the incubation buffer containing 20 mM Tris-HCl (pH 7) and 3 M KCl. Then, 10 µL of diluted protein solution was dropped onto freshly cleaved substrates at room temperature, and imaging was started immediately to capture the assembly process in situ under constant ambient conditions.

Fig. 2: Structure of mica and overlying hydration layers.

a, b Atomic models showing face and edge views of two m-mica layers. The face view (a) shows both the hexagonal array of cavities within a tetrahedrally coordinated aluminosilicate sheet in which K+ ions sit, as well as the hydroxyl groups that lie below the cavities and point alternately along one of two axes of the K+ sublattice, but not along the third ([100] axis). The edge view (b) shows that these hydroxyl groups lie in each successive layer below the surface. In addition, below the surface layer, a partially occupied layer of octahedrally coordinated Al3+ ions combines with the hydroxyls to break the three-fold symmetry and slightly distort the aluminosilicate tetrahedral sheet. In f-mica, the octahedrally coordinated layer is fully occupied by Mg2+ ions and the hydroxyl groups are replaced by F- ions, rendering the structure three-fold symmetric (see Supplementary Fig. 2 for details). Purple: potassium, blue: silicon, cyan: aluminum, red: oxygen, white: hydrogen. A wedge of cyan in the tetrahedral layer represents the partial substitution of silicon with aluminum. c–f Oxygen density in the first and second hydration layers, as predicted by molecular dynamics simulations, reveals the hexagonal symmetry of both layers on f-mica and the emergence of a striped pattern in the second layer above m-mica, resulting from the broken symmetry18. See Supplementary Figs. 2 and 3 for further details, including three-dimensional AFM measurements validating the predictions. c–f Reproduced with permission from ref. 15 (copyright American Chemical Society 2022).

In agreement with ref. 1, we find that, in 100 mM K+ aqueous solutions, the protein nanorods assemble into a disordered phase consisting of small domains of coaligned proteins oriented along the three principal axes of the underlying mica lattice, regardless of whether f- or m-mica is used (Fig. 1c, d). The 2D fast Fourier transforms (FFTs) of Fig. 1c–e all display three pairs of blurry high-intensity spots, which reflect the short-range order of the co-aligned protein nanorods within small domains and the absence of long-range order. However, when the K+ concentration is increased to 3 M, DHR10-mica18 continues to form the three-fold disordered phase on f-mica (Fig. 1e), but it assembles into an ordered phase on m-mica in which all rods are coaligned along a single direction corresponding to the unique axis of m-mica and arranged in parallel rows everywhere across the m-mica substrate (Fig. 1f). Additionally, the corresponding 2D FFT shows two sets of condensed spots, representing the side-by-side arrangement of the co-aligned protein nanorods within a row and the unidirectional long-range order of the co-aligned nanorods across the rows1.
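To illustrate how an FFT exposes periodic order of this kind, the following minimal sketch locates the strongest non-DC peak in the 2D power spectrum of an image and converts it to a real-space period and a wave-vector direction. It is not the analysis pipeline used for Fig. 1. The function name `dominant_periodicity`, the synthetic stripe image, and the 1 nm pixel size are assumptions chosen for illustration.

```python
import numpy as np

def dominant_periodicity(image, pixel_nm):
    """Return (period_nm, angle_deg) of the strongest non-DC peak in the 2D FFT.

    angle_deg is the direction of the wave vector (0-180 degrees); stripes or
    rows in the image run perpendicular to it.
    """
    img = image - image.mean()                               # suppress the DC term
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2   # centered power spectrum
    ny, nx = img.shape
    fy = np.fft.fftshift(np.fft.fftfreq(ny, d=pixel_nm))     # cycles per nm along y
    fx = np.fft.fftshift(np.fft.fftfreq(nx, d=pixel_nm))     # cycles per nm along x
    iy, ix = np.unravel_index(np.argmax(power), power.shape)
    freq = np.hypot(fx[ix], fy[iy])
    angle = np.degrees(np.arctan2(fy[iy], fx[ix])) % 180.0   # fold +k and -k together
    return 1.0 / freq, angle

# Synthetic test image: a single plane wave whose wave vector points at 45 degrees.
N = 256
y, x = np.mgrid[0:N, 0:N]
kx = ky = 16 / N                                             # cycles per pixel, on-grid
image = np.cos(2 * np.pi * (kx * x + ky * y))

period_nm, angle_deg = dominant_periodicity(image, pixel_nm=1.0)
print(period_nm, angle_deg)
```

A single sharp pair of peaks corresponds to one long-range periodicity, whereas three blurred pairs at 60° intervals would correspond to short-range order along three directions, as described above.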

In the parlance of the liquid crystal literature, the observed disordered phase is known as a 2D high-density disordered (HDD) phase (Fig. 1g) and is predicted for a sufficiently high rod concentration in a three-fold potential when the translational and rotational mobility are low22,23. As the mobility increases, theoretical treatments predict that the rods will align due to purely entropic forces, forming a 2D nematic phase21 (Fig. 1h), which we do not observe experimentally. Instead, the ordered phase observed on m-mica at 3 M K+ has smectic order (Fig. 1i), which is not predicted for non-interacting rectangular rods in 2D, though the introduction of excluded volume interactions at the rod tips, due to the addition of polymer tails or charges that produce electrostatic repulsion, has been predicted to stabilize smectic order.

The above observations thus present a conundrum. If the rod-substrate potential is established by the interaction of the glutamate side chains with the K+ ions of the mica lattice, why should the nanorods assemble into two distinct phases on m- and f-mica, which possess identical K+ sublattices? And, for the case of m-mica, why does smectic order emerge in a two-dimensional system, rather than only nematic order? To answer these questions, we used high-speed AFM (HS-AFM)4 to observe protein adsorption and assembly at the water-mica interface, follow the emergence of order, and quantify the degree of nematic and smectic order on both substrates.

In the case of m-mica in 3 M K+ (Fig. 3 and Supplementary Movie 1), individual proteins are rarely observable in the initial frames; rather DHR10-mica18 initially forms a 2D liquid phase in which the proteins have high in-plane mobility (Fig. 3a, t0 + 18.92 s). Gradually, the proteins become visible as small domains of two to four coaligned nanorods that are short-lived, often appearing for only a single frame (Fig. 3a, t0 + 75.68 s to t0 + 277.03 s), but, with time, become larger and longer lived (Fig. 3a, t0 + 277.03 s to t0 + 337.84 s) until a stable smectic phase emerges (Fig. 3a, t0 + 337.84 s to t0 + 940.54 s).

Fig. 3: In-situ high-speed AFM results and machine learning analysis to follow the assembly of protein nanorods on m-mica.

a HS-AFM images from Supplementary Movie 1 showing the translational motion of protein rods and their assembly on m-mica into a smectic phase. Scale bar equals 50 nm. b Machine-learning-based workflow to recognize the protein rods in the HS-AFM images in (a), where each arrow indicates the placement and alignment of a nanorod. c Number of recognized protein rods as a function of time. d, e Nematic and smectic order parameters as a function of time, respectively, for the observed rod assembly in (a). The discontinuity in the data lines in (a–c) is due to the in situ HS-AFM losing track during the experiment. Source data are provided as a Source Data file.

To quantify the dynamics of assembly and the degree of order, we used a computational workflow in which an ensemble of deep convolutional neural networks performed semantic segmentation, the output of which was then used as input to a conventional algorithm for rod recognition (Fig. 3b; see Supplementary Figs. 4–6 for computational workflow details and ref. 28 for details of the machine learning code). This approach was used to establish the total coverage (Fig. 3c), as well as the orientation and center of mass of each protein in each frame of the movies, which were in turn used to obtain the nematic (Fig. 3d) and smectic (Fig. 3e) order parameters, respectively (for detailed calculation of order parameters see Methods under “order parameters”). As the analysis shows, virtually every nanorod visible in any frame is oriented along a single direction. Thus, the nematic order rises rapidly and saturates. In contrast, the smectic order increases gradually, growing slowly at first and then rapidly transitioning to higher values before saturating as the surface becomes densely packed (Fig. 3e). This behavior mirrors that of the surface coverage (Fig. 3c) and shows that, as the domains become closely spaced, a percolation threshold is reached at which the probability of rod attachment increases rapidly and leads to high coverage, while the high degree of translational mobility enables the domains to align and reach high smectic order.
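The exact definitions of the order parameters are given in Methods and are not reproduced here. As an illustration only, the sketch below uses common textbook forms for a 2D system: a nematic order parameter S = |⟨exp(2iθ)⟩| computed from rod orientations θ, and a smectic order parameter |⟨exp(2πi r·n/d)⟩| computed from rod centers r for an assumed layer normal n and layer spacing d. The function names, the choice of n along the rod axis, and d = 20 nm (the rod length) are assumptions and may differ from the definitions used for Fig. 3d, e.

```python
import numpy as np

def nematic_order(theta):
    """2D nematic order S = |<exp(2 i theta)>| and director angle (radians)."""
    z = np.mean(np.exp(2j * np.asarray(theta, dtype=float)))
    return np.abs(z), 0.5 * np.angle(z)

def smectic_order(centers, layer_normal, spacing):
    """Smectic order |<exp(2 pi i (r . n) / d)>| for layer normal n, spacing d."""
    n = np.asarray(layer_normal, dtype=float)
    n = n / np.linalg.norm(n)                      # unit layer normal
    phase = 2.0 * np.pi * (np.asarray(centers, dtype=float) @ n) / spacing
    return np.abs(np.mean(np.exp(1j * phase)))

rng = np.random.default_rng(0)

# Equal populations along three directions 60 degrees apart: S vanishes.
theta_hdd = np.deg2rad(np.repeat([0.0, 60.0, 120.0], 100))
S_hdd, _ = nematic_order(theta_hdd)

# All rods along x, with centers stacked in rows one rod length (20 nm) apart.
theta_sm = np.zeros(300)
centers_sm = np.column_stack([20.0 * rng.integers(0, 10, 300),
                              rng.uniform(0.0, 200.0, 300)])
S_sm, _ = nematic_order(theta_sm)
tau_sm = smectic_order(centers_sm, layer_normal=(1.0, 0.0), spacing=20.0)

# Same orientations but random positions along x: nematic without smectic order.
centers_nem = rng.uniform(0.0, 200.0, size=(300, 2))
tau_nem = smectic_order(centers_nem, layer_normal=(1.0, 0.0), spacing=20.0)

print(S_hdd, S_sm, tau_sm, tau_nem)
```

With these definitions a three-fold HDD configuration gives a nematic value near zero, while a single-orientation configuration gives a nematic value near one regardless of whether the rod centers are layered, which is why the two parameters can evolve on different time scales.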

In the case of f-mica at 3 M K+ (Fig. 4 and Supplementary Movie 2), rods are already visible in the initial frames, both as individual rods and small domains, and, unlike the case of m-mica, are oriented along all three K+ sublattice directions with roughly equal probability (Fig. 4a). Individual domains fluctuate in size but, on average, grow as individual rods attach (Supplementary Movie 2). Furthermore, unlike the m-mica case, no rapid transition in either coverage or order parameter is observed. Most importantly, the resulting HDD phase exhibits low values for the nematic order parameter, even though the nematic phase is expected to be the equilibrium phase at high surface coverage and mobility. Moreover, the smectic order parameter is always nearly zero.

Fig. 4: In-situ high-speed AFM results and machine learning analysis to follow the assembly of protein nanorods on f-mica.

a HS-AFM images from Supplementary Movie 2 showing the translational and rotational motion of protein rods and their assembly on f-mica along the three K+ sublattice directions into a high-density disordered phase. Scale bar equals 50 nm. b Machine-learning-based workflow to recognize the protein rods in the HS-AFM images in (a), where each arrow indicates the placement and alignment of a nanorod. c Number of recognized protein rods as a function of time. d, e Nematic and smectic order parameters as a function of time, respectively, for the observed rod assembly in (a). Source data are provided as a Source Data file.

The above results show that, although the proteins are designed to bind through an electrostatic interaction between the carboxyl sidechains of the glutamates and the K+ sites of the mica lattice, the resulting liquid crystal phases on m- and f-mica exhibit stark differences even though the K+ sublattices are identical. Moreover, the proteins exhibit much higher mobility on m-mica than on f-mica. Consequently, on f-mica the proteins remain trapped in an HDD phase even though the equilibrium state is nematic, while on m-mica they exceed nematic order, instead attaining a high degree of smectic order.

A source for the contrasting mobility and distinct liquid crystal phases is the distinct structure imposed upon the overlying solution by the underlying mica lattice. Although the proteins’ direct interaction with the mica surface is through the three-fold symmetric K+ sublattice, they cannot avoid interacting with the surrounding solution. As in any colloidal system, solvent-exclusion forces arising from the Brownian motion of water can be expected to influence protein-protein and protein-surface interactions. In the case of m-mica, the broken symmetry of the near-surface hydration structure (Fig. 2c, e) could, in principle, impart a large enough bias in the orientational potential energy landscape to alter the alignment of the nanorods.

To investigate whether a change in the potential energy landscape due to the altered symmetry of the water structure in going from f- to m-mica can lead to the observed behavior of DHR10-mica18 as a consequence of purely colloidal forces, we performed 2D grand canonical Monte Carlo simulations of hard rods (i.e., non-overlapping high-aspect-ratio rectangles) freely depositing on a surface in two scenarios (Fig. 5): 1) the three lattice vectors that define the rod orientations have equal probability of occupancy, i.e., the potential is three-fold symmetric, to represent the case of f-mica (Fig. 5a, g–j), and 2) one of the three orientations is twice as energetically favorable as the others, i.e., the potential is quasi-two-fold, to represent m-mica (Fig. 5b, c–f). For each scenario, we systematically varied two parameters: the chemical potential, which controls the surface concentration of rods, and the rod mobility, which sets the maximum distance a rod is allowed to move during a Monte Carlo step. Varying the mobility selects between kinetically trapped states at low mobility and equilibrium states at high mobility, and captures the increasing mobility of the proteins with increasing salt concentration.

Fig. 5: Grand-canonical Monte Carlo simulations of hard rods on surface with and without bias.

a, b The smectic order parameters for a collection of Monte Carlo simulations of hard rods with aspect ratio ℓ = 7 are presented in grids, where each box in the grid represents a single simulation defined by its chemical potential and rod mobility; the coloring indicates the value of the smectic order parameter. Separate grids are plotted for simulation collections where (a) all three rod orientations are equally favorable (“Unbiased”), and (b) where the horizontal rods are twice as energetically favorable as the other orientations (“2x Bias”). c–j Several select simulations are annotated with a snapshot of the final rod configuration; in the snapshots, rods of the same orientation share the same color. Source data are provided as a Source Data file.

At each Monte Carlo step, each rod is allowed an attempt to translate elsewhere in the simulation box where it does not overlap with any other rods. These translation attempts are followed by a fixed number of evaporation or deposition attempts, where new rods may only be deposited where they would not overlap with any existing rods. These evaporation and deposition moves also implicitly account for rod rotations: a rod with a different orientation may be deposited in the same location as a recently evaporated rod as long as there is space available (detailed simulation procedure is described in Methods).
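The move set described above can be sketched in a deliberately simplified form. The code below is not the simulation used for Fig. 5 (see Methods for that procedure). It assumes rods that occupy `k` consecutive sites of a periodic triangular lattice rather than continuum rectangles, energies in units of kT, and an orientational bias expressed as a per-orientation energy offset `orient_energy`. All names and default values are hypothetical.

```python
import math
import random

# Three lattice directions of a triangular lattice in axial coordinates (0, 60, 120 degrees).
DIRS = [(1, 0), (0, 1), (-1, 1)]

def rod_sites(q, r, o, k, L):
    """Sites covered by a rod of k sites starting at (q, r) along direction o."""
    dq, dr = DIRS[o]
    return [((q + i * dq) % L, (r + i * dr) % L) for i in range(k)]

def run_gcmc(L=30, k=7, mu=-6.0, orient_energy=(0.0, 0.0, 0.0),
             mobility=3, sweeps=200, exchanges=50, seed=0):
    """Grand canonical Monte Carlo of hard lattice rods; returns a list of (q, r, o)."""
    rng = random.Random(seed)
    rods, occupied = [], set()
    n_placements = 3 * L * L                      # distinct (position, orientation) choices

    def fits(sites):
        return not any(s in occupied for s in sites)

    for _ in range(sweeps):
        # 1) One translation attempt per rod, limited to +/- mobility lattice steps.
        if mobility > 0:
            for idx in range(len(rods)):
                q, r, o = rods[idx]
                old = rod_sites(q, r, o, k, L)
                occupied.difference_update(old)   # lift the rod before testing overlap
                nq = (q + rng.randint(-mobility, mobility)) % L
                nr = (r + rng.randint(-mobility, mobility)) % L
                new = rod_sites(nq, nr, o, k, L)
                if fits(new):
                    rods[idx] = (nq, nr, o)
                    occupied.update(new)
                else:
                    occupied.update(old)          # rejected: put the rod back
        # 2) A fixed number of deposition/evaporation attempts.
        for _ in range(exchanges):
            if rng.random() < 0.5:                # deposition attempt
                q, r, o = rng.randrange(L), rng.randrange(L), rng.randrange(3)
                sites = rod_sites(q, r, o, k, L)
                activity = math.exp(mu - orient_energy[o])
                p_acc = min(1.0, n_placements * activity / (len(rods) + 1))
                if fits(sites) and rng.random() < p_acc:
                    rods.append((q, r, o))
                    occupied.update(sites)
            elif rods:                            # evaporation attempt
                idx = rng.randrange(len(rods))
                q, r, o = rods[idx]
                activity = math.exp(mu - orient_energy[o])
                if rng.random() < min(1.0, len(rods) / (n_placements * activity)):
                    occupied.difference_update(rod_sites(q, r, o, k, L))
                    rods[idx] = rods[-1]          # swap-remove
                    rods.pop()
    return rods

def orientation_counts(rods):
    """Number of rods along each of the three lattice directions."""
    counts = [0, 0, 0]
    for _, _, o in rods:
        counts[o] += 1
    return counts

unbiased = run_gcmc(seed=1)
biased = run_gcmc(orient_energy=(-3.0, 0.0, 0.0), seed=1)   # direction 0 favored (assumed offset)
print(orientation_counts(unbiased), orientation_counts(biased))
```

As in the description above, rotations are not explicit moves: a rod changes orientation only through evaporation followed by deposition of a differently oriented rod in the vacated space. Setting `mobility=0` disables translations entirely, which corresponds to the zero-mobility limit.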

The results show that when the potential is three-fold symmetric (Fig. 5a), a rod mobility of zero leads to a three-fold disordered phase regardless of the value of the chemical potential, consistent with previous findings for low-mobility rods on a triangular lattice29. As the chemical potential increases across simulations of rods with nonzero mobility, a transient nematic phase emerges (Fig. 5i) before being replaced, at high chemical potential, by a phase composed of large, ordered domains. The smectic order parameter (see Methods for the detailed calculation of order parameters) is nearly zero for all conditions where all three rod orientations are equally favorable. These results are consistent with the experimental observations for f-mica, on which the protein mobility is too low for even small domains to reorient into alignment with their neighbors (Supplementary Movie 2).

In contrast, when one of the rod orientations is more energetically favorable than the other two, even by just a factor of two (Fig. 5b), clear smectic order emerges for all conditions investigated, provided the rods have adequate mobility and sufficiently high chemical potential, while a three-fold HDD phase is observed only at zero mobility and high chemical potential. The fact that we only observe the smectic phase when we apply an orientational bias to the model system suggests that the smectic phase observed for non-interacting rectangular rods is solely due to the emergence of a two-fold rod-surface potential on m-mica due to the structure of the interfacial water layers14,15,16. This result indicates that the smectic order observed in these systems is mediated by the molecular details of solution-surface interactions rather than rod-rod interactions15,20.

Overall, the simulation results are consistent with the experimental observations of protein nanorod assembly on m-mica and f-mica at low and high ion concentrations (Fig. 1c–f), with one caveat: while the simulation model predicts a three-fold HDD phase at high chemical potential for m-mica (Fig. 5d), this phase is only observed when the rods have zero mobility, a condition that currently cannot be corroborated experimentally.


