A governance strategy is essential to ensure responsible use of machine learning.
ML platform providers also typically build governance frameworks as part of their efforts to develop a mature ML operations (MLOps) lifecycle. MLOps applies DevOps principles and techniques to deliver continuously improving ML-powered applications to production.
To achieve a mature lifecycle, companies must define:
- A data cycle that guarantees data quality.
- A model cycle in which an ML model is trained on that data.
- A process that links the two cycles together.
- A process that links the data and model cycles to the final application development cycle.
Companies should also define governance guidelines and standards to guide the MLOps lifecycle to completion. These are necessary to understand and prevent potential risks of uncontrolled lifecycles.
MLOps needs to manage data, models and other processes
ML development teams can implement governance strategies at each stage of the lifecycle. Data cycle governance covers data sources and data sets.
A data source must meet certain criteria before an enterprise can use it. These may include how the source collected the data (for example, legally and ethically) and which license terms, if any, are acceptable.
A data set must meet certain criteria before being added as a training set. These cover what must be present, such as provenance showing when, where, and how the data was collected, and what must be absent, which in some cases includes personally identifiable information (PII). The criteria should also address overall data quality.
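The admission criteria above can be expressed as an automated check. The following is a minimal sketch, not a definitive implementation: the provenance field names, the email-only PII pattern, and the 5% missing-value threshold are all illustrative assumptions, and a real policy would define its own.

```python
import re

# Hypothetical provenance fields covering when, where, and how data was collected.
REQUIRED_PROVENANCE = ("collected_at", "collected_where", "collected_how")

# Deliberately narrow pattern that only catches email-like strings.
# Real PII detection needs far broader coverage; this is an assumption for illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def check_dataset(provenance, records, max_missing_ratio=0.05):
    """Return a list of policy violations; an empty list means the data set passes."""
    violations = []

    # What must be present: provenance metadata.
    for key in REQUIRED_PROVENANCE:
        if not provenance.get(key):
            violations.append(f"missing provenance: {key}")

    # What must be absent: anything that looks like PII.
    for index, record in enumerate(records):
        for name, value in record.items():
            if isinstance(value, str) and EMAIL_RE.search(value):
                violations.append(f"possible PII in record {index}, field {name!r}")

    # Overall quality: share of empty or missing values across all fields.
    total = sum(len(record) for record in records)
    missing = sum(
        1
        for record in records
        for value in record.values()
        if value is None or value == ""
    )
    if total and missing / total > max_missing_ratio:
        violations.append(
            f"missing-value ratio {missing / total:.2f} exceeds {max_missing_ratio:.2f}"
        )

    return violations
```

Returning a list of violations rather than a single pass/fail flag means the result can be stored alongside the data set as documentation of why it was accepted or rejected.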
Areas of model cycle governance include testing and data usage by the teams working on the application.
A model must pass testing before it can be integrated into an application. This process may include tests for the correctness and precision of answers, as well as tests for problems such as biased output.
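One way to enforce such a rule is a release gate that a model must clear before integration. The sketch below assumes a binary classifier, uses plain accuracy as the correctness test, and uses the gap in positive-prediction rates between groups as one simple stand-in for a bias test. The function names and both thresholds are hypothetical; an organization would choose its own metrics and limits.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the expected labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def positive_rate_gap(y_pred, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    totals, positives = {}, {}
    for pred, group in zip(y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    rates = [positives[group] / totals[group] for group in totals]
    return max(rates) - min(rates)


def passes_release_gate(y_true, y_pred, groups, min_accuracy=0.9, max_gap=0.1):
    """Return True only if both the accuracy test and the bias test pass."""
    return (
        accuracy(y_true, y_pred) >= min_accuracy
        and positive_rate_gap(y_pred, groups) <= max_gap
    )
```

A gate like this would typically run in the same pipeline that packages the model, so that a failing model cannot reach the application by accident.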
Human workers should be told what is allowed and what is explicitly forbidden in both data inputs and model outputs. For example, PII is often prohibited in both.
Throughout the cycle, policies should determine where and how version control and documentation are used. This includes the inputs and outputs for each stage of the lifecycle, how the team retrieves and prepares data sets for the ML model, how the ML model behaves before and after each training run, and how these models are embedded in applications, including how the ML-powered applications behave before and after a model is embedded.
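As a rough illustration of documenting inputs and outputs per training run, the sketch below fingerprints a data set and stores it with the model's metrics before and after training. The record layout and function names are assumptions made for this example; dedicated experiment-tracking or data-versioning tools usually handle this in practice.

```python
import hashlib
import json


def fingerprint(records):
    """Stable SHA-256 fingerprint of a data set given as JSON-serializable records."""
    # Sorting keys makes the fingerprint independent of dictionary key order.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_record(run_id, records, metrics_before, metrics_after):
    """Document one training run: which data went in and how behavior changed."""
    return {
        "run_id": run_id,
        "dataset_sha256": fingerprint(records),
        "metrics_before": metrics_before,
        "metrics_after": metrics_after,
    }
```

Because the fingerprint changes whenever the data changes, two runs with the same `dataset_sha256` can be assumed to have trained on identical records, which helps trace the effects of bad data back to a specific run.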
MLOps governance is not a one-time thing
To implement all these processes correctly and sustainably over time, IT teams need to formally define a framework as a template. Each new ML development project should include a templated process and resources. All staff members involved should have a high level of familiarity with the entire MLOps lifecycle and this governance framework.
There are multiple ways to achieve such a framework. Organizations can either:
- Develop these policies and processes from scratch.
- Copy governance models implemented elsewhere.
- Purchase software that embodies governance (for example, through off-the-shelf workflows) and customize it accordingly.
Unmanaged ML development poses big risks
Lack of governance in MLOps poses risks, mostly in the form of functional issues where the model ultimately does not do what it is supposed to do. There may also be reputational risks, such as loss of trust, and legal sanctions related to improper use of data, use of contaminated data, and implementation of unacceptably biased applications.
Functional issues include:
- Defects in data sets, including intentionally "tainted" data, are less likely to be discovered.
- The effects of bad data become untraceable and irreversible.
- The model will eventually produce inaccurate or biased results, making issues difficult to track and fix.
Functional problems therefore make the entire process of developing functional, legally compliant software slow and unreliable.
If the output of an ML application leads to legal action against the developer, the company could be prosecuted or sued for illegal use of biased output or inappropriate use of protected personal information. In such cases, an audit of that development may be required. Similarly, audits may be required to prove compliance with applicable laws or company policies. An audit trail amounts to documentation of all relevant activities. Without strong governance, the trail can be incomplete and inconsistent.
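To make the idea of a complete and consistent trail concrete, the following sketch keeps an append-only log in which each entry stores a hash of the previous one, so later edits or removed entries can be detected. The `AuditTrail` class and its fields are hypothetical, and timestamps and persistent storage are left out for brevity.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log in which each entry is chained to the one before it."""

    def __init__(self):
        self.entries = []

    def append(self, actor, action, details):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {
            "actor": actor,
            "action": action,
            "details": details,
            "prev_hash": prev_hash,
        }
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Return True if no entry was altered, reordered, or removed mid-chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {k: entry[k] for k in ("actor", "action", "details", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

A chain like this does not prevent tampering by itself, but it lets an auditor confirm whether the documented activities are intact.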
Ultimately, a well-constructed MLOps lifecycle run by well-trained teams can help organizations implement and embody good governance practices, fulfilling the goal of speeding up the creation of functional, compliant software.
