Rio Tinto adopts MLOps to expand use of machine learning

Rio Tinto is working to abstract behind-the-scenes complexity away from teams deploying new machine learning models across its operations.


(From left) Marcus Rosen of Rio Tinto and Romina Sharifpour of AWS.

Marcus Rosen, principal of machine learning operations, said at AWS Summit Sydney earlier this year that the company is adopting MLOps to standardize the delivery and deployment path for new machine learning models.

“My team's core mission is to optimize the delivery experience for data science teams to make their jobs easier and less stressed by infrastructure and security issues,” Rosen said.

“We are working directly with these teams, as well as our cybersecurity team, to create a secure and flexible delivery environment.”

Rosen said Rio Tinto made a strategic move seven years ago to build a centralized “dedicated machine learning capability to help deliver machine learning solutions across business units and product groups.”

Data scientists and “general” data users are primarily based in Brisbane for the mining company’s aluminium operations, Perth for its iron ore operations, Singapore for its commercial operations, and in Montreal, Canada.

“We have also built a dedicated team in India over the past few years working on productionized machine learning solutions,” Rosen said.

Data scientists, engineers and “general” users across the business units and product groups are supported by “Amazon SageMaker tools and machine learning environments.”

The miner will use both SageMaker Studio and SageMaker Canvas, the latter a so-called “no-code” tool that enables “citizen data scientists” and non-technical people to build machine learning models.

“As we scale machine learning across Rio Tinto, we need a more standardized approach to serving and deploying models – machine learning operations, or MLOps,” Rosen said.

SageMaker treats MLOps as a way to “automate and standardize processes across the entire machine learning lifecycle.”
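To make the idea concrete, here is a minimal sketch of what a standardised delivery path can look like with SageMaker Pipelines. It is an illustrative example rather than Rio Tinto's actual setup; the role ARN, container image and S3 locations are placeholders.

```python
# Minimal SageMaker Pipelines sketch (illustrative only, not Rio Tinto's pipeline):
# a data science team supplies a training container, while the platform team owns
# the pipeline definition that standardises how models are built and delivered.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/example-sagemaker-role"  # placeholder role

# Estimator wrapping the team's training image and compute settings.
estimator = Estimator(
    image_uri="123456789012.dkr.ecr.ap-southeast-2.amazonaws.com/example-train:latest",  # placeholder
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/models/",  # placeholder bucket
    sagemaker_session=session,
)

# One training step; a production pipeline would usually add processing,
# evaluation and model-registration steps around it.
train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={"train": TrainingInput("s3://example-bucket/datasets/train/")},  # placeholder
)

pipeline = Pipeline(
    name="example-mlops-pipeline",
    steps=[train_step],
    sagemaker_session=session,
)
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()                # launch a run
```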

“Our view is that data science teams don't need to spend their time on infrastructure issues,” he said.

“[SageMaker] users can just open up a notebook and install what they need and so on, and we're doing a lot of work behind the scenes to enable our teams to do that.”

Part of the behind-the-scenes work is aimed at improving access to the datasets, both across the organization and from the internet.

“You can't build a good machine learning model without good data,” he said.

“[But] Accessing data can be difficult. Today, there are multiple data lakes… [and] Another challenge is around networking: all our production accounts are fully air-gapped and have no internet access by default, and any internet access, or really any access outside of the account, must be whitelisted through a centralized firewall.

“Some teams need access to internet-based datasets. For example, satellite datasets are too large to ingest into a data lake, so we need to enable that access in a timely manner.”

Rosen said a “multi-cloud data lakehouse platform” is in the works, eventually allowing teams to “self-service expose their own datasets in a data mesh-type approach.”

In a data mesh, a dataset is treated as a “product” that is owned and managed by a single team, and access to the dataset is also monitored.
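As a rough illustration of that idea (the class, field names and paths below are hypothetical, not Rio Tinto's schema), a data product can be modelled as a small record that names its owning team and keeps a log of who has been granted access:

```python
# Hypothetical sketch of a "dataset as a product" record in a data mesh:
# one accountable owning team, a published location, and monitored access.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataProduct:
    name: str              # product name consumers discover, e.g. "track-geometry-weekly"
    owner_team: str        # the single team that owns and manages the dataset
    location: str          # where consumers read it from (placeholder path below)
    schema_version: str
    access_log: list = field(default_factory=list)

    def grant_access(self, consumer: str) -> None:
        # Record each grant so use of the product can be monitored.
        self.access_log.append((consumer, datetime.now(timezone.utc)))


product = DataProduct(
    name="track-geometry-weekly",
    owner_team="rail-data-platform",
    location="s3://example-lakehouse/track-geometry/",  # placeholder
    schema_version="1.2.0",
)
product.grant_access("maintenance-planning-ml")
print(product.access_log)
```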

Rosen said the miner makes heavy use of AWS PrivateLink to access data stored in its AWS cloud environment.
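PrivateLink typically means creating interface VPC endpoints so calls to services such as the SageMaker API travel over private networking rather than the internet. The snippet below is a hedged sketch of that pattern with boto3; the VPC, subnet and security group IDs are placeholders.

```python
# Sketch of the AWS PrivateLink pattern: an interface VPC endpoint for the
# SageMaker API so traffic stays inside the VPC. All IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-2")

response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                          # placeholder VPC
    ServiceName="com.amazonaws.ap-southeast-2.sagemaker.api",
    SubnetIds=["subnet-0123456789abcdef0"],                 # placeholder subnet
    SecurityGroupIds=["sg-0123456789abcdef0"],              # placeholder security group
    PrivateDnsEnabled=True,                                 # resolve the service name to the endpoint
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```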

“We're also working to make the security and data access aspects easier and more automated for teams who might need to make firewall changes or similar to access data.

“Finally, we've been working very hard behind the scenes to pre-configure machine learning environments on SageMaker Studio and Canvas,” Rosen said.
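One common way to pre-configure Studio environments is a Studio lifecycle configuration that runs a setup script whenever a notebook app starts. The sketch below assumes that approach; the configuration name, packages and internal package-mirror URL are illustrative, not confirmed details of Rio Tinto's environment.

```python
# Sketch of a SageMaker Studio lifecycle configuration that pre-installs an
# approved set of packages when a JupyterServer app starts. Illustrative only.
import base64
import boto3

sm = boto3.client("sagemaker", region_name="ap-southeast-2")

setup_script = """#!/bin/bash
set -eux
# Install approved packages from an internal mirror (placeholder URL).
pip install --index-url https://pypi.example.internal/simple pandas scikit-learn
"""

sm.create_studio_lifecycle_config(
    StudioLifecycleConfigName="example-preconfigured-env",
    StudioLifecycleConfigContent=base64.b64encode(setup_script.encode()).decode(),
    StudioLifecycleConfigAppType="JupyterServer",
)
```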

Rosen cited three operational areas where machine learning has been effective.

One area is predictive maintenance of the mining company's private rail network, used to transport ore from mines to ports.

“Any disruption to the line can cost a lot of money in fines, so [an ML] model runs on a sliding window basis and can predict up to about seven weeks in advance whether a particular section of track will need maintenance that could cause delays,” Rosen said.

“With that information in hand, the planning team can schedule work on that section of track or make corrections before any issues arise.”
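As a rough sketch of the sliding-window framing (the data layout, column names and window lengths below are assumptions rather than details from the talk), features summarise the trailing weeks of track condition and the label looks about seven weeks ahead:

```python
# Sliding-window sketch for track-maintenance prediction (assumed data layout):
# rolling features over recent condition measurements, label ~7 weeks ahead.
import pandas as pd

# Hypothetical weekly measurements for one track section.
df = pd.DataFrame({
    "section_id": ["A"] * 20,
    "week": pd.date_range("2024-01-01", periods=20, freq="W"),
    "gauge_deviation_mm": [1.0 + 0.1 * i for i in range(20)],
    "needs_maintenance": [0] * 15 + [1] * 5,
})

HORIZON_WEEKS = 7   # predict roughly seven weeks in advance
WINDOW_WEEKS = 4    # features summarise the trailing four weeks

df = df.sort_values(["section_id", "week"])
grouped = df.groupby("section_id")

# Rolling (sliding-window) features per track section.
df["gauge_mean_4w"] = grouped["gauge_deviation_mm"].transform(
    lambda s: s.rolling(WINDOW_WEEKS).mean())
df["gauge_trend_4w"] = grouped["gauge_deviation_mm"].transform(
    lambda s: s.diff(WINDOW_WEEKS))

# Label: does this section need maintenance HORIZON_WEEKS from now?
df["label_7w_ahead"] = grouped["needs_maintenance"].shift(-HORIZON_WEEKS)

train = df.dropna(subset=["gauge_mean_4w", "label_7w_ahead"])
print(train[["week", "gauge_mean_4w", "gauge_trend_4w", "label_7w_ahead"]].tail())
```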

Other ML models are being used in health and safety: for example, in Canada, a model predicts the likelihood of a water leak occurring at an industrial smelter.

“Water dripping into an industrial smelter can cause a buildup of highly explosive hydrogen gas,” he said, adding that it could lead to loss of life, damage to equipment and the closure of the plant.

“The model runs hourly and generates a risk assessment that feeds into the plant's risk management process.”
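One plausible way to run a model on an hourly cadence (not a confirmed detail of the smelter system; the rule name and function ARN below are placeholders) is an EventBridge schedule that invokes a scoring function every hour:

```python
# Sketch of an hourly schedule for a risk-scoring model via EventBridge.
# Names and ARNs are placeholders for illustration only.
import boto3

events = boto3.client("events", region_name="ca-central-1")

events.put_rule(
    Name="example-smelter-risk-hourly",
    ScheduleExpression="rate(1 hour)",   # fire once an hour
    State="ENABLED",
)
events.put_targets(
    Rule="example-smelter-risk-hourly",
    Targets=[{
        "Id": "risk-scoring-fn",
        "Arn": "arn:aws:lambda:ca-central-1:123456789012:function:example-risk-scoring",  # placeholder
    }],
)
# The Lambda also needs a resource-based permission allowing events.amazonaws.com to invoke it.
```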

A third use case for machine learning is habitat management.

“As a major mining company, we take our environmental responsibilities at our mine sites very seriously,” Rosen said.

“We have been using machine learning to recognise and manage animal habitats around these locations.

“[This] information will feed into the mine planning process and help avoid disturbance of habitats.”

Ry Crozier attended the AWS Summit Sydney as a guest of AWS.


