Data due diligence
According to Mann, generative AI has particular implications for data security. “If you tell an AI to suck up all this data and reuse it, how do you achieve data loss prevention?”
In fact, for security, compliance, and efficiency reasons, CIOs will want to carefully manage which data generative AI has access to. For example, retrieval-augmented generation (RAG) has emerged as an important technique for making LLMs useful with an organization's own data, but that does not mean all of that data should be fed in. The concern is not only the cost of preparing unnecessarily large data sets (which takes expertise that is not yet commonplace and commands high salaries), but also what you end up teaching the model. If you feed in your entire Slack or Teams history, you're likely to get responses like “I'll work on that tomorrow.” While that is perfectly appropriate from a human employee, it is not what you would expect from a generative AI system.
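The idea of being selective about what goes into a RAG system can be sketched as a filter that runs before anything is indexed. This is a minimal illustration, not a recommended implementation: the `Message` record, the chatter patterns, and the `MIN_WORDS` threshold are all assumptions made for the example, and a real filter would be tuned against your own content.

```python
import re
from dataclasses import dataclass


# Hypothetical record for a chat message exported from Slack or Teams.
@dataclass
class Message:
    author: str
    text: str


# Illustrative patterns for status chatter that adds nothing to a knowledge
# base. These are assumptions for the sketch, not a vetted list.
CHATTER_PATTERNS = [
    re.compile(r"\bi'?ll (work on|look at|get to) (that|it|this)\b", re.IGNORECASE),
    re.compile(r"^(ok|okay|thanks|thank you|sounds good|will do)\W*$", re.IGNORECASE),
]

MIN_WORDS = 8  # assumed threshold; very short messages rarely carry reusable knowledge


def is_worth_indexing(message: Message) -> bool:
    """Return True if the message looks substantive enough to index."""
    text = message.text.strip()
    if any(pattern.search(text) for pattern in CHATTER_PATTERNS):
        return False
    return len(text.split()) >= MIN_WORDS


def select_for_index(messages):
    """Keep only the messages that pass the crude quality filter."""
    return [m for m in messages if is_worth_indexing(m)]
```

A filter this crude will also drop some useful messages, which is part of the point: deciding what the model should see is a curation task that needs review, not a bulk import.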
AI tools like Copilot will expose flaws in an organization's approach to information management: how data and metadata are structured, the information architecture, whether content has been cleaned up, and whether anyone understands what is out there and how loosely much of it is secured, warns Christian Buckley, Microsoft MVP and partner management director at Rencore. Organizations need to do both permissions management and data cleanup.
As the cost of data storage has fallen, many organizations keep data they no longer need and fail to clean up data that is outdated or no longer useful after a migration or reorganization. “People don't go back and declutter because there's no cost to them, other than in their risk profile and their search performance,” says Buckley. He warns that if you introduce gen AI capabilities without first doing the up-front data hygiene work needed for good results, people will become disillusioned.
The same problem was evident when Microsoft launched Delve, and before that when the FAST integration brought powerful search to SharePoint in 2010. “When we started seeing search actually working within SharePoint, people started complaining that it wasn't working properly,” he says. “But it was. What it was doing was exposing a lack of governance around data. We'd hear people say, 'All my permissions are being violated.' No, it's surfacing where the holes are. You want more powerful search across your data using AI, but you don't know how to organize that data?”
Another problem is that metadata tags and sensitivity labels often aren't applied correctly to the data, so gen AI tools and their users don't see information they should be able to see.
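To make the labeling problem concrete, here is a small sketch of an audit that separates documents with usable labels and tags from those that need cleanup, plus a retrieval filter that fails closed. The label taxonomy, the `Document` fields, and the function names are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed label taxonomy, ordered from least to most restricted.
LABEL_RANK = {"public": 0, "internal": 1, "confidential": 2}


# Hypothetical document record; real systems carry far more metadata.
@dataclass
class Document:
    doc_id: str
    sensitivity_label: Optional[str] = None
    tags: set = field(default_factory=set)


def audit_labels(docs):
    """Split documents into those ready to index and those needing cleanup."""
    ready, needs_review = [], []
    for doc in docs:
        # A document is ready only if its label is recognized and it has tags.
        if doc.sensitivity_label in LABEL_RANK and doc.tags:
            ready.append(doc)
        else:
            needs_review.append(doc)
    return ready, needs_review


def visible_to(docs, clearance: str):
    """Return the documents a user with the given clearance may retrieve.

    Documents with a missing or unrecognized label are hidden (fail closed),
    which is how mislabeled content silently disappears from results.
    """
    limit = LABEL_RANK[clearance]
    return [
        d for d in docs
        if d.sensitivity_label in LABEL_RANK
        and LABEL_RANK[d.sensitivity_label] <= limit
    ]
```

Running the audit before indexing gives a cleanup list up front, rather than leaving users to discover the gaps when an assistant cannot find something they know exists.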