Data access management will become vital as large language models (LLMs) become more integrated into companies' systems, which could present opportunities for IT providers.
During the recent Global Technology Industry Association (GTIA) ANZ Community Forum, Varonis sales engineer Alexander Anketell highlighted the risks that LLM AI tools can present without proper file management.
“With Copilot, Gemini and most LLMs that we're talking about today, we're inheriting user permissions every time we turn on Copilot for business or we add ChatGPT integrated with SharePoint Online and OneDrive,” he said.
This is by design: LLMs are often sold as assistants that take instructions in natural language and draw information from a pool of data, with little visibility into how they do it.
Anketell said this becomes a problem with historical documents that may not have the appropriate permissions.
“I see in my role a lot of SharePoint sites or folders or Google Drives open to every user in the organisation, either through a collaboration link, shared to everyone except external users or maybe it's somebody who's created the direct permission and has said, ‘Yeah, why not all users?’” he said.
To reach such a file or folder, a person would need a link or would need to know the exact sub-folder where the file was held, but that is not a barrier for LLMs.
Anketell offered a hypothetical: someone asks an LLM to produce a profit report for a department, the LLM draws on a sensitive human resources document that has company-wide permissions, and the resulting report includes everyone’s salaries.
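Anketell's scenario can be sketched in a few lines of Python. This is a toy model, not how Copilot or any other product is implemented: the `Document` class, the group names, the file paths and the `EVERYONE` marker are all assumptions made up for illustration. It shows only the principle that an assistant searching by content reaches every file the user is permitted to read, however deeply it is buried.

```python
from dataclasses import dataclass

EVERYONE = "everyone"  # hypothetical marker for an organisation-wide share


@dataclass(frozen=True)
class Document:
    path: str
    allowed_groups: frozenset  # groups permitted to read the file
    text: str


def readable_by(user_groups, documents):
    """Documents the user may read; an assistant acting as the user sees the same set."""
    effective = set(user_groups) | {EVERYONE}
    return [d for d in documents if d.allowed_groups & effective]


def assistant_search(user_groups, documents, phrase):
    """Search by content across everything readable, regardless of folder depth."""
    return [
        d.path
        for d in readable_by(user_groups, documents)
        if phrase.lower() in d.text.lower()
    ]


# Invented sample data: the salary file sits in a deep folder but is shared with everyone.
documents = [
    Document("/finance/q3-summary.txt", frozenset({"finance"}),
             "Q3 profit summary for the sales department"),
    Document("/hr/archive/2019/old/salaries.txt", frozenset({EVERYONE}),
             "Salary list for the sales department"),
    Document("/hr/reviews.txt", frozenset({"hr"}),
             "Performance reviews for the sales department"),
]

hits = assistant_search({"finance"}, documents, "sales department")
print(hits)
```

In this sketch the finance user never needed the link to the salary file. The properly restricted HR reviews stay out of reach, while the overshared salary list is returned alongside the finance summary.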
His advice is to take a least-privilege approach, meaning every account is given as little access as possible, but he conceded that it is nearly always a difficult sell.
“I know this has always been a fight that a lot of MSPs have had with their customers for a very long time, and even enterprises have failed to adopt least privilege in data,” he said.
“They do it for networks and application access, but they get scared to interrupt business productivity by implementing least privilege (for data).”
Anketell outlined how even seemingly innocuous files can become sensitive when processed by an LLM into something novel, a result he referred to as “net new data”.
His example was two files, one with customer names and addresses and another with customer names and dates of birth, neither of which is subject to privacy legislation individually.
“But then you go to Copilot and you say, ‘Hey, I want to know everybody's address and date of birth.’ It will go and grab those two files, put them together, and now you've got this fantastic net new file … but that's now a sensitive file that should be protected under the Australian Privacy Act,” he said.
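The “net new data” effect is essentially a join on a shared key. The Python sketch below uses invented customer records, and the `SENSITIVE_COMBINATION` rule is a hypothetical stand-in for whatever policy an organisation actually applies; it is not a legal test under any privacy law.

```python
# Hypothetical rule for illustration only, not a legal test:
# a record holding both of these fields is treated as sensitive.
SENSITIVE_COMBINATION = {"address", "date_of_birth"}


def is_sensitive(fields):
    """True when the given field names include the whole sensitive combination."""
    return SENSITIVE_COMBINATION <= set(fields)


def combine(addresses, births):
    """Join two name-keyed lookups into one record per customer present in both."""
    return {
        name: {
            "name": name,
            "address": addresses[name],
            "date_of_birth": births[name],
        }
        for name in sorted(addresses.keys() & births.keys())
    }


# Invented sample data standing in for the two source files.
addresses = {"Alex Example": "1 Sample Street", "Sam Example": "2 Sample Road"}
births = {"Alex Example": "1990-01-01", "Sam Example": "1985-06-15"}

net_new = combine(addresses, births)

print(is_sensitive({"name", "address"}))
print(is_sensitive({"name", "date_of_birth"}))
print(all(is_sensitive(record) for record in net_new.values()))
```

Under this assumed rule, neither source file trips the check on its own, but every record in the combined output does. That is why a label applied only to the original files would miss the file the assistant creates.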
While a least privilege approach is the ideal for managing these risks, Anketell said sensitivity labelling was also important, but again acknowledged it was often easier in theory than practice.
However, he noted, that difficulty creates an opening for a new service MSPs can offer.
“Sensitivity labelling is really quite difficult for small businesses, but it's the time [now] to start talking to them about it,” he told the audience.
“Now, in the next few years, it's going to become standard practice [and] I think labelling will start trickling down to the SMEs and MSPs.”
Anketell added that organisations should start working through old data, checking permissions and, where appropriate, erasing files that are no longer needed.
He said it’s not a one-and-done process, which may give MSPs another opportunity to earn recurring revenue.
“We need to actually do audits,” he said.
“In enterprise, they've got tools to do that, but with SMBs, that's probably going to be some form of billable work that you need to implement with your customers to help them understand that risk.
“It is dynamic. You could do this every quarter, and you'll see two different stories. You could go and do a complete cleanup of everything and, in three months' time, it'll be a mess again. It is an ongoing project for your customers.”
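A recurring audit of the kind Anketell describes could start as simply as the Python sketch below. The inventory format, the `EVERYONE` marker and the three-year staleness threshold are all assumptions for illustration; a real audit would pull sharing data from the storage platform's own reporting or API, which is not shown here.

```python
from datetime import date

EVERYONE = "everyone"  # hypothetical marker for an organisation-wide share


def audit(files, today, stale_after_days=3 * 365):
    """Flag files that are shared with all users or untouched for too long."""
    findings = []
    for f in files:
        issues = []
        if EVERYONE in f["shared_with"]:
            issues.append("open to all users")
        if (today - f["last_modified"]).days > stale_after_days:
            issues.append("stale, review for deletion")
        if issues:
            findings.append((f["path"], issues))
    return findings


# Invented inventory standing in for an export of file-sharing data.
files = [
    {"path": "/hr/archive/salaries-2019.xlsx",
     "shared_with": {EVERYONE}, "last_modified": date(2019, 7, 1)},
    {"path": "/finance/q3-summary.xlsx",
     "shared_with": {"finance"}, "last_modified": date(2025, 1, 10)},
    {"path": "/projects/old-tender.docx",
     "shared_with": {"sales"}, "last_modified": date(2018, 3, 5)},
]

report = audit(files, today=date(2025, 6, 30))
for path, issues in report:
    print(path, "->", ", ".join(issues))
```

Passing `today` in explicitly keeps each run reproducible, so a quarterly report can be compared with the previous one to see how quickly permissions drift after a cleanup.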
Anketell said the data privacy risk that LLMs present is unlikely to have a quick or easy fix, and that the way organisations store, organise and label files will likely change for good.
“The time for having a big fat file server, or SharePoint with a million folders and sub-folders and giving every user access to it, has gone,” he said.
“We're never going to go back to where there's not AI tools and chat bots and things like that. It's going to be sticking around, so part of that is embracing the technology itself where practical ... but we need to understand the implications.”