
Data classification key to unlocking AI, says North Carolina’s privacy chief

Martha Wewer, North Carolina's chief privacy officer, said the stakes of data classification are only growing as AI becomes more integrated into government.

As North Carolina readies its data to work with artificial intelligence, the state’s chief privacy officer said properly identifying that data is the critical first step to the state realizing AI’s full potential.

Martha Wewer, who has held North Carolina’s chief privacy officer position since last May, following the departure of Cherie Givens last February, said her office is working on a data classification project that will help the state better identify its data, aiding its AI efforts. Data classification, or identification, which she described as “knowing exactly what data you have, where it lives and how sensitive it is,” is important, she said, because it enables the privacy and security controls that address one of the biggest risks to data privacy.

“I always joke with folks that no one is as passionate about data classification as I am, and I think that really, truly is the largest risk right now as we move towards more artificial intelligence in everything that we do,” she said in an interview.

She said that while the state is doing a good job of protecting its data, proper classification will help the state and its agencies understand which data can be used with an AI tool, which data might need to be aggregated or anonymized, and which datasets are of high enough quality that the state can trust their use.


Wewer also noted that the project is important to state leadership: North Carolina Gov. Josh Stein last fall signed an executive order creating a statewide artificial intelligence framework, an AI leadership council, an AI accelerator at the Department of Information Technology and oversight teams in each agency.

“Our governor has said that he wants to build safe and trustworthy AI use within the state, and being able to identify our sensitive data and being able to label it so that we don’t have any potential data leakage is really my highest priority, and we have made a lot of steps to get to the point where we can start doing that effectively now,” Wewer said. “We already do a great job of it — agencies do work really hard and protect their data, and this will make it a lot easier for agencies.”

Wewer said data classification starts with identification, or creating an inventory. Under Stein and Chief Information Officer Teena Piccione, she said, her office classifies each piece of data as public, internal, confidential or highly sensitive. That labeling, she said, is what allows the state to create rules that specifically control data, such as: “This dataset is okay for this AI use case, this other dataset must never leave our environment.”
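As a rough illustration of the approach Wewer describes, an inventory of labeled datasets can drive simple, enforceable rules about AI use. The labels below mirror the levels named in this article; the dataset names, threshold and rule logic are hypothetical, not North Carolina's actual system:

```python
from enum import IntEnum

class Classification(IntEnum):
    # Ordered labels like those Wewer describes, least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    HIGHLY_SENSITIVE = 3

# The inventory step: each piece of data gets a label (datasets are made up).
inventory = {
    "park_locations": Classification.PUBLIC,
    "employee_directory": Classification.INTERNAL,
    "benefits_claims": Classification.HIGHLY_SENSITIVE,
}

def allowed_for_ai(dataset: str,
                   max_level: Classification = Classification.INTERNAL) -> bool:
    """A dataset may feed an AI use case only if its label is at or
    below the maximum level approved for that use case."""
    return inventory[dataset] <= max_level

print(allowed_for_ai("park_locations"))   # True
print(allowed_for_ai("benefits_claims"))  # False
```

Under a scheme like this, “this other dataset must never leave our environment” is just a rule keyed to the highest label, which is why the labeling has to happen before any AI policy can be enforced.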

Data classification, which is often done alongside a privacy threshold analysis, a questionnaire used to spot personally identifiable information, drives the formation of various protections: access controls, encryption, retention periods, logging and vendor requirements. Sensitive information not properly labeled can be exposed to the public or unwittingly used to train AI models.

Wewer said the stakes are only growing as AI becomes more integrated with government operations.


“I think that there’s a big hunger for folks to understand how privacy and data protection fits in with artificial intelligence,” she said. “You know, that’s always the elephant in the room, and I think that there are people who particularly in the privacy culture that we have at the state want to use these artificial intelligence tools, but they’re worried about data and data privacy, which I love. I don’t like that they’re worried, but I love that they’re being thoughtful about it.”

She added her office is undertaking this work with I-Sah Hsieh, North Carolina’s deputy secretary for artificial intelligence and policy, due to the natural overlaps between privacy, AI and even cybersecurity governance frameworks.

“We didn’t have a lot of guidelines when people started talking about privacy, so we sort of had to come up with our own frameworks and our own governance, aside from the regulatory structures like HIPAA,” Wewer said. “Data that we collect at the state has to be governed a certain way, and it’s that same type of governance structure that you have for artificial intelligence. So really, that’s something strategically that I-Sah and I — and even [Chief Information Security Officer] Bernice Russell-Bond, from a cybersecurity perspective — we have all been working together on.”
