Data infrastructure a major challenge for AI in California, say officials

As the role of data in government grows, agencies are searching for ways to manage the sprawl, California officials said at a recent event.

The role of the chief data officer has expanded in recent years to include tasks such as improving data infrastructure, facilitating data sharing and preparing data for AI consumption, state officials said at a technology conference in downtown Sacramento Thursday.

At the event, hosted by the California Center for Government Innovation, California IT officials from health, labor and transportation departments also stressed that chief data officers need to work more collaboratively to prepare the state for artificial intelligence by strengthening data infrastructure and bridging the state’s data silos.

“[California] is possibly one of the largest and most over engineering government structures for the integration effort that we’ve seen,” Mary Ann Bates, executive director of the California Cradle-to-Career Data System, said in a panel discussion. “So what this means is that there’s a formal structure for making decisions about what is this data system for.”

Bates pointed to Cradle-to-Career, the state’s longitudinal data system, as an example of how California’s complex data governance policies can limit progress. The program plans to integrate more than 1 billion points of siloed data (including early data on education, public services and employment) to explore the factors that shape outcomes from early childhood through adulthood.

Under California’s education code, the state’s education data is managed by a 21-person governing board with members ranging from superintendents to state representatives, creating a formal structure for making decisions about how the Cradle-to-Career data system operates.

“Naturally, there are different incentives, right, between the data providers who are responsible for the stewardship of their data and the end users of the data system,” Bates said. “It’s so critical to not be intimidated by the concept of a governance program and to be able to start small, even for the teams that don’t have dedicated data individuals.”

Jason Lally, California’s chief data officer, said that a good first step to tackling statewide data projects like Cradle-to-Career is for each agency to agree on definitions, including “data collection,” “privacy” and “security.”

“The reality of having shared definitions is, unfortunately, after taking data publicly shared that we sometimes see a disconnect in the way that data points are defined, [which] adds a layer of work that takes us back to step one of governance,” Lally said. “Definition sharing is powerful and adds value to the data, as opposed to thousands of data sets, but no one can use them because we don’t have [shared] definitions.”

In California, each state agency has different privacy and security policies that dictate what type of data can be collected and shared, officials said. The state’s Department of Health and Human Services, for example, collects patient medical history, financial information and residential addresses, and therefore adheres to stricter privacy policies than some other agencies to protect that sensitive data.

Lally recommended state agencies create data-sharing agreements to give each office the same foundation and expectations when it comes to data collection and management.

“As a recipient of those agreements, in my role as chief data officer, it forces you to have some conversations. You’re not arguing on the label. You’re stuck. You’re actually talking about power, what’s it going to look like? Who’s going to have access to how are you going to have really, really important things to do, to share the data?” Lally said, recalling his own experience in reviewing data sharing agreements. “And in the process, you can actually start to see the real problems we need to solve.”

Evolution of the chief data officer

The Thursday event echoed findings from the Data Foundation’s fifth annual survey of federal chief data officers, which found that chief data officers’ roles have expanded since the Foundations for Evidence-Based Policymaking Act of 2018. That federal law mandated that each executive agency have a chief data officer to manage the growing pools of data.

“There is more data that’s constantly being applied to the systems, but actually using the data that’s already on the systems is just as important,” Nick Hart, one of the report’s authors, told StateScoop in a recent interview. “So I think it’s this recognition that we’re seeing at all levels of government, really everywhere in society, that data is valuable as an asset, and we shouldn’t be squandering that, particularly if we, if we’re collecting it from the American public, we should actually be using it.”

Hart said generative AI tools that California and other states are testing produce massive amounts of data, expanding the amount of data available to agencies. States might not have the appropriate infrastructure to manage that, he said.

According to the Beeck Center for Social Impact and Innovation, a think tank at Georgetown University, 35 states have a chief data officer, an increase from two years ago. However, the center found that even states with the role often fail to allocate significant funds for long-term data-management initiatives and infrastructure.

The Data Foundation report recommended that as more state government organizations adopt generative AI tools, chief data officers should work more closely with chief information officers, chief information security officers and chief AI officers to manage and protect citizen data.

“Data continues to be so foundational for [generative] AI and for adding efficiencies to government operations,” Adita Karkera, report co-author, told StateScoop. “The CDO’s role will only continue to grow because of the overarching importance of data and its impact on emerging technologies.”
