How 1 billion data points could improve career outcomes for California students

California's Cradle-to-Career office recently collected its first data installment, a crucial step toward building a dashboard that serves California students, families, teachers and policymakers.

In late October, California’s Cradle-to-Career office, which manages a statewide education and workforce data project, collected its first data installment: more than 1 billion data points across 200 categories, including K-12, college admissions and post-college employment opportunities. Officials told StateScoop it’s a vital step toward building the state’s first longitudinal data system.

Integrating more than 1 billion points of historically siloed data, some of which go back 10 years, into an accessible dashboard may seem daunting, but it’s just another stage of the agency’s five-year plan, according to Mary Ann Bates, executive director of Cradle-to-Career. Bates said the agency will use Tamr, a software-integration tool that uses artificial intelligence, to link data points at the level of individual people.

“This milestone is the culmination of many years of work,” said Bates, who was appointed to lead the department in 2021 after previously serving as a senior fellow at the White House Office of Management and Budget, evaluating datasets at the federal level. 

“We can now take the information from each of our data partners, stitch it together and link it,” she continued. “Then in early next year, we’ll fold in earnings information from the Employment Development Department, which will let us answer some important questions about living wages and how young people are doing as they enter the labor market.”


Before the Cradle-to-Career Data System was established in 2019 under Gov. Gavin Newsom, information on the state’s early education, K-12 schools, colleges, public services and employment had been separate and disconnected, making it difficult to explore the factors that shape outcomes from early childhood through adulthood.

“Think about some of the questions that are hard to answer right now: what happens to young people in my community, in my county? Where do they go after high school? Do they go to higher ed? Do they go straight into the labor market? If they go to community college, do they transfer? What kinds of programs are preparing people for living wage jobs?” Bates said. “Those are the kinds of questions that people want to be able to answer.”

Bates said the user-friendly system the agency hopes to build will include resources spanning early learning, K-12 and higher education, as well as financial aid and social services, to help students and families plan for the future.

“If this data system isn’t useful to students, to families, to the guidance counselor who has tried to help students in high school, if it’s not useful for all of them, we won’t have fully succeeded in our goal,” said Bates.

Researchers and lawmakers will also be able to access detailed information on education and career outcomes and workforce trends, which will be further broken down by race, gender, ability and geography to identify areas of strength and areas needing improvement.


Bates said collected data is validated by fellow state agencies and then de-identified, or scrubbed of personally identifying markers, before being added to the interactive portal.

“That information is used only for the linking purposes, then it’s separated out,” Bates said. “So the dashboards, the query builders that we’ll be sharing with the public have a de-identified data set that doesn’t contain the personal information of the individuals.”

Forty states have some type of statewide longitudinal data system that connects data between early learning, K-12, postsecondary and workforce outcomes, according to the Education Commission of the States, an interstate agency that tracks educational policy — though only 33 of those feature a public dashboard.

For now, California’s education data comes from publicly funded K-12 schools and universities, but Bates said the agency’s five-year plan seeks to integrate data from early learning and care facilities, independent and private universities, and private workforce and social services organizations.

Written by Sophia Fox-Sowell

Sophia Fox-Sowell reports on artificial intelligence, cybersecurity and government regulation for StateScoop. She was previously a multimedia producer for CNET, where her coverage focused on private sector innovation in food production, climate change and space through podcasts and video content. She earned her bachelor’s in anthropology at Wagner College and master’s in media innovation from Northeastern University.
