Syracuse hopes University of Chicago fellows can help the city develop a model to predict which parts of its water infrastructure are most likely to fail.
Alex Koma is a freelance reporter based in Arlington, Va.
Previously, Koma was a staff reporter for StateScoop covering state and l...
Syracuse, New York, is trying to use data analytics to cut down on the hundreds of water main breaks the city deals with each year, teaming up with a group of student researchers to study why the pipes fail and how they can proactively repair them.
The city is working with the University of Chicago’s “Eric and Wendy Schmidt Data Science for Social Good” fellowship program this summer on the issue, with the ultimate goal of developing a model that can predict which water mains are most likely to rupture.
“We’re at the point where the majority of water mains in the city are a century old or even older, so they’re probably past their useful life at this point,” Sam Edelstein, the city’s chief data officer, told StateScoop. “So we’re trying to think about, ‘How do you spend the money most effectively to replace them in a way that will be productive for the future?’”
With Syracuse dealing with “anywhere from 250 to 350 water main breaks each year in the last decade,” Edelstein said it’s no surprise that Mayor Stephanie Miner has made revamping the city’s infrastructure “a big priority for her administration.” Accordingly, she directed Edelstein and the rest of the city’s “Innovation Team” to start thinking about creative ways to tackle the problem.
That led Edelstein and the team to meet with Cincinnati’s performance management and data analytics group, known as “CincyStat,” several months ago to see how the Ohio city was using data to confront its own problems with urban blight. Edelstein talked with Chad Kenney, then Cincinnati’s chief performance officer, and Edelstein said those conversations “got my mind going” about how to apply a similar strategy to the infrastructure issue.
Kenney ended up joining the data science fellowship’s leadership team, and Edelstein soon applied for the chance to partner with the fellows on a project.
“One of the reasons we like this issue is, if Syracuse has this issue, many cities probably have this issue,” Kenney said. “If we can solve it for them, then this is something we could scale to other cities and hopefully help them solve the same problems.”
Indeed, the fellows chose to work with Syracuse as one of their 12 collaborations this summer, and Kenney even agreed to serve as project manager for the partnership.
To start off, Edelstein said he worked with the fellows closely on finding what data the city already maintains that could help them examine why water mains fail, and what information they could start pulling in to add more context.
“We tried to come up with some plans for how we might do some data entry to fill out that missing data, and then we also really worked hard on what the scope of the project should be,” Edelstein said. “We’d love for them to be able to say, ‘A water main break is going to happen on the 200 block of East Washington Street next week, so you better go and fix it,’ but we also knew that probably wasn’t possible. But we also wanted more than just some descriptive statistics and exploratory analysis on what are the trends of water main breaks; we wanted to be predictive. So what we agreed to is that they were going to give us a list of which are the riskiest water mains in the city based on all of the previous information about water mains that we had.”
Edelstein also worked to connect the researchers with the city’s water department and GIS specialists, so they could start sharing information about how Syracuse currently manages its water infrastructure and shape their efforts.
“There’s expertise in the water department, the roads, they all understand what factors lead to water main breaks and how to handle the infrastructure, so what we envision is taking their tribal knowledge and incorporating it into one, mainstream data-driven process,” said Avishek Kumar, one of the fellows working on the project.
Edelstein and his team made a trip to Chicago to meet with the fellows in person, but Kenney wanted the researchers to get a firsthand look at Syracuse’s infrastructure for themselves. In July, he led the group on a two-day visit to the city, where they met the mayor and even went on ride-alongs with water and transportation workers to see the pipes and roads they’d be studying up close.
But they also took time to examine the city’s data setup and made some surprising discoveries.
“They had probably several decades’ worth of engineering books where the engineers would actually draw out where they had installed pipes under the street, the size of it and materials, and a lot of that data had not yet been digitized,” Kenney said. “So there was a piece of the trip where we were literally going through the engineering books and pulling out data from the books so we can have some of that information in building the model.”
With that data in hand, Kumar said the fellows have started to pinpoint which factors could contribute to a water main failing, with plans to study them in more detail.
“We want to look at, say, the age of the pipes replaced, the materials of the pipes, the diameter and location,” Kumar said. “From there, we can see in certain parts of the city, the pipes are too old and need to be replaced, or find other factors and say different parts of the city where traffic would play a role.”
By the end of the summer, Kumar hopes they can give the city a predictive model to start testing and drive some “proactive maintenance” of the pipes.
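To make the idea of a ranked list of risky mains concrete, here is a minimal Python sketch of one very simple baseline: score each pipe segment by the average number of past breaks among mains of the same material and age band, then sort. This is not the fellows' model, which the article does not describe in detail. The `Main` record, its field names, the age bands, the reference year and the sample rows are all hypothetical and made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Main:
    """One water main segment; every field here is a hypothetical illustration."""
    segment_id: str
    install_year: int
    material: str
    diameter_in: float  # carried along, but not used by this simple baseline
    past_breaks: int    # breaks recorded during the observation window


def age_bucket(install_year: int, as_of_year: int) -> str:
    """Group mains into coarse age bands so sparse records still pool together."""
    age = as_of_year - install_year
    if age >= 100:
        return "100+"
    if age >= 50:
        return "50-99"
    return "0-49"


def rank_by_risk(mains, as_of_year):
    """Rank segments by the average break count of similar mains.

    "Similar" is an assumption of this sketch: same material and same age band.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [total breaks, segment count]
    for m in mains:
        key = (m.material, age_bucket(m.install_year, as_of_year))
        totals[key][0] += m.past_breaks
        totals[key][1] += 1

    scored = []
    for m in mains:
        key = (m.material, age_bucket(m.install_year, as_of_year))
        breaks, count = totals[key]
        scored.append((breaks / count, m.segment_id))
    # Highest group break rate first; ties broken by segment id for stable output.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Made-up records for illustration only; not Syracuse data.
SAMPLE = [
    Main("A", 1905, "cast iron", 6.0, 3),
    Main("B", 1910, "cast iron", 8.0, 1),
    Main("C", 1975, "ductile iron", 8.0, 0),
    Main("D", 1990, "ductile iron", 12.0, 1),
]

if __name__ == "__main__":
    # The reference year is an arbitrary choice for the example.
    for score, segment_id in rank_by_risk(SAMPLE, as_of_year=2016):
        print(f"{segment_id}: {score:.2f} breaks per similar main")
```

A real model would likely draw on more of the factors Kumar lists, such as diameter, location and traffic, and would be checked against breaks it had not seen. The grouping approach here only shows how historical records can be turned into an ordered list for maintenance crews.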
“Syracuse being sort of a middle-sized city, it’s actually very easy to deploy something on the ground and change it as needed to optimize it,” Kumar said. “From there, as we add more data and understand better what leads to main breaks, we can optimize our model.”
Edelstein is eager to start experimenting with the model, since he said it could help Syracuse either “replace water mains when we have the dollars to do that, or attack all the different things around a water main break to make sure the least number of people are affected for the least amount of time by it.”
But even if the researchers can’t manage to come up with a workable model, he thinks their work has already helped the city sharpen its own data management practices.
“This project gives us a good sense of what we need to do to make sure our data is usable,” Edelstein said. “This is something that can help us start telling the story about why using data in a city is important.”
Yet the fellows are optimistic that they’ll emerge with something that can make a big difference for any city with similar infrastructure concerns. Kumar notes that all the algorithms and code they’re working with are open source, “so any interested party can look at it and tweak it if they’d like,” and Kenney foresees other cities seizing on their efforts in the future.
“The foundation is definitely transferable to other cities, and then it just comes down to a willingness and open-mindedness to try operating in this new way using data,” Kenney said.