New York City’s IT staff is opening up the code behind its geocoding program, which tags street addresses with location-based attributes, giving developers a new chance to experiment with the service.
The city’s Department of Information Technology and Telecommunications released its “Geoclient” service under an open source Apache software license last week, a move that IT leaders hope will give programmers unique access to the tool. The software can identify everything from a location’s longitude and latitude to its city council district, opening up an array of possibilities for developers outside city hall.
“When some of these startups or universities are doing analyses, they’re correlating and aggregating a lot of data,” Colin Reilly, the department’s director of geographic information systems, told StateScoop. “Instead of just mapping it as points, and doing heat maps and things like that, they can aggregate and summarize that data at a tax parcel level and gain greater insights into what’s happening in the city of New York.”
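The parcel-level aggregation Reilly describes can be illustrated with a minimal sketch. It assumes each record has already been geocoded and carries a tax parcel identifier; the field names `parcel_id` and `complaints` and the sample values are hypothetical and are not taken from Geoclient’s actual output.

```python
from collections import defaultdict

# Hypothetical, already-geocoded records. The "parcel_id" and "complaints"
# field names and all values are illustrative assumptions only.
records = [
    {"address": "1 Example St", "parcel_id": "P-001", "complaints": 2},
    {"address": "1 Example St Apt 2", "parcel_id": "P-001", "complaints": 1},
    {"address": "9 Sample Ave", "parcel_id": "P-002", "complaints": 4},
]


def summarize_by_parcel(rows, key="parcel_id", value="complaints"):
    """Group rows by parcel and return a record count and value total per parcel."""
    totals = defaultdict(lambda: {"count": 0, "total": 0})
    for row in rows:
        bucket = totals[row[key]]
        bucket["count"] += 1
        bucket["total"] += row[value]
    return dict(totals)


summary = summarize_by_parcel(records)
for parcel, stats in sorted(summary.items()):
    print(parcel, stats["count"], stats["total"])
```

Grouping on a shared parcel identifier, rather than plotting each address as its own point, lets several addresses that sit on the same lot roll up into a single row.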
Reilly said that the city has been using a geocoding tool for 30 years now, but it was “originally mainframe based” and wasn’t “the easiest to connect to and access from modern programming languages.”
By 2013, his team had developed Geoclient as a Web-based interface for the service, and a year later, they released an API for the tool to the public. Reilly said the city’s increasing collaboration with civic-minded technologists in the aftermath of the ruinous Hurricane Sandy in 2012 convinced the team that the public should get its hands on the geocoder, and the tool has only grown in popularity since then.
“It’s a little too popular for us to keep pace with it,” Reilly said. “So we decided, why don’t we release the code and those heavy users, power users can then download that and run it on their own infrastructure.”
Matthew Lipper, the lead developer for Geoclient, said the move to open up the code has already been hailed by a variety of businesses using the tool. He noted that some startups focused on facilitating apartment searches were using Geoclient to “validate addresses and collect statistical information about a neighborhood” and now no longer have to depend on the city’s computing power.
“They depend on running however many hundreds of thousands of locations through the geocoder that we would have to sort of throttle, in making sure that everybody gets their fair share of resources,” Lipper said. “Now the real power users can move to the next level and run this same application in house.”
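For developers who keep using a shared service, the throttling Lipper mentions has a client-side counterpart: pacing a large batch so it stays under a request rate. The sketch below is one simple way to do that, not the city’s own mechanism. The `geocode` argument is a hypothetical stand-in for whatever lookup function a developer supplies, and the rate limit shown is an arbitrary assumption.

```python
import time


def geocode_batch(addresses, geocode, max_per_second=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Call geocode(address) for each address, at most max_per_second times a second.

    `geocode` is a caller-supplied function (hypothetical here); `sleep` and
    `clock` are injectable so the pacing can be tested without real delays.
    """
    min_interval = 1.0 / max_per_second
    results = []
    next_allowed = clock()
    for address in addresses:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)  # pause until the next request slot opens
        results.append(geocode(address))
        # Schedule the next slot one interval after this one.
        next_allowed = max(next_allowed, clock()) + min_interval
    return results


if __name__ == "__main__":
    # Stand-in lookup that just normalizes the text; a real lookup would go here.
    print(geocode_batch(["1 example st", "9 sample ave"], str.upper, max_per_second=2.0))
```

Running the geocoder on in-house infrastructure, as the open source release allows, removes the need for this kind of pacing against the city’s servers.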
Lipper thinks the release of the code should encourage other businesses to start using it as well, as it provides easier access to a wealth of data collected by city agencies — such as the departments of Finance and City Planning — that is constantly updated by workers.
“It’s exposing internal identifiers that are used by different agencies for different purposes, gathering them all in one place and putting this official scale of what this identifier means,” Lipper said. “It allows the public to interact and interoperate with the city in the same way that the existing application itself is used by the different agencies to communicate with each other.”
But Lipper and Reilly feel that making the code available to a wider community of developers offers benefits for the department as well. With more eyes on their work, they believe they’ll have more insight into what they can improve.
“It’s on GitHub, so the more people that use it and like it and need it, the more people will be saying, ‘Hey, I was doing this and I discovered this bug,’” Lipper said. “They could develop a fix for the bug, which means that we don’t have to fix it.”
Lipper added that this kind of input is especially valuable for the department, since normally “the nature of our organization is such that we can’t just let anybody come and do anything to the code.”
Reilly hopes that the code’s release signals to the public the city’s commitment to encouraging innovation, and envisions Geoclient working better for all users now that it’s completely out in the open.
“There’s software feedback and there’s data feedback that could come through and that enriches the data and benefits everyone,” Reilly said.