Digging Up the Dirt With Mapping of Underground Chicago

By Marty Graham, Contributor

When construction excavation goes deep on the University of Illinois Chicago campus, a team of data scientists, mapmakers, and engineers shows up to look into the hole so regularly that its members have their own hard hats and safety gear.

Why the frequent visits? Because excavations offer a rare chance for UIC professor Isabel Cruz and her team to see underground and into their project: a data-informed map of buried infrastructure on the 160-year-old campus. The National Science Foundation-funded project is being created with tools and experts from disparate disciplines. Mapping, analysis of historical data, and machine learning are used to draw cohesive and comprehensive maps of the buried water and sewer pipes, conduits, wires, and other unknowns that crisscross the 311-acre campus.

The need to figure out where this infrastructure is located couldn’t be clearer: The Common Ground Alliance estimates that someone accidentally hits buried utilities every 100 seconds in the U.S., causing billions of dollars of damage and service interruptions every year. “You know how we see those terrible gas explosions in the news because someone pierced a pipe—it reminds me this work is worthwhile,” Cruz said.

Besides safety, there is a solid economic argument for the effort: When municipal construction in older cities like Chicago uncovers unexpected and unidentified pipes or other buried infrastructure, the project can halt for weeks, with the hole open, while workers figure out what they found and what to do about it.

In older cities, more than 100 years of infrastructure records can simply be missing. Parts of New York City, for instance, have an entire abandoned city underneath: brick sewer tunnels and desolate train platforms.

Lost Maps, Lost Time, Lost Money

Cruz was working on another project with city of Chicago utilities managers when she learned about the problems pipe discovery could cause. “They told me they don’t even know where some of the pipes are, and when they find a pipe they can’t identify, everything stops for three weeks,” she explained.

There’s no practical way to identify exactly where abandoned systems and those still in use are located. Does the sewer main run down the exact middle of the street, or did widening or shifting the street move its centerline away from the pipe?

“Obviously, we cannot open holes to see where the pipes are,” Cruz said. “We want to understand what is going above the ground so we can understand what is going on underground.”

Cruz is a data scientist, and data science is one of several disciplines collaborating on the project. Her team includes environmental and civil engineers, an expert in algorithms, and two members with geospatial skills who know how to work with queries and massive databases. They have a security expert, and there’s a team member who is interviewing engineers and workers about unusual construction problems, like what happens when pipes from one city cross into another.

Infrastructure mapping is a growing field, with researchers trying sonar and 3D visualization approaches. Each approach has its strengths and weaknesses, and Cruz sees the project as complementary to other methods.

The idea appealed to the National Science Foundation, too. In 2016, the foundation funded a three-year project, noting that “cities are cyber-physical systems on a grand scale, and developing precise knowledge of their infrastructure is critical to building a foundation for the future smart city.”

The National Academy of Sciences estimates that just 35 percent of municipal utility records are complete and up to date, and only a third of those show how the infrastructure was actually built. The academy notes that the assessment was made 45 years ago, and things have not gotten better. Cities that try to modernize more often than not learn they don’t know for sure exactly where the existing or retired pipes, wires, and lines are located.

Finding that infrastructure through sonar is exceedingly expensive and riddled with flaws. For instance, sonar does not always detect plastic piping, and its readings can be distorted by elements in the soil and pipes. The university team’s approach of starting with what is known and then crunching data is proving both affordable and fairly accurate when the team gets to “ground-truthing,” or in other words, going out to see if the digital work matches what’s in the ground.

“Much of what we do is by inference,” Cruz said. “We know that buildings were built a certain way and that, for example, the main runs up the middle of the street. Because of that, it’s important we have ground-truthing, otherwise we’ll search for things that aren’t there.”

Data Matters

Everything comes down to the data, Cruz continued, particularly with the machine learning portion. Data scientists spent long hours cleaning and organizing an array of historical, sketch, map, and engineering data.

The team started with good news: The University of Illinois Chicago had already converted its legacy paper maps to digital layers of data, which is the single largest data-entry task in a mapping project. The next step was making sure that information was coherent, and that’s where the team received bad news: The data was a mess of overlapping and duplicate records, with startling gaps.

“We cleaned the text data—there were typos everywhere, like the same business name spelled slightly differently so it appeared twice. We had to go through and reconcile different records from different databases,” Cruz said.
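
The kind of reconciliation Cruz describes, catching one business entered under slightly different spellings, can be illustrated with fuzzy string matching. The following is a minimal sketch, not the team's actual pipeline: the record names, the 0.85 threshold, and the helper functions are all assumptions made for illustration, and it uses only Python's standard difflib module.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_near_duplicates(names, threshold=0.85):
    """Greedily group names that resemble the first member of an existing group."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    return groups


# Hypothetical records, invented for illustration only.
records = [
    "Halsted Street Hardware",
    "Halsted Steet Hardware",  # typo: missing "r"
    "Halsted St. Hardware",    # abbreviation
    "Taylor Street Bakery",
]
groups = group_near_duplicates(records)
for group in groups:
    print(group)
```

In a sketch like this, the threshold is the judgment call: set it too low and distinct businesses merge, set it too high and typos slip through, so flagged groups would still need a human look.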

The first layer of the dynamic, data-filled map is infrastructure information, which the team already had: There were plenty of engineering drawings, and views of the streets and buildings were available.

But there were gaps and limits. Some information is proprietary or privately held, and some is too sensitive to hand out, such as the location of critical power or water utility assets that, if attacked, could plunge a city into darkness or poison a water supply. And sometimes the maps don’t line up, a common problem in geospatial IT, either because a dozen different coordinate systems plot the earth’s surface slightly differently or because no coordinate system was used at all.
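
One common way to bring a drawing with no coordinate system onto a shared grid is to pick two landmarks that appear on both and fit a scale, rotation, and shift between them. The sketch below shows that idea in plain Python. It is an illustration under stated assumptions, not the project's method: the control points and function names are invented, and real georeferencing usually relies on more than two control points and on a dedicated projection library.

```python
def fit_similarity(src_a, src_b, dst_a, dst_b):
    """Fit the scale, rotation, and shift that map two source points onto two targets.

    Points are (x, y) tuples. Treating each point as a complex number z,
    the transform is z' = a * z + b, where a encodes scale and rotation.
    """
    z1, z2 = complex(*src_a), complex(*src_b)
    w1, w2 = complex(*dst_a), complex(*dst_b)
    if z1 == z2:
        raise ValueError("control points must be distinct")
    a = (w2 - w1) / (z2 - z1)
    b = w1 - a * z1
    return a, b


def apply_similarity(params, point):
    """Map a point from drawing coordinates into the target grid."""
    a, b = params
    w = a * complex(*point) + b
    return (w.real, w.imag)


# Hypothetical control points: two building corners found both on a scanned
# drawing (arbitrary drawing units) and on a site grid (feet). Values invented.
params = fit_similarity((0, 0), (10, 0), (1000, 2000), (1000, 2050))

# A valve drawn at (4, 2) on the old sheet lands here on the site grid.
valve_on_grid = apply_similarity(params, (4, 2))
print(valve_on_grid)
```

Here the fitted transform scales the drawing by a factor of five and turns it a quarter turn, which is the sort of mismatch that makes two otherwise correct maps fail to line up.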

Mapping a Growing Campus

The UIC campus is located in central Chicago, southwest of where two major freeways cross. It’s in near-west Chicago, not far from where, less than two decades after the first school was founded, the city burned and was rebuilt. With roots in a small private pharmacy college founded in 1859, the university grew to include medical and dentistry schools, and became a founding part of the University of Illinois system a decade later. By 1987, there were more than a dozen colleges ranging from architecture to urban planning. This past year, about 33,000 students were enrolled. The university is in the midst of developing a south campus, which gives Cruz’s team the opportunity to go look in trenches.

“We could validate some when crews would go work on a project and we would go there and see what they found,” she said. “Even at UIC, the campus evolves and it will change, possibly [with] new street alignments. It’s something we will keep looking at as they dig the foundations to the new buildings.”

As with most machine learning efforts, having as much data as possible is vital to accuracy and comprehension, she said. But the inferred locations have to be checked in the real world, which requires the mapmakers to ground-truth them.
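
Ground-truthing comes down to comparing where the map says a pipe is with where a crew actually finds it. A minimal sketch of that comparison follows; the coordinates are invented, the function names are hypothetical, and root-mean-square error is only one reasonable way to summarize the misses, not a metric the team is known to use.

```python
import math


def position_errors(predicted, observed):
    """Distance between each predicted location and its matching field observation."""
    return [math.dist(p, o) for p, o in zip(predicted, observed)]


def rmse(errors):
    """Root-mean-square of the individual errors."""
    if not errors:
        raise ValueError("need at least one error value")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical locations in feet on a site grid, invented for illustration.
predicted = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]  # where the map put the pipe
observed = [(0.0, 3.0), (10.0, 4.0), (20.0, 0.0)]   # where it was seen in the trench

errors = position_errors(predicted, observed)
overall = rmse(errors)
print(errors, round(overall, 2))
```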

Cruz and her colleagues are training an algorithm on data about construction history and practices. The training data also include points where infrastructure has been positively identified, like a water main repair 20 years ago that marks exactly where the water line is located. The goal is a predictive map of where hundreds of miles of underground pipes, tunnels, channels, and passageways may be found.
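
To make the idea concrete, here is a deliberately simple stand-in for that kind of prediction. It is not the team's algorithm, which the article does not detail. It swaps in inverse-distance weighting: start from a rule of thumb (the main runs down the street centerline) and let nearby confirmed sightings pull the estimate toward what was actually found. Every name and number below is an assumption made for illustration.

```python
def predict_offset(station, known, prior_offset=0.0, prior_weight=1e-4):
    """Estimate a pipe's sideways offset from the street centerline at `station`.

    station: distance along the street, in feet.
    known: list of (station, offset) pairs where the pipe was actually seen.
    The prior (pipe on the centerline) carries a small fixed weight, so it
    dominates only far from every observation.
    """
    weighted_sum = prior_offset * prior_weight
    total_weight = prior_weight
    for seen_at, offset in known:
        distance = abs(station - seen_at)
        if distance == 0:
            return offset  # an exact sighting overrides any estimate
        weight = 1.0 / distance ** 2  # closer sightings count far more
        weighted_sum += weight * offset
        total_weight += weight
    return weighted_sum / total_weight


# Hypothetical sightings: a past repair found the main 2 ft off the centerline
# at station 100, and a trench exposed it 4 ft off at station 300.
known_points = [(100.0, 2.0), (300.0, 4.0)]

at_repair = predict_offset(100.0, known_points)
near_repair = predict_offset(110.0, known_points)
far_away = predict_offset(10000.0, known_points)
print(at_repair, near_repair, far_away)
```

Close to a confirmed sighting the estimate follows the evidence, and far from any sighting it falls back on the rule of thumb, which is exactly where ground-truthing matters most.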

“As a data scientist, I expect to validate things,” she said.

The project is of limited scope because of the nature of the funding: Foundation grants fund experimental research rather than huge commercial projects. It’s the experimental part that makes the work interesting, according to Cruz.

“Our focus is on using data science to complement the other disciplines,” she continued. “Environmental engineers can publish what computer science helped them prove or do. They learn to bring data integration to their own projects.”

With funding for the three-year project ending in August, Cruz is already tackling other cityscape projects. She likes working where the rubber meets the road—or beneath it.

“It’s very interesting doing this work; I learn something new every day. These are completely new problems we are finding and solving, and when I ask people how they can solve a certain problem we’re having, they say there is no method.”