How Can We Accurately Digitize Notes on Paper Maps from Field Operations?

Using image recognition software, we can speed up the journey from field markups to digital data.

Many industries still rely on paper documents to get work done, despite the adoption of iPads in field operations. Cities run many workflows that collect information from the field on paper maps and then convert it to a digital format in the office: inspecting pipes for damage, marking which street segments need to be repaved, flagging streetlights that have burned out. The current process requires a manual translation step: someone, usually in GIS, looks at a marked-up map and replicates it in their digital maps. In the case of pipe inspections, this means hand-selecting each pipe segment so it can be highlighted in a map for easy review by an engineer or city manager. That's inefficient, and sometimes inaccurate, when dozens or even hundreds of pipes have been selected. We're going to look at how we can use Google's Cloud Vision API to make this easier.


Here's a flow diagram to show how this works in the context of wastewater pipes being inspected and selected for repair:

(1) Markups (hand-drawn notes) are made in the field during the inspection process. The identifiers of pipes that need to be reconstructed are crossed out so they can't be read by the Cloud Vision API (more on this later).
(2) The marked-up maps are handed off to GIS and scanned into a digital format. A local Python script sends the image data to the Google Cloud Platform (GCP).
(3) The pipe identifiers that are still legible are copied into a CSV file for reference.
(4) Using the identifiers from step 3, we run a query to select the pipes that were detected. Then we reverse that selection, leaving only the pipes chosen for repair.
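Step 3 can be sketched in a few lines of Python. The identifier format below (two uppercase letters, a dash, four digits) is a made-up example for illustration; real pipe IDs will vary by utility, so the pattern would need to match your own numbering scheme.

```python
import csv
import re

# Assumed pipe ID format for illustration, e.g. "WW-1042".
# Adjust the pattern to match your utility's actual numbering scheme.
PIPE_ID_PATTERN = re.compile(r"\b[A-Z]{2}-\d{4}\b")

def extract_pipe_ids(ocr_text):
    """Pull unique pipe identifiers out of raw OCR text, sorted for stable output."""
    return sorted(set(PIPE_ID_PATTERN.findall(ocr_text)))

def write_reference_csv(pipe_ids, path):
    """Write the legible (NOT crossed-out) identifiers to a CSV for GIS staff."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["pipe_id"])
        for pipe_id in pipe_ids:
            writer.writerow([pipe_id])

# Stand-in for what the OCR step might return for a scanned map:
sample_ocr_text = "WW-1042 WW-1043\nmanhole WW-1051 WW-1042"
ids = extract_pipe_ids(sample_ocr_text)
print(ids)  # ['WW-1042', 'WW-1043', 'WW-1051']
```

Deduplicating and sorting the matches makes the CSV stable across runs, which helps when spot-checking the results against the scanned map.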

Google Vision API

The Google Vision API works with a variety of image-based data. It can assign labels to images and quickly classify them into millions of predefined categories; detect objects and faces; read printed and handwritten text; and build valuable metadata for an image catalog. The Vision API uses optical character recognition (OCR) to detect text within images in more than 50 languages and various file types. This is a powerful way to quickly make sense of your data and make your organization more efficient. There's a growing list of applications for it, and here we're applying it to paper maps.
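For reference, here's a minimal sketch of what the OCR call might look like with Google's official Python client (`pip install google-cloud-vision`). It assumes credentials are already configured (e.g. via the `GOOGLE_APPLICATION_CREDENTIALS` environment variable); the import is kept inside the function so the rest of a script can run without the package installed.

```python
def ocr_scanned_map(image_path):
    """Send a scanned map image to the Cloud Vision API and return the detected text."""
    # Import here so the package is only required when this function is called.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())

    # document_text_detection is tuned for dense text; text_detection also works.
    response = client.document_text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text
```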


Let's refer to the map below to help us understand how the process laid out above works. The map shows an area in Hillcrest with wastewater pipes in black and manholes in red. Each pipe segment displays its unique identifier. This is the critical information field crews use to communicate to GIS staff which pipes need to be reconstructed. Keep in mind the context: field crews are going out to inspect pipes and will record their findings on a paper map.

Displayed below is that same map, but it has been marked up with edits in the field. You can see the pipe segments that were identified highlighted in green with the corresponding identifiers crossed out. The crossed-out identifiers won't be legible when processed by the Vision API. When we process the image, we get a list of all the pipes that have NOT been highlighted. In other words, all the pipes with clearly labeled identifiers are read by the Vision API. This is good for us: we can run a query to select all those pipes from the resultant CSV file, then reverse that selection to be left with what the team identified in the field.
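That reversed selection boils down to a set difference. With hypothetical identifiers standing in for the GIS layer and the CSV output:

```python
# All pipe IDs present in the GIS layer for this map extent (hypothetical values).
all_pipes = {"WW-1042", "WW-1043", "WW-1051", "WW-1060", "WW-1061"}

# IDs the Vision API could still read, i.e. the pipes that were NOT highlighted
# (in practice, loaded from the reference CSV produced earlier).
legible_ids = {"WW-1043", "WW-1060", "WW-1061"}

# Reversing the selection: anything not detected was crossed out in the field.
selected_for_repair = sorted(all_pipes - legible_ids)
print(selected_for_repair)  # ['WW-1042', 'WW-1051']
```

In a desktop GIS, the same reversal is typically a "switch selection" after selecting by attribute against the CSV values.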

Future Work

This approach can help save time and improve accuracy by automating an essential step in the capital improvement project process. Rather than manually selecting each pipe from the map, we can scan it and run some code. There are also non-technical challenges to consider, and these are usually the harder ones to overcome. Field crews would have to be retrained to ensure the proper pipe segment identifiers are crossed out. Pipe identifiers would have to be displayed on the map, which could be a challenge at larger scales with hundreds of pipe segments. A method to verify that everything worked properly would have to be developed as well.

Despite these challenges, I think this approach illustrates what's possible with cloud services that extract text from images. Something as routine and common as digitizing new data from paper documents can be made easier and faster. With local governments always pressed for resources, many small process changes can add up to significant improvements and mean better cities for all. What other workflows do you think this approach could be applied to? I'm curious to know, and I'd like to hear from you; email me at

Quick note on cloud providers

I frequently use Amazon Web Services in my posts here on The Geo Cloud. I do this because it's the platform I learned first and the one I'm most comfortable with. For the most part, everything I write about can be done on other cloud platforms. In this case, I used the Google Cloud Platform because it worked better than Amazon Textract in pulling the pipe identifiers off the map. In the future, I hope to have more time to explore other cloud platforms, both large and small.