GENERAL VECTORIZATION OF LINE OBJECTS IN DRAWN MAPS

Drawn maps consist of multiple object types. The most important are line objects, which represent infrastructure. Attributes of these objects are essential for many tasks, but in raster format they provide only low-level information. Vectorization must be used to obtain vector data. In this paper a general vectorization process consisting of five stages is proposed. For each stage a short discussion and basic recommendations are given, and some suitable methods are presented.


Introduction
Extraction of graphical information from raster maps is a complex problem. Vector data, which represent the necessary information on networks and need less storage space, are used in many tasks and are essential for GIS and CAD applications. Drawn maps consist mainly of line objects, whose length is much larger than their thickness. These objects may represent road infrastructure, but the raster data provide only low-level information. A vectorization process is needed to transform these data into vector form. Preserving the connectivity and the shape of line objects are the two most important conditions for a vectorization process.

Vectorization process
There are three main approaches that can be used in a vectorization process: sparse-pixel vectorization (SPV), matching opposite contours (MOC) and skeletonization.
SPV [1,2] is the fastest of these approaches because it processes only a small portion of the pixels in an image. The main disadvantage of SPV methods is that they do not preserve connectivity if the parameters are not set up correctly. MOC methods [2,3] do not visit all pixels and work either directly on the contours or on a polygonal approximation of them. However, hard-to-master thresholds and heuristics need to be used for complex drawings [2].
Skeletonization methods compute the medial axis, called the skeleton. There are many papers about different skeletonization techniques [4,5]. Some of these techniques can directly produce vectors. The most common are thinning techniques [6,7,8], which remove outer pixels layer by layer in an iterative process until a one-pixel-thick skeleton is produced. Thinning conserves the connectivity and shape of the objects and is used in the proposed vectorization process. On the other hand, thinning is sensitive to noise and sometimes produces distortion at junction points.
The vectorization process proposed in this paper consists of 5 stages (Fig. 1) and is focused on line objects. Because of the huge variety of possible representations of individual objects, there is no specific vectorization process that can deal with all drawn maps. For line objects the situation is easier. These objects are represented in a very similar manner, so if the right methods are applied at each stage, the process can be used for a wide range of different maps.

Segmentation
Besides line objects, maps usually contain other objects such as text, symbols and regions. In this stage line objects are separated from the others. The result is a binary image in which all line objects should be represented as black pixels and all other objects, including the background, as white pixels. This reduces the amount of data and also speeds up and simplifies later processing.
The separation of line objects is accomplished by thresholding. A general condition for the threshold can be defined as follows:
\[ g(x,y) = \begin{cases} 1 & \text{if } f(x,y) \geq T \\ 0 & \text{otherwise} \end{cases} \qquad (1) \]
where g(x,y) is the value of the pixel with coordinates x, y in the resulting binary image, f(x,y) is the original value of the pixel and T is a threshold value.
Several threshold values are usually needed to separate the objects in an image correctly. Many local and global threshold techniques can be used. Although a correct automatic setup of threshold parameters for all kinds of maps is not possible, some degree of automation is achievable [9, 10, 11].
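The thresholding condition can be sketched in a few lines of Python. This is only an illustration: the function name `threshold`, the `above` flag and the convention that 1 marks a line-object pixel are assumptions of the sketch, and the comparison direction in practice depends on whether line objects are darker or lighter than the background.

```python
def threshold(image, t, above=True):
    """Binarize a grayscale image given as a list of rows.

    Returns 1 where the pixel satisfies the threshold condition and 0
    elsewhere. With above=True the condition is f(x, y) >= t; with
    above=False it is f(x, y) <= t (useful for dark lines on light paper).
    """
    if above:
        return [[1 if value >= t else 0 for value in row] for row in image]
    return [[1 if value <= t else 0 for value in row] for row in image]


if __name__ == "__main__":
    scan = [[10, 200],
            [128, 127]]
    for row in threshold(scan, 128, above=False):
        print(row)
```

When several threshold values are needed, the same function can be applied once per value and the resulting binary images combined.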

Pre-processing
In pre-processing, the binary image is processed to remove imperfections and to amplify the desired features of line objects. The image should be improved with regard to the weaknesses of the techniques used in the processing step, to prevent later errors. Imperfections in a binary image are caused by the poor condition of paper maps, the scanning process, the segmentation methods used, the complexity of the map and the variety of possible representations. In Fig. 2 thinning was used to create a skeleton from an inaccurate input. The acquired skeleton differs from the desired skeleton because thinning is sensitive to imperfections in the original image.
Accurate pre-processing for thinning should fulfill these tasks:
- remove isolated small objects
- reduce boundary noise (contour intrusions and protrusions)
- fill small holes in objects
- connect disconnected objects
The binary morphology operators opening and closing provide very good results for these kinds of tasks. Opening and closing can be combined to remove all the mentioned imperfections. The order of operations and the number of repetitions are the most important decisions that need to be made in order to obtain accurate results [12].
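A minimal sketch of binary opening and closing is given below. It assumes a 3×3 square structuring element, a foreground value of 1 for line-object pixels, and background (0) outside the image border; the text does not prescribe any of these choices, and the function names are illustrative.

```python
def _window(img, r, c):
    """Yield the 3x3 neighbourhood of (r, c); outside the image counts as 0."""
    rows, cols = len(img), len(img[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0


def erode(img):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[1 if all(_window(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[1 if any(_window(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes small objects and protrusions."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills small holes and narrow gaps."""
    return erode(dilate(img))
```

Opening addresses the first two tasks in the list (isolated objects, protrusions), while closing addresses the last two (holes, small disconnections). With the zero border assumed here, objects touching the image edge are eroded there, so a real implementation may prefer a different border rule.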

Processing
In this step the binary image is prepared for raster-to-vector conversion. Features such as the thickness of objects, information about their contours and the number of components can be extracted, and some level of feedback is possible by measuring the quality of the result [13] and analyzing the extracted features.
Thinning, which produces a skeleton, is used in the proposed vectorization process. The skeleton is ideal for line objects because it is represented by a set of one-pixel-thick lines, which are a natural representation of line objects such as roads. Thinning should fulfill these requirements:
- The skeleton should be one pixel thick
- Connectivity should be preserved
- The shape and position of the junction points should be preserved
- The skeleton should lie in the middle of a shape (medial axis)
- The skeleton should be immune to noise (especially to boundary noise)
- Excessive erosion should be prevented (the length of lines and curves should be preserved)
As noted previously, thinning is sensitive to noise and to other imperfections such as holes in objects. These problems can be solved by the binary morphology operators recommended in the previous section. Another often mentioned disadvantage is low computational speed, which can be improved by using the contour approach to thinning proposed in [14].
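To make the layer-by-layer idea concrete, the sketch below implements the widely known Zhang-Suen parallel thinning scheme. It is used here purely as an illustration and is an assumption of this example: the text does not state which thinning algorithm is applied, and the contour approach of [14] is not reproduced. Foreground pixels are 1, and pixels outside the image are treated as 0.

```python
def zhang_suen_thin(image):
    """Thin a binary image (list of rows, 1 = foreground) to a skeleton."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9, clockwise starting from the pixel above (r, c).
        def at(rr, cc):
            return img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        return [at(r - 1, c), at(r - 1, c + 1), at(r, c + 1), at(r + 1, c + 1),
                at(r + 1, c), at(r + 1, c - 1), at(r, c - 1), at(r - 1, c - 1)]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # two sub-iterations per pass
            to_clear = []
            for r in range(rows):
                for c in range(cols):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    b = sum(n)  # number of foreground neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(1 for i in range(8)
                            if n[i] == 0 and n[(i + 1) % 8] == 1)
                    if not (2 <= b <= 6 and a == 1):
                        continue
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if ok:
                        to_clear.append((r, c))
            # Delete the marked pixels only after the whole image was scanned.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img
```

The conditions on `a` and `b` protect end points and connectivity, which corresponds to the second and last requirements in the list; noise immunity is not provided by this scheme and still relies on pre-processing.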

Raster-to-vector conversion
Conversion of a one-pixel-thick skeleton to vector form can be performed in two main steps: node recognition and edge recognition. Usually the local approach is used, in which the 3×3 neighborhood of each pixel is inspected. Candidates for nodes are recognized by having a connectivity number (CN) > 2, where CN is the number of black-to-white transitions in the 3×3 neighborhood. Nodes are selected from these candidates based on additional rules. End points are a special case of nodes and can be recognized by having CN = 1. In the next step edges are recognized. Edge pixels have CN = 2, so they can be easily traced. This approach produces excessive nodes and edges and needs further processing to yield accurate results.
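The local approach can be sketched as follows. The sketch assumes that skeleton pixels are stored as 1 (black) and background as 0 (white), that the eight neighbours are visited cyclically, and that pixels outside the image count as white; the names `connectivity_number` and `classify_pixels` are illustrative, and the additional rules for selecting nodes from the candidates are not modelled.

```python
def connectivity_number(img, r, c):
    """Count black-to-white transitions around (r, c) in its 3x3 neighbourhood."""
    rows, cols = len(img), len(img[0])
    ring = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    values = []
    for dr, dc in ring:
        rr, cc = r + dr, c + dc
        values.append(img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0)
    # The ring is cyclic, so the last neighbour is compared with the first.
    return sum(1 for i in range(8) if values[i] == 1 and values[(i + 1) % 8] == 0)


def classify_pixels(img):
    """Group skeleton pixels by their connectivity number (CN)."""
    groups = {"end": set(), "edge": set(), "node_candidate": set(), "isolated": set()}
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value != 1:
                continue
            cn = connectivity_number(img, r, c)
            if cn == 0:
                groups["isolated"].add((r, c))
            elif cn == 1:
                groups["end"].add((r, c))
            elif cn == 2:
                groups["edge"].add((r, c))
            else:
                groups["node_candidate"].add((r, c))
    return groups
```

For a plus-shaped skeleton, the centre pixel has CN = 4 and becomes a node candidate, the four arm tips have CN = 1, and the pixels in between have CN = 2.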
Fig. 2 Desired and acquired skeleton of input image

Another possibility is to form clusters of candidates for nodes. For each candidate, a priority number is computed based on the number of its 8-neighbors (N8) that are candidates and the number of its 4-neighbors (N4) that are candidates. For each cluster, the candidate with the largest priority is selected as a node. This approach is used in the proposed vectorization process. The difference between the local approach and the cluster approach is shown in Fig. 3.
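The cluster approach can be sketched as below. The exact priority formula is not given in the text, so the sketch assumes priority = N4 + N8 (which weights 4-neighbors twice, since N8 here includes them); this weighting, the 8-connected definition of a cluster, the tie-breaking rule and the function names are all assumptions made for illustration only.

```python
from collections import deque

N4_OFFSETS = [(-1, 0), (0, 1), (1, 0), (0, -1)]
N8_OFFSETS = N4_OFFSETS + [(-1, -1), (-1, 1), (1, 1), (1, -1)]


def priority(pixel, candidates):
    """Assumed priority: candidate 4-neighbours plus candidate 8-neighbours."""
    r, c = pixel
    n4 = sum((r + dr, c + dc) in candidates for dr, dc in N4_OFFSETS)
    n8 = sum((r + dr, c + dc) in candidates for dr, dc in N8_OFFSETS)
    return n4 + n8


def select_nodes(candidates):
    """Pick one node per 8-connected cluster of node candidates."""
    candidates = set(candidates)
    unvisited = set(candidates)
    nodes = []
    while unvisited:
        seed = unvisited.pop()
        cluster, queue = [seed], deque([seed])
        while queue:  # breadth-first search over the cluster
            r, c = queue.popleft()
            for dr, dc in N8_OFFSETS:
                nb = (r + dr, c + dc)
                if nb in unvisited:
                    unvisited.remove(nb)
                    cluster.append(nb)
                    queue.append(nb)
        # Largest priority wins; ties are broken by the smallest coordinate.
        nodes.append(min(cluster, key=lambda p: (-priority(p, candidates), p)))
    return sorted(nodes)
```

Compared with the local approach, each junction yields a single node regardless of how many neighbouring pixels qualify as candidates.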

Post-processing
After the raster-to-vector conversion is done, the vector data usually contain a large number of vertices, which can be reduced by some kind of polygonal approximation [15,16]. Straight lines and arcs can also be recognized in this phase [17]. The vector data can be used for additional processing, including pruning, removing incorrectly separated objects, improving the quality of junction points and recognizing attributes such as the length, width and color of the edges.
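As one possible form of polygonal approximation, the sketch below uses the Douglas-Peucker algorithm. This choice is an assumption of the example: the text does not name the method used in [15,16]. The `tolerance` parameter is the maximum allowed distance between a dropped vertex and the simplified line.

```python
import math


def _distance_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Reduce the vertices of a polyline while staying within tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    traced = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
    print(douglas_peucker(traced, 0.5))
```

A pixel-by-pixel trace of an L-shaped edge is reduced to its two end points and the corner vertex.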

Conclusion
The general vectorization process for line objects in drawn maps was discussed in this paper. For each of its steps a short discussion of the problem was given and suitable methods were recommended. The vectorization process is based on thinning which conserves connectivity and shape of line objects. The operator is required only at the beginning of the process to set up parameters. Processing and raster-to-vector conversion are fully automatic and do not require any settings.
The results of the proposed vectorization process are shown in Figs. 4 and 5. The processing time for the map in Fig. 4, with dimensions 985×1114 pixels, was less than 2 seconds. For the map with dimensions 600×396 pixels shown in Fig. 5, the processing time was less than 0.1 second.