Friday, March 19, 2010

Losing Your Religion: Interoperability with AutoCAD Map 3D and ESRI, Part 2

A little while ago I started this interoperability discussion by looking at the similarities between CAD and GIS, primarily Autodesk’s AutoCAD and ESRI’s ArcGIS. Now I’ll talk about the critical differences between them, or at least between their data formats. Understanding these differences is essential to managing interoperability.
There are two primary areas of difference: the data structure paradigm and the graphic representation. AutoCAD drawings and ESRI data sets store data in fundamentally different ways. Both are forms of databases that store information about the location, properties and appearance of the various objects, but because they have substantially different requirements, they have to organize the data differently.

Data Structure Paradigm
CAD is used for all types of drawing. The CAD drawing file is essentially an object-oriented database that stores objects sequentially (more or less in the order they’re drawn). Each row of data represents an individual CAD primitive object. The structure of the data and the number of elements depend on the type of primitive. For example, a point carries a single coordinate pair (X,Y) for its location, while a line stores two coordinate pairs: a start point and an end point. An arc has a start point, an end point, and a bulge (or curve) factor. Along with that, there are additional data elements describing the color, line type, layer and other properties.



























Entity ID | Line  | St Point  | End Point | Layer
Entity ID | Point | Ins Point | Layer     | Color
Entity ID | Block | Ins Point | Layer     | Color
Entity ID | Arc   | St Point  | End Point | Bulge
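To make the "rows of different shapes" idea concrete, here is a minimal Python sketch of a drawing as a single ordered list of mixed records. The class names, field names and sample values are my own illustrative assumptions; they are not the actual DWG storage format.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple, Union

Point2D = Tuple[float, float]


@dataclass
class Line:
    entity_id: int
    start: Point2D
    end: Point2D
    layer: str


@dataclass
class PointEntity:
    entity_id: int
    insertion: Point2D
    layer: str
    color: int


@dataclass
class Arc:
    entity_id: int
    start: Point2D
    end: Point2D
    bulge: float
    layer: str


Entity = Union[Line, PointEntity, Arc]

# One drawing, one list: entities sit side by side in the order they
# were drawn, regardless of their primitive type (hypothetical values).
drawing: List[Entity] = [
    Line(1, (0.0, 0.0), (10.0, 0.0), "WATER"),
    PointEntity(2, (10.0, 0.0), "VALVES", 1),
    Arc(3, (10.0, 0.0), (20.0, 0.0), 1.0, "WATER"),
]

# Each record has its own structure; the element count varies by type.
kinds = [type(entity).__name__ for entity in drawing]
field_counts = [len(fields(entity)) for entity in drawing]
```

The point of the sketch is only that nothing forces neighbouring records to share a structure.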

An ESRI GIS dataset, whether it is a shapefile, geodatabase or personal geodatabase, organizes the data into more formal structures in the form of tables. (This is simplified to a conceptual level: each of these data formats includes several files or tables to complete the dataset, but those details are not germane to the discussion.) Different primitives, such as points, lines and polygons, can’t reside in the same table. In addition, the number of data elements in each row is consistent throughout the table. Points representing valves will be in a different table than the lines representing the pipes they’re attached to.
The tables are divided based on some set of business rules for organizing the data. In ESRI terminology, each of these tables is essentially a Feature Class. For example, water, storm and sanitary sewer lines may all be in one table, or they may be divided into three or more tables. The division may be due to organization, or due to the different information needed for each group. Many times within each Feature Class there will be a further subdivision of objects, such as high-voltage conductor and low-voltage conductor, called a Subclass (the geodatabase documentation calls this a subtype). Typically the subclass is the level of organization used to symbolize the objects. The result is a very structured organization of data.





















Feature Class (Pipes – Lines)
ID | Shape (BLOB) | SIZE | MATERIAL | IN USE
ID | Shape (BLOB) | SIZE | MATERIAL | IN USE






















Feature Class (Vegetation – Polygon)
ID | Shape (BLOB) | SPECIES | AGE | AVG DBH
ID | Shape (BLOB) | SPECIES | AGE | AVG DBH
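The rules a feature class enforces (one geometry type per table, the same columns in every row) can be sketched in a few lines of Python. The `FeatureClass` class below is a hypothetical stand-in I made up for illustration, not an ESRI API, and the attribute values are invented.

```python
class FeatureClass:
    """Toy table holding one geometry type and one fixed set of fields."""

    def __init__(self, name, geometry_type, field_names):
        self.name = name
        self.geometry_type = geometry_type
        self.field_names = tuple(field_names)
        self.rows = []

    def insert(self, geometry_type, shape, **attributes):
        # A feature class holds exactly one kind of primitive.
        if geometry_type != self.geometry_type:
            raise ValueError(
                f"{self.name} stores {self.geometry_type} features only"
            )
        # Every row must carry exactly the same attribute columns.
        if set(attributes) != set(self.field_names):
            raise ValueError(f"{self.name} expects fields {self.field_names}")
        row = {"ID": len(self.rows) + 1, "SHAPE": shape, **attributes}
        self.rows.append(row)
        return row


pipes = FeatureClass("Pipes", "line", ["SIZE", "MATERIAL", "IN_USE"])
pipes.insert("line", [(0, 0), (10, 0)], SIZE=8, MATERIAL="PVC", IN_USE=True)

# A valve is a point, so it cannot go into the pipes table.
try:
    pipes.insert("point", (10, 0), SIZE=8, MATERIAL="PVC", IN_USE=True)
    valve_rejected = False
except ValueError:
    valve_rejected = True
```

In this sketch the valve would need its own point feature class, which mirrors the valves and pipes split described above.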

The analogy that I typically use, and it seems to fit, is that the data sets are like a collection of coins. My AutoCAD file is like a pile of change and my ESRI data file is like the same group of coins all organized into paper tubes.





The takeaway from all of this is that an AutoCAD drawing stores multiple data types in a single drawing file, while ESRI data sets store multiple data types in multiple tables (and/or files). This is a critical point for managing interoperability.
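In code terms, moving from the pile of change to the paper tubes is a grouping step. The sketch below is a minimal Python illustration under my own assumptions: the entity dictionaries, the layer names and the `sort_into_tables` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical drawing contents, listed in the order they were drawn.
drawing = [
    {"id": 1, "type": "LINE", "layer": "WATER"},
    {"id": 2, "type": "POINT", "layer": "VALVES"},
    {"id": 3, "type": "LINE", "layer": "STORM"},
    {"id": 4, "type": "ARC", "layer": "WATER"},
    {"id": 5, "type": "POINT", "layer": "VALVES"},
]


def sort_into_tables(entities, key=lambda entity: entity["type"]):
    """Group a mixed list of entities into separate tables.

    The default key splits by primitive type; pass another key to
    apply a different business rule.
    """
    tables = defaultdict(list)
    for entity in entities:
        tables[key(entity)].append(entity)
    return dict(tables)


# One table per primitive type.
by_type = sort_into_tables(drawing)

# One table per layer and primitive type, as one possible business rule.
by_layer_and_type = sort_into_tables(
    drawing, key=lambda entity: (entity["layer"], entity["type"])
)
```

Which key to use is the business decision; the mechanics of the split are the easy part.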



Graphic Representation
The other major area of difference is with the graphic representation. The AutoCAD drawing includes information regarding the appearance of the objects. For example, a line will include the color, line type, and thickness. Each of these properties is inherent in the primitive object. These properties define how AutoCAD will display the file. If I pass the file to someone else, and they open it, it will look the same.
ESRI datasets are a different case. The datasets carry no information about the appearance of the data; the appearance is left up to the application at the time of display. ArcGIS, for example, stores the appearance of a map in a Map document, which contains pointers to the data, describing what data to select (allowing a subset of the data through a query) and how to display it. It is the map document that contains the symbolization information, such as linetype, color and stylization.
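The separation of data from symbolization can be sketched as follows. This is a conceptual Python illustration only: `MapLayer`, `render` and the sample pipe records are names and values I invented, and the structure is not the actual Map document format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# The data: attributes only, nothing about color or linetype.
datasets: Dict[str, List[dict]] = {
    "pipes": [
        {"ID": 1, "SIZE": 8, "MATERIAL": "PVC", "IN_USE": True},
        {"ID": 2, "SIZE": 12, "MATERIAL": "DI", "IN_USE": True},
        {"ID": 3, "SIZE": 8, "MATERIAL": "PVC", "IN_USE": False},
    ]
}


@dataclass
class MapLayer:
    """Stand-in for one layer of a map document."""
    source: str                     # pointer to the data, not a copy of it
    where: Callable[[dict], bool]   # query selecting a subset of the rows
    symbol: dict                    # how the selected rows are drawn


def render(layer: MapLayer, data: Dict[str, List[dict]]):
    """Pair each selected feature ID with the layer's symbol."""
    return [
        (row["ID"], layer.symbol)
        for row in data[layer.source]
        if layer.where(row)
    ]


# Two "maps" over the same data, each with its own query and symbols.
active_map = MapLayer(
    "pipes", lambda row: row["IN_USE"],
    {"color": "blue", "linetype": "solid"},
)
abandoned_map = MapLayer(
    "pipes", lambda row: not row["IN_USE"],
    {"color": "gray", "linetype": "dashed"},
)
```

Both layers read the same rows, and neither changes them; only the selection and the symbols differ.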

The Result
These two differences present the primary obstacles to interoperability. Neither approach is inherently better; they are different ways of achieving similar results. The data organization presents some challenges when bringing CAD data into GIS tools. For example, when ArcGIS reads an AutoCAD DWG file, it reads the data as if it were ESRI datasets and groups objects by their primitive forms, such as lines. That’s different from the way CAD users think of and manage the data. Additionally, CAD drawings typically contain information that doesn’t fit into the GIS data model.
The separation of the data from the symbolization is what allows GIS systems to display the same information in many ways, depending on the view or analysis needed. It also makes transferring data between systems a little more difficult. There is no direct method within AutoCAD to read the Map document, pick up its data pointers and symbolization, and replicate an ArcGIS map without recreating the symbolization. In most cases this is not really an issue because the data is the important part, although it can be problematic when you want to reproduce the entire map.
There are some other differences between the systems to be aware of, such as shapefiles not supporting arcs (causing arcs to be broken into many small lines) and differences between single and double precision data. These matter, but they are not as critical to the basic interoperability of the systems.
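As an illustration of what "broken into many small lines" means, here is a Python sketch that approximates an arc with straight segments. It assumes the bulge convention used in DXF polylines, where the bulge is the tangent of one quarter of the included angle and a positive value means a counterclockwise arc from start to end. The function name and the default segment count are my own choices.

```python
import math


def bulge_arc_to_segments(start, end, bulge, segments=8):
    """Return the vertices of a polyline approximating a bulge arc.

    Assumes bulge = tan(included_angle / 4), positive for a
    counterclockwise arc from start to end.
    """
    if bulge == 0:
        return [start, end]  # zero bulge is already a straight line

    (x1, y1), (x2, y2) = start, end
    dx, dy = x2 - x1, y2 - y1

    # The center lies on the perpendicular bisector of the chord.
    offset = (1 - bulge ** 2) / (4 * bulge)
    cx = (x1 + x2) / 2 - offset * dy
    cy = (y1 + y2) / 2 + offset * dx

    radius = math.hypot(x1 - cx, y1 - cy)
    start_angle = math.atan2(y1 - cy, x1 - cx)
    sweep = 4 * math.atan(bulge)  # signed included angle

    return [
        (
            cx + radius * math.cos(start_angle + sweep * i / segments),
            cy + radius * math.sin(start_angle + sweep * i / segments),
        )
        for i in range(segments + 1)
    ]


# A semicircle from (1, 0) to (-1, 0), approximated by two segments.
vertices = bulge_arc_to_segments((1.0, 0.0), (-1.0, 0.0), 1.0, segments=2)
```

More segments give a closer fit, but the original arc definition (center, radius, bulge) does not survive the conversion, so the trip back to CAD returns a polyline rather than a true arc.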

One of the reasons I like working with Autodesk's AutoCAD Map 3D product is that it provides me with both worlds. It is an AutoCAD drawing, and with the Feature Data Objects connectors, I can work with the ESRI datasets natively without having to make any changes in the way I work with these disparate data types.

In most cases, there are business issues that interfere with interoperability that have a much greater impact than these technical software elements. I’ll explore those in a future blog.
