Saturday 31 October 2009

Working with spatial data in binary representation in MySQL and PHP

In an earlier post I showed how to store and retrieve spatial POINT geometries in MySQL and PHP. A POINT is a geometry that represents a single location in coordinate space. In this post we will be dealing with more complex spatial data such as LINESTRING and POLYGON, in real world representing for example rivers, roads, or territories in coordinate space. A number of Shapefiles (SHP) describing territories and division of kingdoms in early medieval Europe have been imported into MySQL. The geometries are drawn directly from the SQL-database using PHP-script, building the basemaps of the historical GIS-application Regnum Francorum Online. This article will discuss how this was accomplished. A Linestring is a one-dimensional geometry represented by a sequence of points, whereas a Polygon is a planar surface representing a multisided geometry, with a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the polygon. The first and last Point of the exterior boundary is the same, we say it's closed. Here we will only deal with polygons without interior boundaries.

In order to get these geometries on a computer screen map we must be able to somehow retreive the single points building these geometries and transform them from coordinates to pixels, and then use some graphics functions to draw lines and polygons. The following steps were used to achieve this task.

1. Define a minimal bounding rectangles (MBR) representing the boundaries of the map we will display on the screen and compare this geometry with the lines and polygons in the database and see if they should be selected. If this box is expressed in coordinate space we will be able to take advantage of functions comparing the spatial relationship between two geometries in MySQL. The MBR of the following Linestring reaching from one corner to the opposite, represents the map given its width, height and scale.
SET @g = GeomFromText('LineString($longitude1 $latitude1,$longitude2 $latitude2)')

2. The geometries holding the coordinates for rivers or territories has a spatial column of type LINESTRING or POLYGON and has been given the name SHAPE2 on all database-tables of this kind (layers). We will Select values from this column that Intersects the MBR defining the boundaries of the map (@g). We will then retreive the data as a binary string (BLOB). The Well-Known Binary (WKB) representation for geometric values is defined by the OpenGIS specification. WKB uses one-byte unsigned integers, four-byte unsigned integers, and eight-byte double-precision numbers. For fast access of the data, we use the Unbuffered Query in MySQL.
$query="SELECT AsBinary(SHAPE2), color FROM $layer WHERE MBRIntersects(SHAPE2,@g)";
...
$result=mysql_unbuffered_query($query);


3. In the code snippet above, we are retreiving two columns of each row in the database table. Unpack is a PHP-function that unpacks data from a binary string into an associative array according to a given format, in this case defined by the WKB. The first parameter of the geometry is the byte order (1 byte), the second, geometry type (4 bytes), and the third, number of elements (4 bytes). This sums to 9 bytes, and at this position (offset) in the binary string, we will find the number of elements of the given geometry type. The class-function drawLine deals with Linestrings and the function drawRings deals with Polygons. We will not deal any further with Multipolygon and Point geometries.

while($row = mysql_fetch_row($result)) {
$color=$row[1];
$g = unpack("Corder/Ltype/Lnum",$row[0]);
$type = $g['type'];
$num = $g['num'];
$offset = 9;
switch($type) {
case 1: // POINT
break;
case 2: // LINESTRING
$this->drawLine($offset,$num,$color,$row[0]);
break;
case 3: // POLYGON
$this->drawRings($offset,$num,$color,$row[0]);
break;
case 6: // MULTIPOLYGON
break;
}
}
mysql_free_result ($result);

4. Now we are ready to transform the coordinates to pixels on the map. We will begin with the Linestring-geometry. Offset is the position in the binary string ($row) where we will start to read. $numpts is the number of points in the Linestring. From the offset position in the buffer we start to unpack into the points-array (pts) eight-byte double-precision numbers (d) until we reach the end of the buffer (*). Next we assign an array geom, that will hold the resulting pixel-values of the points defining the Linestring. When reading the points-array, we know that the first number is the longitude and the second the latitude, on so forth. When reaching every second number we have a longitude/latitude pair that will be transformed using the gRef-function. The resulting pixel-values are assigned to the geom-array. Then we use the PHP graphics-function imageline to draw a line-segment between x1,y1 and x2, y2, in the given color.

function drawLine($offset,$numpts,$color,$row) {
$pts = unpack("@$offset/d*", $row);
$geom = array();
$lon=0;$lat=0;
$odd=true;
foreach ($pts as $value) {
$odd ? $lon=$value : $lat=$value;
if (!$odd) {
$this->gRef($lon, $lat);
$geom[]= $this->Xp();
$geom[]= $this->Yp();
}
$odd=!$odd;
} // end foreach
$max=($numpts-1)*2;
for ($i=0;$i < $max; $i+=2) {
imageline ( $this->img, $geom[$i], $geom[$i+1],
$geom[$i+2], $geom[$i+3], $color);
}
} // end function

5. Now we deal with the Polygons. In it's core this is the same function as the one for Linestrings, because polygons are Linestrings with the same start- and end-point.

function drawRings($offset,$numrings,$color,$row){
$off=$offset;
$x=$numrings;
while($x > 0) {
$h=unpack("@$off/Lnumpts",$row);
$numpts=$h['numpts'];
$off+=4;
$nump=$numpts*2;
$pts = unpack("@$off/d$nump", $row);
$geom = array();
$lon=0;$lat=0;
$odd=true;
foreach ($pts as $value) {
$odd ? $lon=$value : $lat=$value;
if (!$odd) {
$this->gRef($lon, $lat);
$geom[]= $this->Xp();
$geom[]= $this->Yp();
}
$odd=!$odd;
} // end foreach
imagefilledpolygon ( $this->img, $geom, $numpts, $color);
// If we wish to draw a border
imagepolygon ( $this->img, $geom, $numpts,
$this->bordercolor);
// Increase byte pointer
$off+=($nump*8);
$x--;
} // end while x
return($off);
} // end function

Wednesday 28 October 2009

Using MySQL spatial extensions in historical GIS

This post describes some aspects of using MySQL spatial extensions in the historical GIS application Regnum Francorum Online. Because I was already using MySQL to store evidence of historical events, the choise to try out the spatial extensions of MySQL that were introduced in version 4 was very close at hand. The geographical information system Regnum Francorum Online (RFO) is a MySQL database with a number of interrelated database-tables. The Events SQL-table holds information about the historical event (time, type of event, source-document etc.) and is linked to evidence of places and actors of the event. In turn, the Evidence of place is linked to entries in the Places-table which will be the example of this post. The Places SQL-table consists of the following columns: id, name, type, country, official geographic code and coordinates. The coordinates column (pt) is defined as the POINT-geometry type, holding a single longitude-latitude pair. Indexes are constructed for the columns id, name and a spatial index is constructed for the pt-column.
CREATE TABLE places (
id INT 8 UNSIGNED NOT_NULL AUTO_INCREMENT,
name BLOB(255) NOT_NULL,
type INT(3) UNSIGNED,
cc CHAR(2),
cog CHAR(8),
pt POINT NOT_NULL,
PRIMARY KEY (id),
KEY (name),
SPATIAL KEY(pt),
)

Now the Places-table is ready to be populated with data. In real life this has of course been a long process to collect data for more than 11,000 places currently in the database. From the beginning, name, country and coordinates were collected from the GeoNames geographical database. The GeoNames service is still the only service that I know of, which allows you to geocode adresses worldwide for free. Due to license restrictions in the Geocoding service of Google Maps API, retreiving coordinates for other purposes than showing a Google map is not allowed. Collecting official geographic codes is a work in progress, data is provided by state-agencies in the different countries. This far, geographic codes for France from INSEE has been added. This data is crucial for the distinct identification of places, and holds information about the administrative belonging of a certain place, e.g. Quierzy (a carolingian palace) has the code 02631, which can be translated into département Aisne, arrondissement Laon, canton Coucy-le-Château-Auffrique, commune Quierzy.
INSERT INTO 'places' VALUES (51,'Quierzy',3,'FR','02631',GeomFromText('POINT(3.1440379 49.5708778)') )

GeomFromText and Point are two MySQL-functions building the binary representation of the POINT Geometry-type. The more than 11,000 places in the FRO-database are now ready for very fast retrieval, based on SQL-queries that take the limits of the output map into account, together with other features of the map selected by the user.


Example: Coins of Pepin, Charlemagne and Carloman
Let's say we want to retrieve all evidence of mints in the Carolingian kingdom of the Francs. The evidence of mints are, for example, the different coins published in various catalogs. One such catalog is Les monnaies royales de France sous la race Carolingienne, deuxième partie by Ernest Gariel, Strasbourg 1884, available at Google Books, and the Internet Archive, containing all known coins at that time issued by king Pepin (752-768), Charlemagne (768-814) and Carloman (768-771). The evidence of the different coins with information of the mint on one side of the coin, and the name of the king on the other, is suitable to put in the Events SQL-table: no. 73. silver coin of type denier with inscription RP· | +TRI/CAS, that is, evidence of actor (RP = king Pepin 752-768), and evidence of place (TRI/CAS = Troyes, dép. Aube). The Evidence of place SQL-table is connecting the event with the place, holding ID:s of column-type UNSIGNED INTEGER(8) for the Events and Places SQL-tables respectively.


Follow this link to see this example in the real database application.


Retrieving data from a geometry column type POINT is very simple, use the MySQL-function AsText() to SELECT the values as a string-representation of the stored binary values, in the example above Troyes has the coordinates "4.0748 48.2975".
SELECT name, AsText(pt) FROM places
Individual values of longitude and latitude can be retreived with the functions X() and Y() respectively. However, we will take this a step further and show some other features of the spatial extensions as well. Values of a spatial enabled SQL-table can be selected using minimal bounding rectangles (MBR:s), in this case, all places with evidence of mint inside the minimal bounding rectangle defined by @g of the output map, will be selected. The first step is to set the MBR of the output map using it's upper-left and lower-right corner, expressed as two pairs of longitude and latitude. The $variables are defined in the PHP-environment.
mysql_query("SET @g = GeomFromText('LineString($longitude1 $latitude1, $longitude2 $latitude2)')");

Next we will compose the SQL-query for the evidence of all mint, within the selected time-period ($minyear - $maxyear). Placeindex is the name of the SQL-table holding the evidence of places in events, the event-type of minting is represented by the numerical value 163. The query uses GROUP BY to sum the rows by unique places of minting in the database, and to put the number of evidence in the cnt (count) column.
$query="SELECT
 placeindex.pid,
 places.name,
 X(places.pt),
 Y(places.pt),
 events.type IN(163) as mint,
 COUNT(*) as cnt
FROM placeindex, places, events
WHERE MBRWithin(places.pt,@g)
 AND placeindex.pid=places.id
 AND placeindex.eid=events.id
 AND events.type IN (163)
 AND events.maxyear >= '$minyear'
 AND events.maxyear <= '$maxyear'
GROUP BY placeindex.pid";

Now we perform our SQL-query and draw the map, which is saved as a PNG-image in a temporary-directory on the server. The member-function gRef() is where the fetched coordinates (decimal longitude and latitude) of the cities are transformed into pixel-cordinates in the UTM-projection of the current map. This is not explained in this post.
[PHP code (simplified) running inside our map-drawing class]:
$this->img = imagecreatetruecolor($this->width, $this->height);
$pic_coin = imagecreatefromgif ($_SERVER["DOCUMENT_ROOT"]."/pics/coin.gif");
$result = mysql_query($query);
while($row = mysql_fetch_row($result)) {
 $pid = $row[0];
 $oname=htmlspecialchars($row[1], ENT_QUOTES);
 $mint = $row[4];
 $this->gRef($row[2], $row[3]);
 $px= $this->Xp();
 $py= $this->Yp();
 imagecopymerge ($this->img, $pic_coin, $px-8, $py-8, 0, 0, 16, 16, 100);
} // end while
imagepng($this->img,$this->filename);
imagedestroy($this->img);

Conclusion
The conclusions are coming soon.

Regnum Francorum Online historical GIS

Regnum Francorum Online — interactive maps and sources of early medieval Europe 614-840 is a historical geographic information system (GIS), aiming at referencing historical events of Merovingian and Carolingian Europe (Frankish kingdom) in time and space. The information system covers the time period approx. 614 to 840. Historical events are recognized through source-documents of different kind, mainly contemporary charter documents or copies of such documents, but also archeological evidence like coins. Meta-data about the events has been collected, including time and geographical locations of the events, type of event (donation, privilege, assembly, battle, siege etc.), actors involved in the event (historical persons like Charlemagne, or persons or groups identified by name) and links to source-documents available online. The information system is implemented as an online database-application running on a Apache-server, using MySQL with spatial extensions, PHP server-script and AJAX, producing interactive maps and inter-related output of historical and geographical information. Over the last years, a growing number of editions of primary sources have become available online in digital libraries such as Google Books, Gallica, Monumenta Germaniae Historica (dMGH), and Regesta Imperii, just to mention the largest collections. The main purpose of this information system is to provide an interactive geographical interface for the visualisation of the events and their historical context, connecting them to freely available online resources like full-text source documents and literature, but also other sources like medieval manuscripts, coins and maps.

I intend to use this blog to elaborate and discuss conceptual, methodological and technical issues regarding the development of Regnum Francorum Online. I have written my own PHP-class to draw the maps from coordinates stored in MySQL database-tables. The maps are currently drawn using the Universal Transverse Mercator - projection. There are a number of map-layers collected and compiled from various sources including own digitization of printed maps. In parallel to my own maps, basemaps from the Google Maps - service are also used to visualize layers with historical information. The blog will also be used to report on digitization of historical (medieval) sources, semantic web, historical GIS and online mapping in general (Google Maps API, Google Earth, Openlayers). Questions and comments are most welcome.