Wednesday 28 October 2009

Using MySQL spatial extensions in historical GIS

This post describes some aspects of using MySQL spatial extensions in the historical GIS application Regnum Francorum Online. Because I was already using MySQL to store evidence of historical events, the choise to try out the spatial extensions of MySQL that were introduced in version 4 was very close at hand. The geographical information system Regnum Francorum Online (RFO) is a MySQL database with a number of interrelated database-tables. The Events SQL-table holds information about the historical event (time, type of event, source-document etc.) and is linked to evidence of places and actors of the event. In turn, the Evidence of place is linked to entries in the Places-table which will be the example of this post. The Places SQL-table consists of the following columns: id, name, type, country, official geographic code and coordinates. The coordinates column (pt) is defined as the POINT-geometry type, holding a single longitude-latitude pair. Indexes are constructed for the columns id, name and a spatial index is constructed for the pt-column.
CREATE TABLE places (
id INT 8 UNSIGNED NOT_NULL AUTO_INCREMENT,
name BLOB(255) NOT_NULL,
type INT(3) UNSIGNED,
cc CHAR(2),
cog CHAR(8),
pt POINT NOT_NULL,
PRIMARY KEY (id),
KEY (name),
SPATIAL KEY(pt),
)

Now the Places-table is ready to be populated with data. In real life this has of course been a long process to collect data for more than 11,000 places currently in the database. From the beginning, name, country and coordinates were collected from the GeoNames geographical database. The GeoNames service is still the only service that I know of, which allows you to geocode adresses worldwide for free. Due to license restrictions in the Geocoding service of Google Maps API, retreiving coordinates for other purposes than showing a Google map is not allowed. Collecting official geographic codes is a work in progress, data is provided by state-agencies in the different countries. This far, geographic codes for France from INSEE has been added. This data is crucial for the distinct identification of places, and holds information about the administrative belonging of a certain place, e.g. Quierzy (a carolingian palace) has the code 02631, which can be translated into département Aisne, arrondissement Laon, canton Coucy-le-Château-Auffrique, commune Quierzy.
INSERT INTO 'places' VALUES (51,'Quierzy',3,'FR','02631',GeomFromText('POINT(3.1440379 49.5708778)') )

GeomFromText and Point are two MySQL-functions building the binary representation of the POINT Geometry-type. The more than 11,000 places in the FRO-database are now ready for very fast retrieval, based on SQL-queries that take the limits of the output map into account, together with other features of the map selected by the user.


Example: Coins of Pepin, Charlemagne and Carloman
Let's say we want to retrieve all evidence of mints in the Carolingian kingdom of the Francs. The evidence of mints are, for example, the different coins published in various catalogs. One such catalog is Les monnaies royales de France sous la race Carolingienne, deuxième partie by Ernest Gariel, Strasbourg 1884, available at Google Books, and the Internet Archive, containing all known coins at that time issued by king Pepin (752-768), Charlemagne (768-814) and Carloman (768-771). The evidence of the different coins with information of the mint on one side of the coin, and the name of the king on the other, is suitable to put in the Events SQL-table: no. 73. silver coin of type denier with inscription RP· | +TRI/CAS, that is, evidence of actor (RP = king Pepin 752-768), and evidence of place (TRI/CAS = Troyes, dép. Aube). The Evidence of place SQL-table is connecting the event with the place, holding ID:s of column-type UNSIGNED INTEGER(8) for the Events and Places SQL-tables respectively.


Follow this link to see this example in the real database application.


Retrieving data from a geometry column type POINT is very simple, use the MySQL-function AsText() to SELECT the values as a string-representation of the stored binary values, in the example above Troyes has the coordinates "4.0748 48.2975".
SELECT name, AsText(pt) FROM places
Individual values of longitude and latitude can be retreived with the functions X() and Y() respectively. However, we will take this a step further and show some other features of the spatial extensions as well. Values of a spatial enabled SQL-table can be selected using minimal bounding rectangles (MBR:s), in this case, all places with evidence of mint inside the minimal bounding rectangle defined by @g of the output map, will be selected. The first step is to set the MBR of the output map using it's upper-left and lower-right corner, expressed as two pairs of longitude and latitude. The $variables are defined in the PHP-environment.
mysql_query("SET @g = GeomFromText('LineString($longitude1 $latitude1, $longitude2 $latitude2)')");

Next we will compose the SQL-query for the evidence of all mint, within the selected time-period ($minyear - $maxyear). Placeindex is the name of the SQL-table holding the evidence of places in events, the event-type of minting is represented by the numerical value 163. The query uses GROUP BY to sum the rows by unique places of minting in the database, and to put the number of evidence in the cnt (count) column.
$query="SELECT
 placeindex.pid,
 places.name,
 X(places.pt),
 Y(places.pt),
 events.type IN(163) as mint,
 COUNT(*) as cnt
FROM placeindex, places, events
WHERE MBRWithin(places.pt,@g)
 AND placeindex.pid=places.id
 AND placeindex.eid=events.id
 AND events.type IN (163)
 AND events.maxyear >= '$minyear'
 AND events.maxyear <= '$maxyear'
GROUP BY placeindex.pid";

Now we perform our SQL-query and draw the map, which is saved as a PNG-image in a temporary-directory on the server. The member-function gRef() is where the fetched coordinates (decimal longitude and latitude) of the cities are transformed into pixel-cordinates in the UTM-projection of the current map. This is not explained in this post.
[PHP code (simplified) running inside our map-drawing class]:
$this->img = imagecreatetruecolor($this->width, $this->height);
$pic_coin = imagecreatefromgif ($_SERVER["DOCUMENT_ROOT"]."/pics/coin.gif");
$result = mysql_query($query);
while($row = mysql_fetch_row($result)) {
 $pid = $row[0];
 $oname=htmlspecialchars($row[1], ENT_QUOTES);
 $mint = $row[4];
 $this->gRef($row[2], $row[3]);
 $px= $this->Xp();
 $py= $this->Yp();
 imagecopymerge ($this->img, $pic_coin, $px-8, $py-8, 0, 0, 16, 16, 100);
} // end while
imagepng($this->img,$this->filename);
imagedestroy($this->img);

Conclusion
The conclusions are coming soon.

2 comments:

  1. Johan, this looks very interesting to me, not only from a technical point of view.

    As I'm particularly interested in open data for archaeological research (with a broad definition of “archaeology”) may I ask you about your choice of CC-BY-NC as license for RFO data? I.e. what is your opinion about using a less restrictive license ? You might want to look at http://www.opendefinition.org/ for a more in-depth discussion of why non restrictive licenses are important.

    All the best,
    Stefano

    ReplyDelete
  2. Well, I started out with a restrictive licence because of the preliminary and experimental nature of the whole project. When things get more stable, and the future of the project is more settled, licencing is likely to open up. It will probably be possible to download entire map-layers from Regnum Francorum Online.
    /Johan

    ReplyDelete