Wednesday 4 April 2012

Historical Monuments of France (before 10th century)


The French Ministry of Culture has made part of their database of historical monuments, Base Mérimée - monuments historiques, publically available to use under an Open Licence. The dataset can be obtained from the website data.gouv.fr in CSV-format. To increase the density of archaeological information in Regnum Francorum Online (RFO), a subset of the base Mérimée dataset has been integrated into the GIS, that is, monuments that correspond to the aim of RFO and attested before the 10th century AD. The external dataset at RFO contains 1719 sites (out of 43720), covering monuments in 1224 French municipalities (communes). The Base Mérimée does not contain coordinates, so georeferencing of the monuments is done at runtime by joining the base Mérimée database table with another table containing all the French communes and their administrative center point. The INSEE-code of French communes is used as the common key. This means that all monuments in a municipality have the same set of coordinates. Links to the full records at the official base Mérimée are provided, which contains additional information not part of the free edition, e.g. a very useful classification of monuments.

A problem in the selection of sites for the subset is that many records are undated, however their description indicates they belong to the period before the 10th century. Undated records described by the terms (or in a combination) vestiges, romain, archéo[-logique], amphithéâtre, oppidum, anciens, antiques, borne milliaire, dolmen, tumulus, menhir, [paléo-, méso-, chalco-]lithique have also been selected. The Base Mérimée datset is implemented as its own geographical layer, and as additional information of French places in RFO. Click the link to load the layer at RFO, and from there, click the symbols on the map to see a list and description of monuments in the selected municipality.

The French Wikipedia community has started to author articles, one for each monument, using the information from Base Mérimee as a starting point, see further Monument historique (France) at the French Wikipedia. Quite a number of monuments are also individually georeferenced.

Monday 20 February 2012

Evidence of place in Late Roman sources

The Regnum Francorum Online (RFO) historical GIS does not exclusively concern the Merovingian and Carolingian Frankish kingdom, but also the Roman foundation on which the Early Medieval civilisations was built. To put some light on this, and describe the Late Roman empire, two layers are part of the GIS. Both layers are based on primary sources. The first layer is a compilation of Roman itineraries (e.g. Peutinger map, Itinerarium Antonini), description of the administrative and political order of the Late Roman empire (e.g. Notitia dignitatum), narrative sources and Roman places mentioned on the milestones. All together, there are 2190 evidence of place, distributed among 799 Roman places. The second layer contains all the milestones themselfes and their place of discovery. The milestone layer has 1039 entries distributed among 613 places of discovery. The layers cover the western half of the Roman empire north of the Alps and the Pyrenees, and extend into the Roman province Pannonia, modern Hungary. The latter layer is motivated by the importance of the Roman roads network, which is part of the background maps for the entire GIS. All the sources are listed at the end of this post, together with their availability in online digital libraries.


Figure 1: Places in Late Roman sources (red) and place of discovery of Roman milestones (blue) between the river Seine and Rhine, zoom level 7.

These sources have been georeferenced so when rendered on the map and receiving input by a mouse click, the source entries for the selected place will be listed in the output pane to the right. In the output pane a list of source entries for the selected place is displyed together with a citation from the source and links to digitized source editions. If the source edition is available on Google Books or the Internet Archive, you have the option to view the source inside the RFO application. Sources hosted in the Perseus digital library can also be viewed inside RFO, because they are available as XML-documents. The Peutinger database is serving small images of the map centered on the selected place, together with the transcription as part of the XML-output, see figure 2 below. The places currently on the visible part of the map can be listed by selecting the Places menu tab. Not all sources are fully complete (especially Ammianus Marcellinus and Ptolemy), but a vast majority of the places mentioned in the other sources should be in the database. As far as I know the milestones are fully complete within the area selected.


Figure 2: The entry of Augusta Treverorum (Trier, Rheinland-Pfalz, Germany) on the Peutinger map embedded into the RFO application.

Both layers are listed in the Layers menu tab, in the section Layers based on primary sources : Thematic layers, in the RFO application. They can be loaded following the links below.
Places in Late Roman sources

Roman milestones, place of discovery

Sources
Itineraries, ancient geography

Narrative sources

Milestones

Compiling sources of Ancient history

In this post I will demonstrate what can be achieved regarding knowledge of Ancient history by compiling persistant identifiers of several external resources. The coordination of sources, and hence the advantage of having them displayed together, is accomplished through the use of a common geographical interface, where layers of one or more external sources could be displayed.


The Regnum Francorum user interface showing georeferenced articles of the Princeton Encyclopedia of Ancient Sites, with the embeded article of Augusta Treverorum (Trier) in the right pane.

The Regnum Francorum Online (RFO) historical GIS has recently been updated with a new user interface, built on Google Maps API 3.x. The GIS uses its own base tiles for zoom levels 6-10, made by the Mapnik software. These tiles are based on geographical data from OpenStreetMap (OSM) and the SRTM elevation data from NASA. It also contains basic historic information of the Late Roman civitates and early bishoprics of Western Europe, established until the 9th century, borders of the Early medieval counties (pagi) and the Roman roads network. Further geographical details can be viewed through the use of a collateral secondary map, covering zoom levels above 10, that is, embedded maps from OpenStreetMap (OSM), Google Maps (streetmap, terrain and satellite), the 18th century Cassini map of France, and an experimental topographical map covering the Moselle and Rhine regions of France and Germany, combining data from OSM and SRTM. The aim of the historical GIS is to compile evidence of places, institutions and territories from both primary and secondary sources, and render them as layers on top of the basemap. If the source editions are available online, the GIS will provide links to, or embed these sources into the historical GIS.

Recently, I have introduced tags of identifiers from certain external sources, that contribute to the definition of a place. These tags refer to entries in online encyclopedias (Wikipedia and the Princeton Encyclopedia of Classical Sites), topographical dictionaries (Toponymisch woordenboek) and gazetteers (Pleiades ancient places), as well as information from other geographical systems such as OpenstreetMap, Cartes IGN (France), and Google Satellite images. The tags regarding the latter, refer to manually selected archaeological features, whether or not archaeological remains are visible on satellite images (Google), or defined on the maps from IGN and OSM. In the case of Google and IGN, links are provided to embedded versions of these maps within RFO; in the case of OSM, an external link to the geometry entry at the OSM geographical database is displayed.

On the Sources menu tab in the Regnum Francorum Online application, there is a row of buttons that enables direct access to the layers described above. The layers showing georeferenced source editons and literature, the first two buttons, are not new, they have been components of the Regnum Francorum GIS for a long time. From inside the RFO application, one or more layers can be loaded simultaneously. Only the topmost layer is interactive, however, the order of layers can easily be altered fromthe Layers menu tab. These are the new layers:

Selected Wikipedia articles in English, French, German and Dutch with historical or archaeological content. Currently, 867 places have one or more Wikipedia article. Typically, the Wikipedia articles selected here refer to an ancient place known by name, or a major archaeological site. Load this layer by clicking this link.

Selected OSM geometries (point or polygon). Currently 450 places have one or more OSM geometry. Archaeological sites and remains are typically described by a point or a polygon outlining the site or ruin. Load this layer by clicking this link.

Archaeological remains visible on Google satellite images. Currently there are almost 200 sites in the database. The quality of the images vary. Clicking the provided link in the result pane will embed the Google satellite map at zoom level 16. Load this layer by clicking this link.

Archaeological remains defined on the Cartes IGN (France). Currently there are 107 sites in the database. Clicking the provided link in the result pane will open the topographic map of France in a new tab or window. Load this layer by clicking this link.

This layer contains all the articles of the Princeton Encyclopedia of Classical Sites (PECS) concerning ancient places and archaeological sites in modern Austria, Belgium, France, Germany, Hungary, Luxembourg, Netherlands, Switzerland, and a couple of places elsewhere, in all over 700 places. The Perseus digital library that hosts the encyclopedia outputs articles and source texts in both XML and HTML. This makes it possible to embed the article inside the RFO application, which is of course very convenient. Load this layer by clicking this link.

This layer contains links to a growing number of entries in the topographical dictionary of Belgium, Netherlands, Luxembourg and adjacent provinces in France and Germany. I have selected places with evidence in sources no later than A.D. 950. The dictionary also contains places that are mentioned later during the Middle ages. This electronic resource from the University of Antwerpen is based on the Toponymisch woordenboek von België, Nederland, Luxemburg, Frankrijk en West-Duitsland (vóór 1226) by Maurits Gysseling, published in 1960. This particular area is of vital importance for the rise to power of the Arnulfing-Carolingian family in the the late 7th century. There are also many written evidence in Late-Roman sources for places in the provinces of Germania and Belgica covered by this dictionary – Trier, Augusta Treverorum, was once the capital of the Roman empire. So far, there are almost 550 places in the database with a link to the topographic dictionary. Load this layer by clicking this link.

A few examples


Below is the outcome of a click on a place called Heidetränk, located in Oberursel (Taunus) in Hesse, Germany, where there are extensive remains of a large Celtic oppidum. The image to the left is from the collateral map of the Moselle-Rhine region at zoom level 14. Normally, the archaeological sites in OSM display at zoom level 16 and above, but since this experimental map is intented to serve as a historial / archaeological basemap, their apperance is defined differently. The Heidetränk oppidum is defined in OSM by a series of polygons. Below are links to their entries in the OSM database. There are also links to articles in the German Wikipedia and the website Oppida : the first town north of the Alps, and a link to the Heidetränk entry in Pleiades Ancient Places gazetteer.

area: 130 ha; oppida=72; osm:way=137701370; osm:way=26535643; osm:way=27823337; osm:way=80406240; osm:way=89669777; osm:way=89671720; osm:way=89671727; osm:way=89671731; osm:way=89671773; pleiades:places=109038; wikipedia:de=Heidetränk-Oppidum


Below is the outcome of a click on a place called Mandeure, département Doubs, France, once a roman vicus, a small town with the remains of a theater. The image to the left is from the collateral Google satellite map at zoom level 18. The outline of the theater is defined in OSM by a polygon. There is also a link to an article in the Princeton encyclopedia of classical sites, which can be embedded in RFO, links to two articles in the French Wikipédia, a link to the Pleiades gazetteer, and finally a link to the embedded Google satellite image indicated by the satellite=yes link (this link is deactivated here).

geonames=2996251; osm:way=37846050; pecs=epamantadurum [embed]; pleiades:places=177516; satellite=yes; wikipedia:fr=Epomanduodurum; wikipedia:fr=Théâtre_antique_de_Mandeure

Following the link to OpenStreetMap where the polygon of the theater is defined, will display the following image, which is made up by 15 pair of coordinates.

Sunday 13 December 2009

Building an integrated historical geographical information system, part 1

Historical source-documents are increasingly becoming available on the Internet. It's a huge resource waiting to be elaborated. The sources often come in one of two formats. They are published as graphical scans of printed book pages with or without large chunks of unedited OCR-text, or they can be fully formatted as an electronic text, that can be copied, edited or searched. Often little or no metadata about the source documents are available. I will mention a couple of examples. Digital libraries such as Google Books or Gallica have huge amounts of scans of book-pages with OCR-text. A very rough estimation is that the accompanying OCR-text are correct to 60 percent. Searching for a term in the text will sometimes give a correct result, sometimes not. Maybe we can expect the OCR-technique to perform better in the future, I dont know. Otherwise we have to depend on humans taking responsibility for manual processing and correction of the texts. An example of the latter is the project Chartae Burgundiae Medii Aevi, University of Burgundy, in Dijon France, aiming at making all medieval charter editions of Burgundy available on the internet as fully corrected electronic texts, in part through own digitazation efforts, including publishing previously unedited texts from manuscripts online, correction of digitized text found at Gallica and Google Books. Currently there are 25 editions avialable, which are all downloadable in Text- or Word-format. Other examples of projects with high ambitions to publish fully corrected editions of historical source-documents are the digital Monumenta Germaniae historica (dMGH), Regesta Imperii (RI), and Codice diplomatico della Lombardia medievale (secoli VIII - XII).

It's important for scholarly work to have the possibility to cite/quote individual pages or source documents in editions. In order for electronical text to become part of the scientific research process they must fulfill these requirements. Currently there are a number of ways that these requirements are beginning to become fullfilled. The dMGH allows building of URL to indivudal pages through-out the entire body of all editions in Monumenta Germaniae historica, using widely known abbreviations to individual editions of source documents. The following example is an URL to a individual page in Gesta Dagoberti I. in: SS rer. Merov. 2. page 396 ("S." means "Seite", German for page).
http://www.mgh.de/dmgh/resolving/SS_rer._Merov._2_S._396

Next example demonstrates an URL built from widely used sequential number of royal diplomas in Die Urkunden der Karolinger 1 (Royal diplomas issued by king Pepin, Carloman and Charlemagne) The following two example refers to diploma no. 165, donation by Charlemagne to monastery Prüm (Rheinland-Pfalz) issued on the 9th of June 790 in Mainz (Mayence). Note the slight different composition of the two URL retrieving the same document by charter-number and page-number respectively.
http://www.mgh.de/dmgh/diplomata/resolving/D_Kar._1_165
http://www.mgh.de/dmgh/resolving/DD_Kar._1_S._222

Regesta Imperii is a series of source-summaries (regesta) ordering all written testimonials of Frankish and German kings chronologically throughout the middle-ages, with references to evidene in source editions. The lastest contribution is the first volume of source summaries of Charles the Bald, king of West-Francia (840-877), Die Regesten Karls des Kahlen 840 (823) - 877, edited by Irmgard Fees and published 2007. Not only evidence in diplomas but also evidence of activities of rulers in narrative sources. Regesta Imperii is also important for the listing of diplomas and other evidence of emperor Louis the Pioux, because of the still missing edition of his diplomas in MGH (the draft of this editions was destroyed during WW2). The following source summary refers to Regesta Imperii no 1005, Royal donation issued by Louis the Pious issued on the 8th of May 840 in Salz, concerning royal estates in modern Belgium.
http://regesten.regesta-imperii.de \ 
/index.php?uri=0000-00-00_1_0_1_1_0_0_1005

Even in non scientific resources like Google Books it's possible to link to indivudal pages of the source-edition. In Google by page-number of the printed original. Charter issued by Haroin in Wissembourg in 742, Liber donationum, no. 1 on page 7. in: Traditiones possessionesque Wizenburgenses. Codices duo cum supplementis. Zeuss (ed.) Speyer 1842, where id is a distinct and peristant identifier of this source edition.
http://books.google.com/books?id=yLoGAAAAQAAJ&pg=PA7

Unfortunatelly Gallica only permits link to the sequential number of the scanned pages, which is not compatible with the printed edition of the edition. If you wish to link to a certain page, you have the visit the actual webpage and copy the link. In other words, you can't construct the link with knowledge of the book-identifier alone, like Google.

My project Regnum Francorum Online aims at referencing historical events from the Merovingian and Carolingian period in time, space, and by agency, building a collection of meta-data about the events including links to indivudal source-documents if they are available online, taking advantage of the possibility to link to individual source-documents as described in the examples above. Referencing in time means that events are given a numerical estimation of time. In PHP the concept of Julian day count is implemented, and it's utilzed here. Referencing in space means geo-referencing places mentioned in the event to modern geographical concept of longitude and latitude as well as administrative affinity like country and province and other territorial divisions, distinctly identifying a placename. Referencing by agency means identififying individuals mentioned in the event as well-known historical persons, or if not possible, to individuals with a recognized name. Uncertainty in referencing must be taken into account.

Monday 16 November 2009

Historical GIS and Semantic web

The historical GIS application Regnum Francorum Online references historical events in time, space and by agency and link the events to source documents and literature available online. In doing so, the application becomes a GIS-interface to a growing number of both primary and secondary sources online. This also includes the huge collection of articles in the Wikipedia. To me, it has become evident that the Wikipedia will become a major source to all kinds of knowledge in the future. Thus it is of great importance to closer examine how the Regnum Francorum Online can be closely integrated with the Wikipedia.

Each article in the Wikipedia has an unique tag, together with an ID, which is necessary to build a permanent link to the article, according to the instructions in the Wikipedia. However, I have never seen a reference to the Wikipedia, including this ID. Not even in the semantic web project DBpedia, which has extracted and coded articles and their content into XML/RDF, including geographic information of such features. The DBpedia project uses the same unique tags as Wikipedia, but has also collected the geographic features of the GeoNames project, which are identified with a unique numerical ID. This project also geo-reference articles in the Wikipedia. Lately, the Wikipedia project has collected alternative identifiers of populated places, the local administrative units, which in the European Union are basic units of official statistics. These units are municipalities (e.g. commune, Gemeinde). The tags of geographic features in Wikipedia and DBpedia are often just the offical name. Alternative tags are also allowed, using redirects to the main article. Taking all this into consideration, a named geographic place within EU can be uniquely identified with a commonly knowned combination of country-code, administrative-code and place-name, e.g. Mommenheim in Rheinland-Pfalz, Germany (DE/07339037/Mommenheim), can be separated from Mommenheim in Alsace (FR/67301/Mommenheim). The tags in Wikipedia are Mommenheim (Germany) and Mommenheim,_Bas-Rhin (France) respectively. It is in this context of inter-linked resources between Wikipedia, GeoNames, DBpedia, and national agencies of statistics, both in HTML and XML/RDF format, I would like to make the geographic features of Regnum Francorum Online inter-linked as well, maintaining the administrative code and the Wikipedia-tags of geographical features. In almost all cases, an article about the history of a city can be found in the city article itself. There are a few exceptions, but in these cases the city article refers to the separate history-article, e.g article about Lendorf in Austria referring to a separate article about the roman municipium Teurnia.

In both Regnum Francorum Online and Wikipedia/DBpedia there are other geographic features as well, that is, institutions of the state/kingdom, latin regnum, county pagus/comitatus, march marchae, duchy ducatum; and church: bishopric, latin episcopatum and monastery, monasterium. The tags identifying these institutions in the Wikipedia are not as consistent and predictable compared to populated places. The articles about bishoprics are reflecting the current division of the catholic church, differing between ancient and current dioceses, e.g there is an article about Roman_Catholic_Diocese_of_Passau and another one about Prince-Bishopric_of_Augsburg, both containing historical information about the bishoprics respectively. In Regnum Francorum Online this is implemented as bishopsric/Augsburg and bishopric/Passau respectively. Furthermore, articles about the history of monasteries are referred to as the placename with the suffix _Abbey, e.g. Lorsch_Abbey, corresponding to monastery/Lorsch in Regnum Francorum.

When it comes to territorial subdivisions (institutions) of the kingdom, the confusion in Wikipedia becomes bigger. From a history of early medieval Europe perspective it would have been desirable with tags describing traditional divisions of provinces and kingdoms, and maybe that will come in the future. In the English Wikipedia there is a short listing of Carolingian counties containing 7 entries. This category more or less corresponds to the listing of Gau pagus in the German Wikipedia, containing 147 entries of Gau/Gaugrafschaft situated mainly in modern Germany, Austria and Switzerland. For modern France there is a corresponding category Liste historique des comtés français, in English, list of the historical french counties, referring to different articles like lists of counts, or to a historical region. Obviously these categories are still under development.

In Wikipedia there is also categories of historical events (battles, treaties) that are exiting to take a closer look at. Articles of such events are often well-written and with substantial content, e.g. Battle of Poitiers 732. In the category Battles involving the Franks there are currently 30 entries. Unfortunately, the implementation of events in Regnum Francorum are somewhat ambiguous at the moment, suffering from the original implementation as historical documents, rather than events. Later it became evident to me that a historical source-document can contain information about several events distinct from each other in time, space and/or by agency. Consequently source-information about the battle of Poitiers can be retrieved from the database in terms of place and time, not from searching a record of the battle of Poitiers directly.

Well, to summarize, in order to inter-link Regnum Francorum with other significant websites like Wikipedia and GeoNames, common identifiers of place and institution must be maintained. This is also the first step to a full integration into the future semantic web.

Saturday 31 October 2009

Working with spatial data in binary representation in MySQL and PHP

In an earlier post I showed how to store and retrieve spatial POINT geometries in MySQL and PHP. A POINT is a geometry that represents a single location in coordinate space. In this post we will be dealing with more complex spatial data such as LINESTRING and POLYGON, in real world representing for example rivers, roads, or territories in coordinate space. A number of Shapefiles (SHP) describing territories and division of kingdoms in early medieval Europe have been imported into MySQL. The geometries are drawn directly from the SQL-database using PHP-script, building the basemaps of the historical GIS-application Regnum Francorum Online. This article will discuss how this was accomplished. A Linestring is a one-dimensional geometry represented by a sequence of points, whereas a Polygon is a planar surface representing a multisided geometry, with a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the polygon. The first and last Point of the exterior boundary is the same, we say it's closed. Here we will only deal with polygons without interior boundaries.

In order to get these geometries on a computer screen map we must be able to somehow retreive the single points building these geometries and transform them from coordinates to pixels, and then use some graphics functions to draw lines and polygons. The following steps were used to achieve this task.

1. Define a minimal bounding rectangles (MBR) representing the boundaries of the map we will display on the screen and compare this geometry with the lines and polygons in the database and see if they should be selected. If this box is expressed in coordinate space we will be able to take advantage of functions comparing the spatial relationship between two geometries in MySQL. The MBR of the following Linestring reaching from one corner to the opposite, represents the map given its width, height and scale.
SET @g = GeomFromText('LineString($longitude1 $latitude1,$longitude2 $latitude2)')

2. The geometries holding the coordinates for rivers or territories has a spatial column of type LINESTRING or POLYGON and has been given the name SHAPE2 on all database-tables of this kind (layers). We will Select values from this column that Intersects the MBR defining the boundaries of the map (@g). We will then retreive the data as a binary string (BLOB). The Well-Known Binary (WKB) representation for geometric values is defined by the OpenGIS specification. WKB uses one-byte unsigned integers, four-byte unsigned integers, and eight-byte double-precision numbers. For fast access of the data, we use the Unbuffered Query in MySQL.
$query="SELECT AsBinary(SHAPE2), color FROM $layer WHERE MBRIntersects(SHAPE2,@g)";
...
$result=mysql_unbuffered_query($query);


3. In the code snippet above, we are retreiving two columns of each row in the database table. Unpack is a PHP-function that unpacks data from a binary string into an associative array according to a given format, in this case defined by the WKB. The first parameter of the geometry is the byte order (1 byte), the second, geometry type (4 bytes), and the third, number of elements (4 bytes). This sums to 9 bytes, and at this position (offset) in the binary string, we will find the number of elements of the given geometry type. The class-function drawLine deals with Linestrings and the function drawRings deals with Polygons. We will not deal any further with Multipolygon and Point geometries.

while($row = mysql_fetch_row($result)) {
$color=$row[1];
$g = unpack("Corder/Ltype/Lnum",$row[0]);
$type = $g['type'];
$num = $g['num'];
$offset = 9;
switch($type) {
case 1: // POINT
break;
case 2: // LINESTRING
$this->drawLine($offset,$num,$color,$row[0]);
break;
case 3: // POLYGON
$this->drawRings($offset,$num,$color,$row[0]);
break;
case 6: // MULTIPOLYGON
break;
}
}
mysql_free_result ($result);

4. Now we are ready to transform the coordinates to pixels on the map. We will begin with the Linestring-geometry. Offset is the position in the binary string ($row) where we will start to read. $numpts is the number of points in the Linestring. From the offset position in the buffer we start to unpack into the points-array (pts) eight-byte double-precision numbers (d) until we reach the end of the buffer (*). Next we assign an array geom, that will hold the resulting pixel-values of the points defining the Linestring. When reading the points-array, we know that the first number is the longitude and the second the latitude, on so forth. When reaching every second number we have a longitude/latitude pair that will be transformed using the gRef-function. The resulting pixel-values are assigned to the geom-array. Then we use the PHP graphics-function imageline to draw a line-segment between x1,y1 and x2, y2, in the given color.

function drawLine($offset,$numpts,$color,$row) {
$pts = unpack("@$offset/d*", $row);
$geom = array();
$lon=0;$lat=0;
$odd=true;
foreach ($pts as $value) {
$odd ? $lon=$value : $lat=$value;
if (!$odd) {
$this->gRef($lon, $lat);
$geom[]= $this->Xp();
$geom[]= $this->Yp();
}
$odd=!$odd;
} // end foreach
$max=($numpts-1)*2;
for ($i=0;$i < $max; $i+=2) {
imageline ( $this->img, $geom[$i], $geom[$i+1],
$geom[$i+2], $geom[$i+3], $color);
}
} // end function

5. Now we deal with the Polygons. In it's core this is the same function as the one for Linestrings, because polygons are Linestrings with the same start- and end-point.

function drawRings($offset,$numrings,$color,$row){
$off=$offset;
$x=$numrings;
while($x > 0) {
$h=unpack("@$off/Lnumpts",$row);
$numpts=$h['numpts'];
$off+=4;
$nump=$numpts*2;
$pts = unpack("@$off/d$nump", $row);
$geom = array();
$lon=0;$lat=0;
$odd=true;
foreach ($pts as $value) {
$odd ? $lon=$value : $lat=$value;
if (!$odd) {
$this->gRef($lon, $lat);
$geom[]= $this->Xp();
$geom[]= $this->Yp();
}
$odd=!$odd;
} // end foreach
imagefilledpolygon ( $this->img, $geom, $numpts, $color);
// If we wish to draw a border
imagepolygon ( $this->img, $geom, $numpts,
$this->bordercolor);
// Increase byte pointer
$off+=($nump*8);
$x--;
} // end while x
return($off);
} // end function

Wednesday 28 October 2009

Using MySQL spatial extensions in historical GIS

This post describes some aspects of using MySQL spatial extensions in the historical GIS application Regnum Francorum Online. Because I was already using MySQL to store evidence of historical events, the choise to try out the spatial extensions of MySQL that were introduced in version 4 was very close at hand. The geographical information system Regnum Francorum Online (RFO) is a MySQL database with a number of interrelated database-tables. The Events SQL-table holds information about the historical event (time, type of event, source-document etc.) and is linked to evidence of places and actors of the event. In turn, the Evidence of place is linked to entries in the Places-table which will be the example of this post. The Places SQL-table consists of the following columns: id, name, type, country, official geographic code and coordinates. The coordinates column (pt) is defined as the POINT-geometry type, holding a single longitude-latitude pair. Indexes are constructed for the columns id, name and a spatial index is constructed for the pt-column.
CREATE TABLE places (
id INT 8 UNSIGNED NOT_NULL AUTO_INCREMENT,
name BLOB(255) NOT_NULL,
type INT(3) UNSIGNED,
cc CHAR(2),
cog CHAR(8),
pt POINT NOT_NULL,
PRIMARY KEY (id),
KEY (name),
SPATIAL KEY(pt),
)

Now the Places-table is ready to be populated with data. In real life this has of course been a long process to collect data for more than 11,000 places currently in the database. From the beginning, name, country and coordinates were collected from the GeoNames geographical database. The GeoNames service is still the only service that I know of, which allows you to geocode adresses worldwide for free. Due to license restrictions in the Geocoding service of Google Maps API, retreiving coordinates for other purposes than showing a Google map is not allowed. Collecting official geographic codes is a work in progress, data is provided by state-agencies in the different countries. This far, geographic codes for France from INSEE has been added. This data is crucial for the distinct identification of places, and holds information about the administrative belonging of a certain place, e.g. Quierzy (a carolingian palace) has the code 02631, which can be translated into département Aisne, arrondissement Laon, canton Coucy-le-Château-Auffrique, commune Quierzy.
INSERT INTO 'places' VALUES (51,'Quierzy',3,'FR','02631',GeomFromText('POINT(3.1440379 49.5708778)') )

GeomFromText and Point are two MySQL-functions building the binary representation of the POINT Geometry-type. The more than 11,000 places in the FRO-database are now ready for very fast retrieval, based on SQL-queries that take the limits of the output map into account, together with other features of the map selected by the user.


Example: Coins of Pepin, Charlemagne and Carloman
Let's say we want to retrieve all evidence of mints in the Carolingian kingdom of the Francs. The evidence of mints are, for example, the different coins published in various catalogs. One such catalog is Les monnaies royales de France sous la race Carolingienne, deuxième partie by Ernest Gariel, Strasbourg 1884, available at Google Books, and the Internet Archive, containing all known coins at that time issued by king Pepin (752-768), Charlemagne (768-814) and Carloman (768-771). The evidence of the different coins with information of the mint on one side of the coin, and the name of the king on the other, is suitable to put in the Events SQL-table: no. 73. silver coin of type denier with inscription RP· | +TRI/CAS, that is, evidence of actor (RP = king Pepin 752-768), and evidence of place (TRI/CAS = Troyes, dép. Aube). The Evidence of place SQL-table is connecting the event with the place, holding ID:s of column-type UNSIGNED INTEGER(8) for the Events and Places SQL-tables respectively.


Follow this link to see this example in the real database application.


Retrieving data from a geometry column type POINT is very simple, use the MySQL-function AsText() to SELECT the values as a string-representation of the stored binary values, in the example above Troyes has the coordinates "4.0748 48.2975".
SELECT name, AsText(pt) FROM places
Individual values of longitude and latitude can be retreived with the functions X() and Y() respectively. However, we will take this a step further and show some other features of the spatial extensions as well. Values of a spatial enabled SQL-table can be selected using minimal bounding rectangles (MBR:s), in this case, all places with evidence of mint inside the minimal bounding rectangle defined by @g of the output map, will be selected. The first step is to set the MBR of the output map using it's upper-left and lower-right corner, expressed as two pairs of longitude and latitude. The $variables are defined in the PHP-environment.
mysql_query("SET @g = GeomFromText('LineString($longitude1 $latitude1, $longitude2 $latitude2)')");

Next we will compose the SQL-query for the evidence of all mint, within the selected time-period ($minyear - $maxyear). Placeindex is the name of the SQL-table holding the evidence of places in events, the event-type of minting is represented by the numerical value 163. The query uses GROUP BY to sum the rows by unique places of minting in the database, and to put the number of evidence in the cnt (count) column.
$query="SELECT
 placeindex.pid,
 places.name,
 X(places.pt),
 Y(places.pt),
 events.type IN(163) as mint,
 COUNT(*) as cnt
FROM placeindex, places, events
WHERE MBRWithin(places.pt,@g)
 AND placeindex.pid=places.id
 AND placeindex.eid=events.id
 AND events.type IN (163)
 AND events.maxyear >= '$minyear'
 AND events.maxyear <= '$maxyear'
GROUP BY placeindex.pid";

Now we perform our SQL-query and draw the map, which is saved as a PNG-image in a temporary-directory on the server. The member-function gRef() is where the fetched coordinates (decimal longitude and latitude) of the cities are transformed into pixel-cordinates in the UTM-projection of the current map. This is not explained in this post.
[PHP code (simplified) running inside our map-drawing class]:
$this->img = imagecreatetruecolor($this->width, $this->height);
$pic_coin = imagecreatefromgif ($_SERVER["DOCUMENT_ROOT"]."/pics/coin.gif");
$result = mysql_query($query);
while($row = mysql_fetch_row($result)) {
 $pid = $row[0];
 $oname=htmlspecialchars($row[1], ENT_QUOTES);
 $mint = $row[4];
 $this->gRef($row[2], $row[3]);
 $px= $this->Xp();
 $py= $this->Yp();
 imagecopymerge ($this->img, $pic_coin, $px-8, $py-8, 0, 0, 16, 16, 100);
} // end while
imagepng($this->img,$this->filename);
imagedestroy($this->img);

Conclusion
The conclusions are coming soon.