Monday, 16 November 2009

Historical GIS and Semantic web

The historical GIS application Regnum Francorum Online references historical events in time, in space and by agency, and links the events to source documents and literature available online. In doing so, the application becomes a GIS interface to a growing number of both primary and secondary sources online. This also includes the huge collection of articles in Wikipedia. To me, it has become evident that Wikipedia will become a major source for all kinds of knowledge in the future. Thus it is of great importance to examine more closely how Regnum Francorum Online can be integrated with Wikipedia.

Each article in Wikipedia has a unique tag, together with an ID, which is necessary to build a permanent link to the article, according to the instructions in Wikipedia. However, I have never seen a reference to Wikipedia that includes this ID. Not even in the semantic web project DBpedia, which has extracted and coded articles and their content into XML/RDF, including the geographic information about geographic features. The DBpedia project uses the same unique tags as Wikipedia, but has also collected the geographic features of the GeoNames project, which are identified with a unique numerical ID. GeoNames also geo-references articles in Wikipedia. Lately, the Wikipedia project has collected alternative identifiers of populated places: the codes of the local administrative units, which in the European Union are the basic units of official statistics. These units are municipalities (e.g. commune, Gemeinde). The tags of geographic features in Wikipedia and DBpedia are often just the official name. Alternative tags are also allowed, using redirects to the main article. Taking all this into consideration, a named geographic place within the EU can be uniquely identified with a commonly known combination of country code, administrative code and place name, e.g. Mommenheim in Rheinland-Pfalz, Germany (DE/07339037/Mommenheim), can be distinguished from Mommenheim in Alsace (FR/67301/Mommenheim). The tags in Wikipedia are Mommenheim (Germany) and Mommenheim,_Bas-Rhin (France) respectively. It is in this context of inter-linked resources between Wikipedia, GeoNames, DBpedia and national statistical agencies, both in HTML and XML/RDF format, that I would like to make the geographic features of Regnum Francorum Online inter-linked as well, maintaining the administrative code and the Wikipedia tags of geographical features. In almost all cases, an article about the history of a city can be found in the city article itself. 
There are a few exceptions, but in these cases the city article refers to the separate history article, e.g. the article about Lendorf in Austria refers to a separate article about the Roman municipium Teurnia.
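
The combination of country code, administrative code and place name described above can be sketched as a small data structure. This is only an illustration, not the way Regnum Francorum Online stores its places: the class name `PlaceId`, the `parse` helper and the `WIKIPEDIA_TAGS` table are assumptions of mine, and the only data used are the two Mommenheim examples from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceId:
    """Composite key: country code, administrative code, place name (hypothetical model)."""

    country: str      # e.g. "DE"
    admin_code: str   # kept as text so that leading zeros are preserved
    name: str         # official place name

    def __str__(self) -> str:
        return f"{self.country}/{self.admin_code}/{self.name}"

    @classmethod
    def parse(cls, text: str) -> "PlaceId":
        # Split on the first two slashes only, so a name containing "/" survives.
        parts = text.split("/", 2)
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected COUNTRY/CODE/NAME, got {text!r}")
        country, admin_code, name = parts
        return cls(country.upper(), admin_code, name)


# Hypothetical lookup from composite key to Wikipedia tag,
# filled with the two examples given in the text.
WIKIPEDIA_TAGS = {
    PlaceId.parse("DE/07339037/Mommenheim"): "Mommenheim",
    PlaceId.parse("FR/67301/Mommenheim"): "Mommenheim,_Bas-Rhin",
}
```

Because the key is frozen and therefore hashable, the two places named Mommenheim remain distinct dictionary keys even though their place names are identical.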

In both Regnum Francorum Online and Wikipedia/DBpedia there are other geographic features as well, namely institutions of the state (kingdom, Latin regnum; county, pagus/comitatus; march, marchae; duchy, ducatum) and of the church (bishopric, Latin episcopatum; monastery, monasterium). The tags identifying these institutions in Wikipedia are not as consistent and predictable as those of populated places. The articles about bishoprics reflect the current division of the Catholic Church and distinguish between ancient and current dioceses, e.g. there is an article about Roman_Catholic_Diocese_of_Passau and another one about Prince-Bishopric_of_Augsburg, each containing historical information about its bishopric. In Regnum Francorum Online these are implemented as bishopric/Passau and bishopric/Augsburg respectively. Furthermore, articles about the history of monasteries are tagged with the place name and the suffix _Abbey, e.g. Lorsch_Abbey, corresponding to monastery/Lorsch in Regnum Francorum.
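
The correspondence between the Wikipedia tags and the Regnum Francorum identifiers mentioned above could be expressed as a few prefix and suffix rules. The following is a rough sketch under that assumption: the `RULES` table and the function `institution_key` are hypothetical names, the rules cover only the three patterns named in the text, and real Wikipedia tags are less regular than this.

```python
from typing import Optional

# (prefix, suffix, institution type). Illustrative rules only,
# derived from the three examples in the text.
RULES = [
    ("Roman_Catholic_Diocese_of_", "", "bishopric"),
    ("Prince-Bishopric_of_", "", "bishopric"),
    ("", "_Abbey", "monastery"),
]


def institution_key(wikipedia_tag: str) -> Optional[str]:
    """Return a key such as 'monastery/Lorsch', or None if no rule matches."""
    for prefix, suffix, kind in RULES:
        if wikipedia_tag.startswith(prefix) and wikipedia_tag.endswith(suffix):
            # Strip the matched prefix and suffix to obtain the bare place name.
            name = wikipedia_tag[len(prefix):len(wikipedia_tag) - len(suffix)]
            if name:
                return f"{kind}/{name}"
    return None
```

Tags that match no rule return `None`, so they can be collected and linked by hand instead of being guessed at.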

When it comes to territorial subdivisions (institutions) of the kingdom, the confusion in Wikipedia becomes bigger. From the perspective of early medieval European history, it would be desirable to have tags describing the traditional divisions of provinces and kingdoms, and maybe that will come in the future. In the English Wikipedia there is a short listing of Carolingian counties containing 7 entries. This category more or less corresponds to the listing of Gau (pagus) in the German Wikipedia, containing 147 entries of Gau/Gaugrafschaft situated mainly in modern Germany, Austria and Switzerland. For modern France there is a corresponding category, Liste historique des comtés français (in English: list of the historical French counties), referring to different kinds of articles, such as lists of counts or articles about a historical region. Obviously these categories are still under development.

In Wikipedia there are also categories of historical events (battles, treaties) that are exciting to take a closer look at. Articles about such events are often well written and have substantial content, e.g. Battle of Poitiers 732. In the category Battles involving the Franks there are currently 30 entries. Unfortunately, the implementation of events in Regnum Francorum is somewhat ambiguous at the moment, suffering from the original implementation as historical documents rather than events. Later it became evident to me that a historical source document can contain information about several events distinct from each other in time, space and/or by agency. Consequently, source information about the battle of Poitiers can be retrieved from the database in terms of place and time, but not by searching for a record of the battle of Poitiers directly.
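
To illustrate the difference between documents and events, here is a minimal sketch of a model in which one source document can yield several events, and the events are found by place and time. It does not show the actual database of Regnum Francorum Online: the `Event` class, the `events_at` function, the source identifiers and the placeholder records are all made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    source_id: str  # the document the event was extracted from
    place: str
    year: int
    agent: str


def events_at(events: List[Event], place: str, year_from: int, year_to: int) -> List[Event]:
    """Select events by place and an inclusive range of years."""
    return [e for e in events if e.place == place and year_from <= e.year <= year_to]


# Made-up records: "source-A" describes two distinct events,
# and two different sources mention the same place and year.
EVENTS = [
    Event("source-A", "Poitiers", 732, "Franks"),
    Event("source-A", "placeholder-place", 733, "placeholder-agent"),
    Event("source-B", "Poitiers", 732, "Franks"),
]

# Query by place and time; no record is named after the battle itself.
hits = events_at(EVENTS, "Poitiers", 732, 732)
```

In this sketch the battle has no record of its own. What the query returns is every event at Poitiers in 732, each pointing back to the source document it came from.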

Well, to summarize: in order to inter-link Regnum Francorum with other significant websites like Wikipedia and GeoNames, common identifiers of places and institutions must be maintained. This is also the first step towards full integration into the future semantic web.