Talbert's Rome's World Database

Creators: Tom Elliott
Copyright © The Contributors. Sharing and remixing permitted under terms of the Creative Commons Attribution 3.0 License (cc-by).
Last modified Jul 26, 2012 05:35 PM
Aligning entries in Talbert's database to Pleiades names will give us comprehensive representation of Peutinger map names in Pleiades.

The following description is an incomplete work-in-progress. Volunteers to assist with the work would be welcome.

Among the digital accessories[1] published alongside Richard Talbert's 2010 book Rome's World: The Peutinger Map Reconsidered (Cambridge) was a web representation of the content of the database Richard and I developed to support his research into the map. Here are the steps and tools I'm using to align it with Pleiades:

  • Save off the HTML full listing of the contents.
  • Write a simple XSLT to parse that HTML listing into Comma-Separated Value (CSV) format.
  • Load that CSV file into Google Refine, making sure to indicate explicitly that the encoding is UTF-8 (we don't want characters outside the basic ASCII range borked).
  • In Google Refine, parse Richard's BAtlas citations for each TP feature into a form matching that used for old-style BAtlas IDs (e.g., where Richard supplies a string like "Abellinum 44 G4", we want to produce "abellinum-44-g4"). Regular expressions are our friend!
  • Download the file Sean Gillies created that crosswalks BAtlas IDs to Pleiades IDs.
  • Use the "cross" function in Google Refine to add Pleiades IDs (like 432618 for Abellinum) from the crosswalk file to the TP listing: this will tell us which Pleiades place resource each TP feature should be matched with. For the example used here, the Pleiades resource we want is
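The ID regularization and crosswalk lookup from the list above can be sketched outside Google Refine as well. What follows is a minimal Python illustration, not the actual Refine expressions: the function names `batlas_id` and `pleiades_id` are hypothetical, the in-memory dictionary stands in for the real crosswalk file (whose layout is not described here), and the rule "lowercase, then turn whitespace into hyphens" is an assumption that fits the Abellinum example but may not cover every citation Richard supplies.

```python
import re


def batlas_id(citation):
    """Turn a citation such as 'Abellinum 44 G4' into 'abellinum-44-g4'.

    Assumption: old-style BAtlas IDs are the lowercased citation with
    each run of whitespace replaced by a single hyphen.
    """
    return re.sub(r"\s+", "-", citation.strip()).lower()


def pleiades_id(citation, crosswalk):
    """Look up the Pleiades ID for a citation, or None if it is unmatched."""
    return crosswalk.get(batlas_id(citation))


# Hypothetical stand-in for the crosswalk file: BAtlas ID -> Pleiades ID.
crosswalk = {"abellinum-44-g4": "432618"}

print(batlas_id("Abellinum 44 G4"))
print(pleiades_id("Abellinum 44 G4", crosswalk))
```

Returning None for an unmatched citation makes it easy to list the TP features that still need attention by hand.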
Now we need to know if, for each match between a Pleiades place resource and a TP feature as described by Talbert, there's already a corresponding Pleiades name resource. If so, we want to update it with a citation of the TP itself and of Talbert's database entry. But if not, the precise string derived from Talbert's reading of the TP labels should be added as a new name resource, along with the appropriate citations. Given Google Refine's limited support for joins, we'll need to manufacture a single string that incorporates both the Pleiades ID and a regularized version of the name string itself. Something of the format {PleiadesID}:{NormalizedNameString} looks like it would work (e.g., 432618
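As a rough sketch of that composite key, the Python below builds a {PleiadesID}:{NormalizedNameString} string and checks it against a set of keys built the same way from existing name resources. The normalization rule is not settled above, so the one used here (strip diacritics, lowercase, collapse whitespace) is purely an assumption, and `normalize_name`, `join_key` and the sample data are hypothetical.

```python
import unicodedata


def normalize_name(name):
    """Regularize a name string (assumed rule, see the note above)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and collapse any run of whitespace to a single space.
    return " ".join(stripped.lower().split())


def join_key(pleiades_id, name):
    """Build the single string used to join TP features to Pleiades names."""
    return "{}:{}".format(pleiades_id, normalize_name(name))


# Hypothetical keys derived from existing Pleiades name resources.
existing_keys = {join_key("432618", "Abellinum")}

# A TP feature whose key is present needs only new citations;
# one whose key is absent needs a new name resource.
tp_key = join_key("432618", "  ABELLINUM ")
print(tp_key in existing_keys)
```

Building both sides of the join with the same function is the point of the exercise: any inconsistency in normalization would show up as a spurious "missing" name.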

  • Download the latest names dump file (CSV) from Pleiades.
  • Use the "cross" function again to 


[1] The top-level entry point for the digital materials can be a bit hard to find. It's at Elijah Meeks blogged about the materials back in 2011.