
Talbert's Rome's World Database

Creators: Tom Elliott
Copyright © The Contributors. Sharing and remixing permitted under terms of the Creative Commons Attribution 3.0 License (cc-by).
Last modified Jul 26, 2012 05:35 PM
Aligning entries in Talbert's database to Pleiades names will give us comprehensive representation of Peutinger map names in Pleiades.

The following description is an incomplete work-in-progress. Volunteers to assist with the work would be welcome.

Among the digital accessories[1] published alongside Richard Talbert's 2010 book Rome's World: The Peutinger Map Reconsidered (Cambridge) was a web representation of the content of the database Richard and I developed to support his research into the map. Here are the steps and tools I'm using to align it with Pleiades:

  • Save off the HTML full listing of the contents.
  • Write a simple XSLT to parse that HTML listing into Comma-Separated Value (CSV) format.
  • Load that CSV file into Google Refine, making sure to indicate explicitly that the encoding is UTF-8 (we don't want characters above base ASCII range borked).
  • In Google Refine, parse Richard's BAtlas citations for each TP feature into a form matching that used for old-style BAtlas IDs (e.g., where Richard supplies a string like "Abellinum 44 G4", we want to produce "abellinum-44-g4"). Regular expressions are our friend!
  • Download the file Sean Gillies created that crosswalks BAtlas IDs to Pleiades IDs.
  • Use the "cross" function in Google Refine to add Pleiades IDs (like 432618 for Abellinum) from the crosswalk file to the TP listing: this will tell us which Pleiades place resource each TP feature should be matched with. For the example used here, the Pleiades resource we want is http://pleiades.stoa.org/places/432618.
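
For readers who would rather prototype the ID-matching steps outside Google Refine, here is a minimal Python sketch of the same idea. It is not the Refine workflow itself: the function names (batlas_slug, pleiades_url) are made up for illustration, the slug rule (lowercase, whitespace to hyphens) is an assumption inferred from the single "Abellinum 44 G4" example, and the crosswalk is a one-row stand-in for the real file, whose layout is not described here.

```python
import re

def batlas_slug(citation: str) -> str:
    """Turn a BAtlas citation such as 'Abellinum 44 G4' into 'abellinum-44-g4'.

    Assumed rule: trim, collapse whitespace runs to single hyphens, lowercase.
    Real old-style BAtlas IDs may need extra handling (punctuation, diacritics).
    """
    return re.sub(r"\s+", "-", citation.strip()).lower()

# Hypothetical stand-in for the crosswalk file (BAtlas ID -> Pleiades ID).
# The real file would be read from disk, with encoding="utf-8" stated explicitly.
crosswalk = {"abellinum-44-g4": "432618"}

def pleiades_url(citation: str, crosswalk: dict) -> "str | None":
    """Look up the Pleiades place URL for a citation, or None if unmatched."""
    pleiades_id = crosswalk.get(batlas_slug(citation))
    if pleiades_id is None:
        return None
    return f"http://pleiades.stoa.org/places/{pleiades_id}"

print(pleiades_url("Abellinum 44 G4", crosswalk))
```

Returning None for unmatched citations makes it easy to list the TP features that still need manual review.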
 
Now we need to know if, for each match between a Pleiades place resource and a TP feature as described by Talbert, there's already a corresponding Pleiades name resource. If so, we want to update it with a citation of the TP itself and of Talbert's database entry. But if not, the precise string derived from Talbert's reading of the TP labels should be added as a new name resource, along with the appropriate citations. Given Google Refine's limited support for joins, we'll need to manufacture a single string that incorporates both the Pleiades ID and a regularized version of the name string itself. Something of the format {PleiadesID}:{NormalizedNameString} looks like it would work (e.g., 432618
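
The composite-key idea can be sketched in Python as follows. Everything specific here is an assumption: the normalization rule (strip diacritics, lowercase, collapse whitespace) is only one plausible reading of "regularized", the function names are hypothetical, and the ID and names are placeholders rather than real Pleiades data.

```python
import unicodedata

def normalize_name(name: str) -> str:
    """Assumed normalization: drop combining marks, lowercase, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.lower().split())

def join_key(pleiades_id: str, name: str) -> str:
    """Build a key of the form {PleiadesID}:{NormalizedNameString}."""
    return f"{pleiades_id}:{normalize_name(name)}"

def action_for(pleiades_id: str, name: str, existing_keys: set) -> str:
    """Return 'update' if a matching name resource exists, else 'add'."""
    return "update" if join_key(pleiades_id, name) in existing_keys else "add"

# Placeholder keys standing in for those derived from the Pleiades names dump.
existing_keys = {join_key("000000", "Exemplum Vetus")}

print(action_for("000000", "exemplum  vetus", existing_keys))
print(action_for("000000", "Exemplum Novum", existing_keys))
```

Building the key the same way on both sides (the TP listing and the names dump) is what makes a single-column lookup work as a two-column join.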

  • Download the latest names dump file (CSV) from Pleiades.
  • use the "cross" function again to 

 

[1] The top-level entry point for the digital materials can be a bit hard to find. It's at http://www.cambridge.org/us/talbert/. Elijah Meeks blogged about the materials back in 2011.