Port Omeka Classic Scripto Content into Omeka S Scripto

We’re thinking about upgrading from Classic to Omeka S, but we need to be able to take our Scripto content with us - we have tens of thousands of transcriptions - does anyone know if that’s possible?

There is no migration path for Scripto from Classic to S. While they both use MediaWiki for revision control, the project management, review process, and overall workflow are completely different, as are the code bases and the critical interactions between Scripto and MediaWiki.

That is not to say that it’s impossible, but any solution will have to be customized to your individual project’s needs. One way to make the transition is, in Classic, stop all new transcriptions, complete all unfinished ones, import all transcriptions, and record the IDs of all items that still need transcriptions. Then, in S (assuming you’ve successfully imported your items and metadata), start a Scripto project containing all the un-transcribed items, mapping between Classic’s item ID and S’s item ID.

Is there any way to import text into the transcription field of the new (Omeka S) installation of Scripto? I’ve been exporting the transcription metadata and the transcriptions themselves through Omeka’s API with Python and bash (I put it into JSON to power a text search of the transcriptions on our landing page), so I can put the data into whatever format is needed. I know you can import into MediaWiki, but I’m not sure how to associate that text with Omeka files.

(Also, is there a way to leverage MediaWiki’s search capabilities? It seems silly to be using my own (far less featured) search when MediaWiki’s search is perfectly adequate and requires no import/export.)

If you’re using your own script, you can conceivably map data however you want. It’s not straightforward, though. I suppose you could use the Omeka2Importer module to import the items and metadata. Then create a Scripto project containing all the un-transcribed items. Then use your script to map the Classic MediaWiki pages to their corresponding S MediaWiki pages using their respective title algorithms, found in:

  • Classic: /libraries/Scripto/Document.php - Scripto_Document::encodeBaseTitle()
  • S: /src/Entity/ScriptoMedia.php - ScriptoMedia::getMediawikiPageTitle()

I still think my recommendation above is easier. It’s something we’ve already done when upgrading Scripto from a proprietary database to Omeka S. Again, consider the following:

In Classic:

  • Stop all new transcriptions
  • Complete all unfinished transcriptions
  • Import those transcriptions as item metadata using Classic Scripto’s import feature
  • Take note of the IDs of all un-transcribed items so you can map to their corresponding S items

In S:

  • Import items and metadata from Classic using the Omeka2Importer module (this will include the completed transcriptions that you previously imported)
  • Create an item set containing all items that are un-transcribed (Omeka2Importer automatically maps the IDs in its omekaimport_record table, so you can use that as a reference)
  • Create a Scripto project using the item set and begin transcribing!
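
For the ID-mapping step, a minimal sketch of how a script might translate the recorded Classic item IDs into S item IDs. It assumes you have already exported the Classic-to-S ID pairs from the importer’s mapping table into a plain dictionary; the function name, the dictionary, and the IDs shown are hypothetical, and the table’s actual column names are not given in this thread.

```python
def map_untranscribed(classic_ids, id_map):
    """Translate Classic item IDs to Omeka S item IDs.

    classic_ids: Classic item IDs that still need transcription.
    id_map: dict of Classic item ID -> S item ID (hypothetical export
            of the importer's mapping table).
    Returns (S IDs in input order, Classic IDs that had no mapping).
    """
    mapped, missing = [], []
    for classic_id in classic_ids:
        if classic_id in id_map:
            mapped.append(id_map[classic_id])
        else:
            missing.append(classic_id)
    return mapped, missing


# Made-up IDs, for illustration only.
example_map = {1206: 5010, 1207: 5011, 1300: 5120}
s_ids, unmapped = map_untranscribed([1206, 1300, 1999], example_map)
print("Add to item set:", s_ids)
print("No mapping found:", unmapped)
```

Reporting the unmapped IDs separately makes it easier to spot items that did not come across in the import before the item set is built.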

As for leveraging MediaWiki’s search capabilities: I’m not sure I know what you mean.

Sorry it’s been a while! As for everyone, there’s been a lot going on. I think, given our situation, Omeka S isn’t going to help much. I think this has become a “searching Scripto transcriptions” question.

I think what would be most helpful would be a way to determine a correlation between omeka item and page to the scripto mediawiki page that contains the transcription. For example, for our project, this item:


corresponds with this mediawiki url:


so if I could tie MTIwNg.OTMwNDk to item: 1206, page: 93049, I could then use MediaWiki’s search API to return the results of a text search, transform each result into an item/page, and provide all the results as links to Omeka item/pages.
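
A rough sketch of that idea in Python. It assumes, based only on the example above, that the page title is the item ID and the page ID, each encoded as unpadded URL-safe base64 and joined by a dot; the authoritative rule is whatever the installed Scripto code does. The `action=query` / `list=search` parameters are part of MediaWiki’s Action API, while the function names, the `example.org` URLs, and the link template are hypothetical.

```python
import base64
from urllib.parse import urlencode


def b64url_decode(part):
    """Decode unpadded URL-safe base64 to text."""
    padded = part + "=" * (-len(part) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def decode_title(title):
    """Return (item_id, page_id) for a title such as 'MTIwNg.OTMwNDk'.

    A leading dot is tolerated in case the installation prefixes titles
    (an assumption). Raises ValueError for titles that do not fit.
    """
    item_part, page_part = title.lstrip(".").split(".")[:2]
    return int(b64url_decode(item_part)), int(b64url_decode(page_part))


def search_url(api_url, query, limit=50):
    """Build a full-text search request for MediaWiki's Action API."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def results_to_links(response, link_template):
    """Turn a parsed search response into Omeka links.

    Titles that do not decode (for example 'Main Page') are skipped.
    """
    links = []
    for hit in response.get("query", {}).get("search", []):
        try:
            item_id, page_id = decode_title(hit["title"])
        except ValueError:
            continue
        links.append(link_template.format(item=item_id, page=page_id))
    return links


# Hypothetical response, trimmed to the one field used here.
sample = {"query": {"search": [{"title": "MTIwNg.OTMwNDk"}, {"title": "Main Page"}]}}
print(search_url("https://example.org/w/api.php", "ledger"))
print(results_to_links(sample, "https://example.org/transcribe/{item}/{page}"))
```

Fetching `search_url(...)` with any HTTP client and passing the parsed JSON to `results_to_links` would give live results without the daily export.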

As it is, I’m pulling the transcription for every file from Omeka’s API (once a day, because it’s a 20-minute operation), then searching the data on the client side in JavaScript, which gives day-old results at best. On top of that, I can’t give my users immediate feedback about completed items: they complain that they transcribe a page, go back to the item page to transcribe the next one, and the page they just did is still marked as “not transcribed”.

Enhanced MediaWiki integration would allow better search results and more immediate feedback.

Does any of that make sense? Or have I fallen so deep into the rabbit hole that I’ve gone mad?

I think what would be most helpful would be a way to determine a correlation between omeka item and page to the scripto mediawiki page that contains the transcription.

Please refer to the /libraries/Scripto/Document.php file, specifically the encodeBaseTitle() method, for the title algorithm Classic Scripto uses to map between an Omeka item and a MediaWiki page. You could use this same algorithm in your own scripts, for whatever purpose. Or perhaps I just misunderstand exactly what you are asking for.
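
As an illustration of reusing the algorithm in a script, here is a minimal Python sketch of the encoding direction. It is inferred from the example title earlier in this thread (MTIwNg.OTMwNDk for item 1206, page 93049), not copied from Document.php, so treat the details as assumptions: unpadded URL-safe base64 of each ID, joined by a dot. The `prefix` parameter is hypothetical and is there in case your installation adds one; check encodeBaseTitle() for the actual behavior.

```python
import base64


def b64url_encode(value):
    """Encode a value as unpadded URL-safe base64 text."""
    raw = str(value).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def encode_title(item_id, page_id, prefix=""):
    """Build a MediaWiki page title from an Omeka item ID and page ID.

    Assumed format: <prefix><base64(item)>.<base64(page)>
    """
    return prefix + b64url_encode(item_id) + "." + b64url_encode(page_id)


print(encode_title(1206, 93049))  # prints MTIwNg.OTMwNDk
```

With the title in hand, a script can ask MediaWiki about a specific page directly rather than waiting for a bulk export.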

Thank you - I’ve been using that decoder ring to great effect. Thank you thank you thank you!
