Importing items with value annotations

Hi,

we’re migrating a very large and complex database to Omeka S. At the moment I’m working out the data model.

As I want to be able to modify vocabularies (including the deletion of classes and properties) and templates while exploring the possibilities, I thought it best to create vocabularies, templates and items by API with a script. That way I can always start again with a clean Omeka S installation when I decide to go back a step and take another route. Everything is reproducible, and I can use a version control system like git to track what I do.

With some effort I found out how to do this for vocabularies (you have to use json files, right?) and templates. This hint was a great help: Upload resource template via API - #6 by jflatnes. As explained in that thread, I wrote a script that uses the API to look up the internal IDs of the template properties and the resource_class before uploading the template files. This way I can upload, by API, template files that I have exported from an Omeka S installation.

I’m now trying to do the same for items. The problem is that I have items that use value annotations. What is the best way to import these items into an Omeka S instance? I assume csv files won’t work for this, since they don’t support value annotations, right?

I can use JSON and the API to create items, and I can get the property IDs for each property using the API. However, I’m not sure how to get the correct resource template ID for the import. How can I solve this problem?

When I manually create an item in Omeka S and look at it under http://xyz.de/omeka-s/api/items the item begins like this:

{
    "@context": "http://xyz.de/omeka-s/api-context",
    "@id": "http://xyz.de/omeka-s/api/items/7",
    "@type": [
        "o:Item",
        "schema:Person"
    ],
    "o:id": 7,
    "o:is_public": true,
    "o:owner": {
        "@id": "http://xyz.de/omeka-s/api/users/1",
        "o:id": 1
    },
    "o:resource_class": {
        "@id": "http://xyz.de/omeka-s/api/resource_classes/107",
        "o:id": 107
    },
    "o:resource_template": {
        "@id": "http://xyz.de/omeka-s/api/resource_templates/3",
        "o:id": 3
    },
    ...
}

The json lacks the resource template name. This is an issue because I don’t want to rely on internal IDs in my files: these IDs can change if I delete and recreate a template or move items between different Omeka S installations. Ideally, for maximum flexibility, I’d like to be able to upload json data that I’ve got from the API without having to modify it manually.

So I think this comes down to two questions:

  1. How do I import data with value annotations (without knowing the internal IDs that are specific to an installation)? This is the same as this question: How do I export items with value annotations from one Omeka S instance and upload them to another?
  2. Is it a bug that the json data of an item does not contain the label of the resource template?

Kind regards,
Martin

How do I import data with value annotations (without knowing the internal IDs that are specific to an installation)? This is the same as this question: How do I export items with value annotations from one Omeka S instance and upload them to another?

Use the @annotation key when posting an item to the REST API. You will need to know the internal property IDs. There’s no way around that. For example:

"dcterms:title": [
    {
      "type": "literal",
      "property_id": 1,
      "@annotation": {
        "dcterms:hasVersion": [
          {
            "type": "literal",
            "property_id": 28,
            "@value": "1.0"
          }
        ],
        "dcterms:hasFormat": [
          {
            "type": "literal",
            "property_id": 38,
            "@value": "Text"
          }
        ]
      },
      "@value": "My Item"
    }
  ]
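
If the payload is generated by a script, the same structure can be built from property terms instead of being written by hand. A minimal Python sketch, assuming a `property_ids` dictionary (term to internal ID) has already been filled from the target installation; the helper name `literal` is made up for illustration and the IDs are simply the ones from the example above:

```python
import json


def literal(term, value, property_ids, annotations=None):
    """Build one literal value for `term`, optionally with value annotations.

    `annotations` maps an annotating property term to a list of literal strings.
    """
    entry = {
        "type": "literal",
        "property_id": property_ids[term],
        "@value": value,
    }
    if annotations:
        # Each annotation is itself a list of ordinary values under its own term.
        entry["@annotation"] = {
            ann_term: [literal(ann_term, v, property_ids) for v in ann_values]
            for ann_term, ann_values in annotations.items()
        }
    return entry


# Hypothetical lookup table; real IDs differ between installations.
property_ids = {"dcterms:title": 1, "dcterms:hasVersion": 28, "dcterms:hasFormat": 38}

payload = {
    "dcterms:title": [
        literal(
            "dcterms:title",
            "My Item",
            property_ids,
            annotations={"dcterms:hasVersion": ["1.0"], "dcterms:hasFormat": ["Text"]},
        )
    ]
}

print(json.dumps(payload, indent=2))
```

The printed fragment has the same shape as the example and can be merged into the body of the item that is posted to the API.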

Is it a bug that the json data of an item does not contain the label of the resource template?

It’s not a bug.

When building and re-building your installation, if you create things in a particular order, caching the IDs while you go, it shouldn’t be a problem to refer to existing IDs when creating or updating resources.

Thanks for the quick reply.

I’m creating things in a particular order. But even then IDs might change. I’m using templates for the value annotations (using the module Advanced Resource Template), so when I flesh out a particular template, the list of templates I need to create before it (because I reference them) might grow, resulting in changed IDs. When I add or delete properties from a vocabulary, IDs might change as well.

caching the IDs while you go

What do you mean? How do I do that?

The Advanced Resource Template module is maintained by a third-party developer. Perhaps you’ll get more informed answers by contacting them directly.

I don’t see any way around keeping track of the IDs and, when they change, updating their references wherever needed. You can cache the IDs in memory or some other way. It’s not a trivial undertaking, but it may be your only option.

I’ve now written a script that takes my item json file, queries the API with the name of the resource class to get the internal ID and goes through the properties to get the IDs for them from the API. I put the json file into a directory with the name of the resource template, so I take the name of the directory to query the API for the ID for that.

It’s not perfect but I think that works sufficiently well for me.
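
The approach described above could be sketched roughly as follows. This is an illustrative reconstruction, not the script itself: the function names and the layout of the `ids` lookup table are assumptions, the resource class is taken from the second entry of `@type`, and the template label is passed in by the caller (for example, taken from the directory name).

```python
import copy


def resolve_ids(item, ids, template_label):
    """Return a copy of an API item whose internal IDs match the target installation.

    `ids` is assumed to look like
    {"properties": {term: id}, "resource_classes": {term: id},
     "resource_templates": {label: id}}.
    """
    item = copy.deepcopy(item)

    # Identifiers of the item itself belong to the source installation.
    for key in ("@id", "o:id"):
        item.pop(key, None)

    # "@type" is either "o:Item" or a list such as ["o:Item", "schema:Person"].
    types = item.get("@type", [])
    if isinstance(types, str):
        types = [types]
    class_term = next((t for t in types if not t.startswith("o:")), None)
    if class_term is not None:
        item["o:resource_class"] = {"o:id": ids["resource_classes"][class_term]}

    item["o:resource_template"] = {"o:id": ids["resource_templates"][template_label]}

    _fix_properties(item, ids["properties"])
    return item


def _fix_properties(container, property_ids):
    """Rewrite property_id in every value list, descending into @annotation."""
    for term, values in container.items():
        is_value_list = (
            isinstance(values, list)
            and values
            and all(isinstance(v, dict) and "property_id" in v for v in values)
        )
        if not is_value_list:
            continue
        for value in values:
            value["property_id"] = property_ids[term]
            annotation = value.get("@annotation")
            if isinstance(annotation, dict):
                _fix_properties(annotation, property_ids)
```

The result of `resolve_ids(item, ids, "Person")` (where "Person" is a made-up template label) could then be serialised with `json.dumps` and posted to the items endpoint.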

Would putting the name of the template into the json that’s returned by the API be a feature request? Or is there a reason not to do that?

Hi,

In our recent major migration, we did something very similar: our script called the Omeka API first to load a ‘config map’ of all classes, properties, templates, custom vocabs, etc. In our script we referred to everything by its natural class/property/template name, but then mapped those names to the actual IDs just before calling the API to add/update items.

John.
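
For readers who want to try the ‘config map’ idea, here is a minimal Python sketch. It is not the script from the migration mentioned above. It assumes that properties and resource classes expose their prefixed name as `o:term`, that templates expose `o:label`, and that the endpoints accept `page` and `per_page`; the function names and the injectable `get` parameter are made up for illustration, and no authentication is sent.

```python
import json
import urllib.request


def _http_get(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fetch_all(base_url, resource, per_page=100, get=None):
    """Page through <base_url>/api/<resource> until an empty page comes back."""
    get = get or _http_get
    results, page = [], 1
    while True:
        batch = get(f"{base_url}/api/{resource}?page={page}&per_page={per_page}")
        if not batch:
            return results
        results.extend(batch)
        page += 1


def load_config_map(base_url, get=None):
    """Build name-to-ID lookup tables once, before any item is created."""
    return {
        "properties": {
            p["o:term"]: p["o:id"] for p in fetch_all(base_url, "properties", get=get)
        },
        "resource_classes": {
            c["o:term"]: c["o:id"] for c in fetch_all(base_url, "resource_classes", get=get)
        },
        "resource_templates": {
            t["o:label"]: t["o:id"] for t in fetch_all(base_url, "resource_templates", get=get)
        },
    }
```

With the map loaded once, something like `ids["properties"]["dcterms:title"]` replaces a separate API call for every lookup.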

calling the Omeka API first to load a ‘config map’ of all classes, properties, templates, custom vocabs, etc.

That’s a good idea. At the moment I call the API for every ID I want to have. That’s OK for now, but when I get to importing our whole database, loading everything up front will speed things up considerably, I assume.

I didn’t understand at first why you wrote that. Maybe I’ve now realized what you mean: I was using the template json that Omeka S gives you when you click “Export” in the GUI. I didn’t realize that the json that the API gives you is different from that. The json from the manual GUI export gives you vocabulary_namespace_uri, vocabulary_label, local_name and label of a property. The API version just gives you the internal ID, no namespace or label. I guess you meant that you need to keep track of what that ID means.

There are more differences between the two versions. One difference that gave me difficulties is how the data types of the template properties are encoded: In the API version they are given as “o:data_type”, in the GUI export version as “data_types”.

Could I ask you to mention in the documentation that there are important differences like that between the two versions? That might have saved me the time it took to find out why the properties of my uploaded template didn’t have any data types.