CSV import: URI as resource identifier

In a particular dataset I have a URI (in the property owl:sameAs, ‘Hetzelfde als’ in Dutch) that I want to use as the resource identifier on CSV import.

This doesn’t seem to work: new items are created and existing ones aren’t updated! Is CSV import only looking at the value of an item, not the URI?

FYI: if I change the owl:sameAs to a literal in the dataset and in the CSV import, then items are updated.

What you map the column as won’t matter when doing the update; that setting is only used for making the value that gets inserted and won’t get looked at when we’re doing identifier lookups. Of course what the already-inserted values actually are will make a difference, and in that sense the mappings used on previous imports can matter.

You have it right: the code that looks up those identifiers only looks at the “value” column of the value object. That is fine for a literal, but for URI types a different column holds the URI, and we’re not looking at that one.
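To illustrate the behaviour described above, here is a minimal sketch in Python with an in-memory SQLite database. The table `item_value`, its columns, and the function `find_by_value_column` are hypothetical simplifications made up for this example; they are not the module's actual schema or code.

```python
import sqlite3

# Hypothetical, simplified stand-in for the table that stores property values.
# A URI-typed value keeps its URI in the "uri" column; a literal uses "value".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_value ("
    "resource_id INTEGER, property TEXT, type TEXT, value TEXT, uri TEXT)"
)
conn.executemany(
    "INSERT INTO item_value VALUES (?, ?, ?, ?, ?)",
    [
        (1, "owl:sameAs", "uri", None, "http://example.org/id/1"),
        (2, "owl:sameAs", "literal", "http://example.org/id/2", None),
    ],
)


def find_by_value_column(identifier):
    """Look up resources by comparing the identifier to the "value" column only."""
    rows = conn.execute(
        "SELECT resource_id FROM item_value WHERE property = ? AND value = ?",
        ("owl:sameAs", identifier),
    ).fetchall()
    return [row[0] for row in rows]


# The URI-typed value is never found, so an import would create a new item.
uri_typed_matches = find_by_value_column("http://example.org/id/1")
# The literal is found, so an import would update the existing item.
literal_matches = find_by_value_column("http://example.org/id/2")
print(uri_typed_matches, literal_matches)
```

Under these assumptions the first lookup comes back empty and the second finds resource 2, which matches what was reported: updates only work once owl:sameAs is stored as a literal.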

We can fix this… I’ll have to think a little about whether I want to introduce a switch that changes the type of identifier (so there would be another dropdown in the Advanced Settings tab named something like “Resource identifier type”) or whether I want to have us prospectively look for URI matches for any identifier. I’m leaning toward the dropdown.

What do you mean by “prospectively look for URI matches for any identifier”? No switch, and just look in both the ‘value’ and ‘uri’ columns? I can imagine that option is slower than the switch option, since with the switch only one column is searched.

From a user perspective, the switch makes the import/mapping/update process more transparent. In linked data we prefer links over strings and URIs over literals, so I’m happy you’re thinking of including this in the CSV Import module!

Yes, that’s what I meant there, and yes it would be slower. I’ll mark you down in the “add an extra configuration for this” column.
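For comparison, the two options discussed in this thread could look roughly like the following Python sketch. Everything here is an assumption for illustration: the `item_value` table, the `identifier_type` argument, and both function names are hypothetical and do not describe how the module is or will be implemented.

```python
import sqlite3

# Hypothetical, simplified value table: literals in "value", URIs in "uri".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_value ("
    "resource_id INTEGER, property TEXT, value TEXT, uri TEXT)"
)
conn.executemany(
    "INSERT INTO item_value VALUES (?, ?, ?, ?)",
    [
        (1, "owl:sameAs", None, "http://example.org/id/1"),
        (2, "owl:sameAs", "http://example.org/id/2", None),
    ],
)

# Fixed mapping from the (hypothetical) dropdown choice to a column name,
# so no user input is ever interpolated into the SQL string.
COLUMN_FOR_TYPE = {"literal": "value", "uri": "uri"}


def find_with_switch(identifier, identifier_type):
    """Option 1: a setting picks which single column is searched."""
    column = COLUMN_FOR_TYPE[identifier_type]
    sql = f"SELECT resource_id FROM item_value WHERE property = ? AND {column} = ?"
    rows = conn.execute(sql, ("owl:sameAs", identifier)).fetchall()
    return [row[0] for row in rows]


def find_in_both_columns(identifier):
    """Option 2: no setting; every lookup checks both columns."""
    sql = (
        "SELECT resource_id FROM item_value "
        "WHERE property = ? AND (value = ? OR uri = ?)"
    )
    rows = conn.execute(sql, ("owl:sameAs", identifier, identifier)).fetchall()
    return [row[0] for row in rows]


print(find_with_switch("http://example.org/id/1", "uri"))
print(find_in_both_columns("http://example.org/id/2"))
```

The second function needs no configuration but compares against two columns on every lookup, which is the extra cost mentioned above; the first keeps each lookup to one column and makes the choice explicit to the user.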