We would like to ‘force’ a reharvest any time the reharvest button is clicked (rather than only when an item record has been updated). We thought we had achieved this by commenting out the section of code that sets the startFrom date in the Harvest.listRecords method, as well as the date check in the Abstract._harvestLoop method. This appeared to trigger a forced reharvest (resumption tokens were being found), but the data in the items was not changing. Specifically, I manually changed a title to see if it would be updated, and the modified title remained after the harvest.
Furthermore, we also tried ‘forcing’ the harvest by actually modifying the datestamp in the item record but this still did not cause an update to the data in Omeka. It seems that a reharvest is only bringing in new items from the collection but not modifying the existing items. Can someone please verify this?
We need this functionality because we use some logic that updates a custom database based on the contents of the dc:identifier field. In many cases the item record itself does not change, but the transformation that produces the dc:identifier field does, so the datestamp would not be updated.
The harvester should update records that have changed, but it will only do so if the OAI record’s datestamp is more recent than it was when it was first harvested. You said you removed that logic, though, so it should be updating. Since you’re not getting duplicates, it’s correctly detecting the old records, so that’s not the problem.
Is it possible that there’s some mistake in the changes you made? I tried out a “fake” reharvest myself by editing the stored datestamps, just to confirm, and it correctly overwrote my changed text.
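To make the datestamp comparison described above concrete, here is a minimal sketch in Python. It is not the plugin's actual code (the plugin is PHP), and the names harvest_action and force are hypothetical, used only for illustration. It assumes datestamps are UTC strings in one of the two OAI-PMH formats.

```python
from datetime import datetime, timezone


def parse_oai_datestamp(value):
    """Parse an OAI-PMH UTC datestamp in day or seconds granularity."""
    fmt = "%Y-%m-%dT%H:%M:%SZ" if "T" in value else "%Y-%m-%d"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def harvest_action(stored_datestamp, incoming_datestamp, force=False):
    """Decide what to do with one harvested record (illustrative only).

    stored_datestamp is None when the record has never been harvested.
    force is a hypothetical flag that bypasses the datestamp comparison.
    """
    if stored_datestamp is None:
        return "insert"  # record not seen before, so create a new item
    if force:
        return "update"  # forced reharvest: overwrite regardless of dates
    incoming = parse_oai_datestamp(incoming_datestamp)
    stored = parse_oai_datestamp(stored_datestamp)
    # Overwrite only when the repository reports a newer datestamp.
    return "update" if incoming > stored else "skip"


print(harvest_action(None, "2020-09-24"))
print(harvest_action("2020-09-24", "2020-09-24"))
print(harvest_action("2020-09-24", "2020-09-24", force=True))
print(harvest_action("2020-09-01", "2020-09-24T10:00:00Z"))
```

Under these assumptions, an unchanged datestamp yields "skip" unless the comparison is bypassed, which is why editing the stored datestamp to an earlier value is enough to simulate a changed record.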
Hello, we have a similar problem. We are using Omeka as a metadata aggregator and have harvested multiple collections, but now a problem has come up: we tried to reharvest a collection from a DSpace repository that has about 10 new records, and it doesn’t work. Could you help us identify the problem? Thanks!
I tested this on my own install of the harvester and received the same error. I believe this is an issue with the server you are trying to harvest from and not the Omeka plugin.
You’ll see that the response indicates the same error you’re seeing logged. There are two problems here. One: the error message incorrectly refers to until when we actually passed from as a parameter. Two, and more importantly: the date we passed, 2020-09-24, is a valid value for that parameter. The OAI-PMH spec requires all repositories to support values of the from parameter expressed as a date, as in this example.
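For anyone who wants to reproduce this kind of request by hand, here is a rough Python sketch that builds a ListRecords URL with a from date. The base URL is a placeholder rather than the repository discussed here, and list_records_url is a hypothetical helper name. The check only confirms that the value has the shape of one of the two datestamp formats OAI-PMH defines (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ).

```python
import re
from urllib.parse import urlencode

# The two datestamp shapes defined by OAI-PMH (day and seconds granularity).
DAY = re.compile(r"^\d{4}-\d{2}-\d{2}$")
SECONDS = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")


def list_records_url(base_url, from_date, metadata_prefix="oai_dc"):
    """Build a selective-harvest ListRecords URL (illustrative helper)."""
    if not (DAY.match(from_date) or SECONDS.match(from_date)):
        raise ValueError(f"not an OAI-PMH datestamp: {from_date!r}")
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": from_date,
    })
    return f"{base_url}?{query}"


# Placeholder endpoint; substitute the repository's real OAI-PMH base URL.
print(list_records_url("https://repository.example.org/oai/request", "2020-09-24"))
# https://repository.example.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc&from=2020-09-24
```

Pasting a URL built this way into a browser shows the repository's raw response, which makes it easier to tell whether an error originates on the server or in the harvester.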