Issue with DSpace Connector job

Omeka-S 4.0.1
DSpace Connector 1.6.1

All,

I am trying to rerun a DSpace Connector job from last May. The initial run pulled in 62 items from our DSpace, but the rerun pulls in 0 items.

No errors: I can hit the DSpace API endpoint and get a proper JSON response, no prob.
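For anyone checking the same thing, here is a minimal Python sketch of that endpoint check. It assumes the search response carries its paging block under `_embedded.searchResult.page`, which should be verified against your own DSpace version; the helper names `fetch_json` and `total_elements` are made up for illustration and are not part of the DSpace Connector module.

```python
import json
import urllib.request


def fetch_json(url):
    """Fetch a URL and decode the body as JSON."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def total_elements(payload):
    """Return the item count reported by the search response, or None.

    Assumption: the count lives at _embedded.searchResult.page.totalElements.
    """
    page = payload.get("_embedded", {}).get("searchResult", {}).get("page", {})
    return page.get("totalElements")


if __name__ == "__main__":
    # collection_link copied from the job Args in this thread
    link = (
        "https://jscholarship.library.jhu.edu/server/api/discover/search/objects"
        "?dsoType=item&scope=1375a92e-f698-4796-b264-629cf74bc66b"
    )
    print(total_elements(fetch_json(link)))
```

If this prints a non-zero count, DSpace is returning items and the problem is more likely on the Omeka side.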

Not sure where else to look.

I’ve tried Un-Doing the job, then creating a new job – no joy. I’ve deleted all Items from last May and tried to run new job – nuttin: 0 items pulled in.

Not sure what to do!

Advice appreciated,

Mark

Hi Mark,

Could you share some information on the jobs that pulled in zero results? If you go to the DSpace Past Imports page and click on the Job ID hyperlink, it should take you to a page with information about the job, API, collections, etc.

I’m specifically looking for the Args field, and the Log (if there is one). If I could get the Args info for the initial successful import in May and then one for a newer import that is pulling in 0 items, it would help to figure out what’s going on or what changed.

Thanks, Matthew,

Here is Job info to last May’s successful Pull:


Status
    Completed
Started
    May 2, 2024 2:18:31pm
Ended
    May 2, 2024 2:31:10pm
Class
    DspaceConnector\Job\Import
Owner
    mcyzyk@jhu.edu
Args

    {
        "csrf": "dd981e559b886b4280bf3d9110bc7ac6-626dfbd65d4193974f01eef32e36b12d",
        "ingest_files": "1",
        "itemSets": [
            "4667"
        ],
        "itemSites": [
            "42"
        ],
        "ignored_fields": "",
        "comment": "",
        "collection_link": "https://jscholarship.library.jhu.edu/server/api/discover/search/objects?dsoType=item&scope=1375a92e-f698-4796-b264-629cf74bc66b",
        "api_url": "https://jscholarship.library.jhu.edu",
        "limit": "1000",
        "test_import": "0",
        "newAPI": "1"
    }

Log
    [no log] 
Log (database)
    [no log] 

And here is from today’s re-run:


Status
    Completed
Started
    September 12, 2024 9:03:54am
Ended
    September 12, 2024 9:04:19am
Class
    DspaceConnector\Job\Import
Owner
    mcyzyk@jhu.edu
Args

    {
        "csrf": "dd981e559b886b4280bf3d9110bc7ac6-626dfbd65d4193974f01eef32e36b12d",
        "ingest_files": "1",
        "itemSets": [
            "4667"
        ],
        "itemSites": [
            "42"
        ],
        "ignored_fields": "",
        "comment": "",
        "collection_link": "https://jscholarship.library.jhu.edu/server/api/discover/search/objects?dsoType=item&scope=1375a92e-f698-4796-b264-629cf74bc66b",
        "api_url": "https://jscholarship.library.jhu.edu",
        "limit": "1000",
        "test_import": "0",
        "newAPI": "1"
    }

Log
    [no log] 
Log (database)
    [no log] 

Comparing/contrasting now…
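A quick way to compare the two Args blocks mechanically rather than by eye is sketched below in Python. The values are copied from the two jobs above, with the `csrf` token left out for brevity; `diff_args` is a hypothetical helper name.

```python
def diff_args(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


COLLECTION_LINK = (
    "https://jscholarship.library.jhu.edu/server/api/discover/search/objects"
    "?dsoType=item&scope=1375a92e-f698-4796-b264-629cf74bc66b"
)

# Args from the May 2, 2024 job (csrf omitted)
may_args = {
    "ingest_files": "1",
    "itemSets": ["4667"],
    "itemSites": ["42"],
    "ignored_fields": "",
    "comment": "",
    "collection_link": COLLECTION_LINK,
    "api_url": "https://jscholarship.library.jhu.edu",
    "limit": "1000",
    "test_import": "0",
    "newAPI": "1",
}

# Args from the September 12, 2024 job: the same values as posted above
sept_args = dict(may_args)

print(diff_args(may_args, sept_args))  # prints {}
```

An empty result means the two jobs were launched with the same settings, so whatever changed is outside the job arguments.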

Thanks for this–I’m not seeing any obvious differences but I’m currently running an import on my local test instance with your same settings and will see what I can dig up.

I thought it may have something to do with our DSpace version (I am pretty sure we recently upgraded), but drilling down into that JSON file from the API endpoint, it sure seems like everything is there.

I have not touched the Omeka-S codebase or plugins between May and now!

Thinking…

Hi @mcyzyk,

I wonder if it has something to do with ingesting the files and the types and/or sizes you have housed in DSpace. I tried running the import yesterday and I got a few errors related to invalid extensions and thumbnail generation.

Errors:
{
    "o:media": [
        {
            "file": [
                "Error validating \"\/server\/api\/core\/bitstreams\/f6e4a80f-3baa-4fcf-aedd-b28deea1b597\/content\". Cannot store files with the resolved extension \"mpga\"."
            ]
        }
    ]
}

and

2024-09-12T14:17:18+00:00 ERR (3): convert: width or height exceeds limit `/tmp/omekaeFW4Uw' @ error/cache.c/OpenPixelCache/3909.
convert: no images defined `/tmp/omekaorSptW.jpg' @ error/convert.c/ConvertImageCommand/3229.

The import ended up importing about 300 items, but then just hung and never finished. This morning I kicked off an import without ingesting files, and it made it to 800 items before receiving a 503 error. I think this error is unrelated, so I re-ran the import and this time it imported 3,500 items before the 503 error. I’m wondering if Omeka is just importing too fast for DSpace. Regardless, I think your original issue might just be the types and/or sizes of files you have in DSpace.
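If the 503s really are DSpace asking the client to slow down, the usual remedy is to retry with a growing delay. The Python sketch below illustrates that idea only; it is not how the DSpace Connector module behaves, and `fetch_with_backoff` and `http_get` are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def http_get(url):
    """Plain GET; urlopen raises HTTPError on a 503 response."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_with_backoff(fetch, url, retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on HTTP 503 with exponentially growing delays.

    Any other HTTP error, or a 503 on the final attempt, is re-raised.
    """
    for attempt in range(retries):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            if err.code != 503 or attempt == retries - 1:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt
            sleep(base_delay * 2 ** attempt)
```

Usage would look like `fetch_with_backoff(http_get, page_url)` for each page of results, where `page_url` is whatever paged search URL you are requesting.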

Hmmm, I reverted my Omeka back to the exact version and DSpace module version you are on, imported the same collection with the same settings, and was able to both successfully import and successfully update.

@fackrellj may be on to something: DSpace can get funky/hang up if you’re trying to import too many items at once, especially with attachments. But that shouldn’t be an issue for the 82-item collection you’re currently importing.

Maybe try fully uninstalling/re-installing the DSpace module? Perhaps something weird is happening within the DB import records.

Hold the bus!

Our server was out of disk space.

30 GB added – and my DSpace Pull is now working.
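For anyone who lands here with the same symptom (job completes, no log, 0 items), a quick free-space check on the server is worth doing first. Below is a minimal Python sketch; the 1 GiB threshold is an arbitrary assumption, and the paths shown are placeholders, so point it at your own Omeka files directory and temp directory.

```python
import shutil
import tempfile


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path, min_free_gib=1.0):
    """True if the filesystem holding `path` has at least min_free_gib free."""
    return free_gib(path) >= min_free_gib


if __name__ == "__main__":
    # Placeholder paths: the system temp dir and the current directory
    for path in (tempfile.gettempdir(), "."):
        status = "ok" if has_room(path) else "LOW"
        print(f"{path}: {free_gib(path):.1f} GiB free ({status})")
```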

SORRY for the bother.

Mark

Well, at least it was a simple answer! Glad things are working now.