I hope this is the right location in the forums. I am using the most current version of Omeka (2.7.1) on a shared hosting solution with a LAMP stack. Omeka installed just fine and is running well! The chief problem is getting a few hundred collections, tens of thousands of items, and a few hundred thousand files into Omeka. I had written a Python script to digest our current IIIF content, create collections, items, and files (rescaled for the project), and then push them to the Omeka API.
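For context, the push side of the script boils down to something like this (trimmed, and with placeholder URL, key, and element IDs, so treat it as a sketch of the pattern rather than my exact code):

```python
import requests

API_BASE = "https://example.org/api"  # placeholder for our Omeka install
API_KEY = "..."                       # per-user key from the Omeka API settings

def create_item(collection_id, texts):
    """POST one item to the Omeka Classic REST API.

    `texts` is a list of (element_id, value) pairs; the element IDs
    are placeholders here, not the real ones from our install.
    """
    payload = {
        "collection": {"id": collection_id},
        "public": True,
        "element_texts": [
            {"element": {"id": eid}, "text": value, "html": False}
            for eid, value in texts
        ],
    }
    resp = requests.post(f"{API_BASE}/items",
                         params={"key": API_KEY},
                         json=payload)
    resp.raise_for_status()
    return resp.json()["id"]
```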
This was working splendidly until we got a nastygram about exceeding our server usage by 300%–2,300%. I can't figure out what I am doing that would cause so much server load. I have since implemented some caching so that I don't have to query whether an item exists every single time; on my local server that wasn't a big deal, but trying to squeeze every cycle out of our paid solution, I rethought that approach.
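The caching is nothing fancy, basically memoizing the identifier-to-item-ID lookups to a local JSON file so that only a miss hits the API. The `lookup` callable here stands in for whatever function actually queries `/api/items`; everything else is a minimal sketch:

```python
import json
import os

CACHE_PATH = "item_cache.json"  # local map of our identifiers -> Omeka item IDs

def load_cache():
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return json.load(f)
    return {}

def save_cache(cache):
    with open(CACHE_PATH, "w") as f:
        json.dump(cache, f)

def item_id_for(identifier, cache, lookup):
    """Return the Omeka item ID for one of our identifiers, or None.

    `lookup` is whatever function actually queries the API; it is
    only called on a cache miss, instead of once per item as before.
    """
    if identifier in cache:
        return cache[identifier]
    item_id = lookup(identifier)
    if item_id is not None:
        cache[identifier] = item_id
        save_cache(cache)
    return item_id
```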
Historically I have gone with the bigger-hammer approach and just pumped data in as fast as I can, but here I can't transfer fast enough to make the hit a one-time thing (a permission-vs-forgiveness thing).
Feedback from the provider says the long-running SQL statements, in order, are:
- INSERT INTO omeka_element_texts
- INSERT INTO omeka_files
- INSERT INTO omeka_keys
- INSERT INTO omeka_items
- INSERT INTO omeka_search_texts
The SQL statements after those account for only a minor portion of the processing time.
Any thoughts on how I can optimize these? Or should I scrap the API idea and go directly to the database? I can't tell whether SQL transactions are being used, or at what level, so it is possible the keys are rebuilt on every single insert instead of once per batch. I have had that problem with other systems.
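For what it's worth, if I did go direct, I was picturing something like the sketch below: collecting rows and committing once so any index maintenance is paid per batch rather than per row. I have not verified Omeka's actual schema, so the column list is my best guess and purely illustrative:

```python
import pymysql

def bulk_insert_element_texts(rows):
    """Insert many element-text rows in a single transaction.

    The column list is a guess at omeka_element_texts, not verified
    against the real schema; the point is the one-commit batching.
    """
    conn = pymysql.connect(host="localhost", user="omeka",
                           password="...", database="omeka",
                           autocommit=False)
    try:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO omeka_element_texts"
                " (record_id, record_type, element_id, html, text)"
                " VALUES (%s, %s, %s, %s, %s)",
                rows,
            )
        conn.commit()  # one commit for the whole batch
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
```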
I am stumped here. As it is shared hosting, I am having to go back and forth with their techs to get server stats, so it has been slow going to get this far. Thanks for any assistance you can offer.