Improve Curatescape performance

My Omeka application uses the Curatescape theme. The map loads very slowly: over 5 seconds for fewer than a thousand items. I’ve traced the slowness to the call to the CuratescapeJSON plugin. I’m not 100% sure whether the problem lies more in the PHP code or in the database, but the biggest time sink I can find is repeated calls to this query:

SELECT `elements`.*, `element_sets`.`name` AS `set_name` 
FROM `omeka_elements` AS `elements` 
LEFT JOIN `omeka_element_sets` AS `element_sets` 
ON element_sets.id = elements.element_set_id 
WHERE (element_sets.record_type = 'Item' OR element_sets.record_type IS NULL) 
ORDER BY `elements`.`element_set_id` ASC, ISNULL(elements.order) ASC, `elements`.`order` ASC

An EXPLAIN on that query shows:
Using temporary; Using filesort

I’ve played around with some MySQL variables, but they don’t seem to help.

That same query is called for each item’s metadata load (about 3000 times for my current data set), and the result would be identical on every call. My next step was going to be installing memcached, but I wanted to check first how it should be configured for Omeka.

Or if anyone’s got ideas on improving Curatescape load time, that would be great.

Thanks so much

Yeah… this call is just used to get a list of all the elements that are available for use on Items, and you’re right: the result is the same every time you call it. There’s no good reason it should be getting run over and over.

I’m not very familiar with CuratescapeJSON, so I couldn’t say for sure whether this is likely to be a problem with the plugin. It’s possible that something in the Omeka core non-obviously causes all the extra queries, or that some call in the plugin itself sits inside a loop when it’s supposed to be outside, or something similarly simple.
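To make the "call inside a loop" possibility concrete, here is a minimal Python sketch. It is not Omeka or CuratescapeJSON code: `ElementSource`, `export_slow` and `export_fast` are hypothetical names, and the counter simply stands in for the number of database queries issued.

```python
class ElementSource:
    """Hypothetical stand-in for an expensive lookup whose result never changes."""

    def __init__(self):
        self.calls = 0  # counts how many times the "query" ran

    def fetch_item_elements(self):
        self.calls += 1
        return ["Title", "Description", "Subject"]


def export_slow(source, item_ids):
    # Lookup inside the loop: one "query" per item.
    return [{"id": i, "fields": source.fetch_item_elements()} for i in item_ids]


def export_fast(source, item_ids):
    # Lookup hoisted outside the loop: one "query" in total.
    elements = source.fetch_item_elements()
    return [{"id": i, "fields": elements} for i in item_ids]
```

Both functions return the same data; only the number of lookups differs, which is the kind of difference a query log would reveal.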

Hi there. Feel free to poke around the plugin repo here:

https://github.com/CPHDH/CuratescapeJSON

I’d welcome pull requests.

Thus far, none of the sponsored Curatescape projects have had more than a few hundred items, so we haven’t really pushed it too hard. I’m sure it can be improved.

Thanks. I’ll see what I can figure out.

That query is the result of running findByRecordType on the Element table. This gets run by our code in the core in a few places, one of which is the ElementText mixin which pretty much all the metadata code runs through.

However, the mixin is written to statically store the list of elements for a type, so it should only ever execute that query once in a request. So, if that’s really getting run 3,000 times to load the map, either something’s been changed or is broken with that “caching” of the elements, or there are 3,000 different requests happening (since PHP doesn’t share that state across requests).
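The static "caching" described above can be sketched roughly as follows. This is a Python illustration of the general pattern, not the actual mixin code: `RecordMetadata`, `elements_for` and `_run_query` are hypothetical names. Note also that a Python class attribute lives for the whole process, whereas PHP static state is rebuilt on every request.

```python
class RecordMetadata:
    """Hypothetical sketch of a per-record helper with a shared, static cache."""

    _elements_by_type = {}  # class-level: shared by every instance
    query_count = 0         # how many times the "query" actually ran

    @classmethod
    def _run_query(cls, record_type):
        # Stand-in for the real database query.
        cls.query_count += 1
        return [f"{record_type}:Title", f"{record_type}:Description"]

    def elements_for(self, record_type):
        cache = type(self)._elements_by_type
        if record_type not in cache:
            # First caller pays for the query; later callers reuse the result.
            cache[record_type] = self._run_query(record_type)
        return cache[record_type]
```

With this pattern, creating one helper per item and calling `elements_for("Item")` on each still triggers only a single query, which is why 3,000 executions would point to a broken cache or to many separate requests.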

Thanks. I will clear the query log and re-run it. I pulled statistics from the first item’s queries, and I see that later item queries in the loop do not execute that same statement.

Just to close the loop here. I wound up getting rid of the loop inside the plugin. It’s not very elegant, but it dropped the load time from over 8 seconds down to under 1 second.
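For readers wondering what removing a per-item loop can look like in general, one common approach is to fetch the data for all items in a single query and group the rows in memory. The Python sketch below uses an in-memory SQLite database; the `texts` table, its columns and both function names are assumptions made for illustration, and it does not describe what the file linked in this thread actually does.

```python
import sqlite3
from collections import defaultdict


def load_texts_batched(conn, item_ids):
    """Fetch texts for all items with one query instead of one query per item."""
    ids = list(item_ids)
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT item_id, name, text FROM texts "
        f"WHERE item_id IN ({placeholders}) ORDER BY item_id, id",
        ids,
    )
    grouped = defaultdict(dict)
    for item_id, name, text in rows:
        grouped[item_id][name] = text  # group rows by item in memory
    return dict(grouped)


def make_demo_db():
    """Build a tiny hypothetical table to run the sketch against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE texts (id INTEGER PRIMARY KEY, item_id INTEGER, name TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO texts (item_id, name, text) VALUES (?, ?, ?)",
        [
            (1, "Title", "First item"),
            (1, "Description", "Demo text"),
            (2, "Title", "Second item"),
        ],
    )
    return conn
```

Calling `load_texts_batched(make_demo_db(), [1, 2])` issues one SELECT regardless of how many item ids are passed, up to the database's limit on bound parameters.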

See https://github.com/PortSideNewYork/rhws-curatescape-json/blob/master/views/shared/items/browse.mjson.php

@davidalevine, FYI, we’ve been working on this issue and should have a release in the near future that greatly speeds up the JSON output for larger queries. Stay tuned.

In the meantime, feel free to post future Curatescape-specific questions and comments to our new forum at http://forum.curatescape.org/