I have an Omeka S (v4.1.1) installation with 1.8M items and 614K media and growing.
On certain places within admin I am really experience degraded performance. For example, the /admin/item page takes 8-9 seconds to load. On the front-end the performance is still ok, also because of a load-balanced MariaDB (v11) cluster on serious hardware and of course Solr (on yet another server). Still, I hope optimizations can be done to improve the performance of the back-end.
To this end I am looking for expensive queries which show up in the mariadb-slow.log, like the query for the /admin/item page (pretty formatted by me):
SELECT
r0_.id AS id_0, r0_.title AS title_1, r0_.is_public AS is_public_2, r0_.created AS created_3, r0_.modified AS modified_4,
r0_.resource_type AS resource_type_5, r0_.owner_id AS owner_id_6, r0_.resource_class_id AS resource_class_id_7,
r0_.resource_template_id AS resource_template_id_8, r0_.thumbnail_id AS thumbnail_id_9, i1_.primary_media_id AS primary_media_id_10
FROM
item i1_
INNER JOIN
resource r0_ ON i1_.id = r0_.id
GROUP BY
r0_.id
ORDER BY
r0_.created DESC, r0_.id DESC
LIMIT 20 OFFSET 80;
which takes over 7-8 seconds. The EXPLAIN of this query looks like this:
An improvement could be to add an index:
CREATE INDEX idx_resource_created_id ON resource (created DESC, id DESC);
Note: this index only covers this option out of all sort options. So you could argue this is not a scalable solution as you have to create a lot of indexes for all possible sorts (both ascending and descending). Bu as the standard sort on the /admin/item is “descending by created”, in practice this would be very beneficial.
With this index in use (*) the query responds in 0.0017 seconds! With an EXPLAIN of:
*) Strangely, after adding the index, the query optimizer didn’t pick the index, I had to “strongly suggest” to use the index. I also had to remove the GROUP BY (which isn’t necessary anyway because item:resource is always 1:1, no?):
SELECT
r0_.id AS id_0, r0_.title AS title_1, r0_.is_public AS is_public_2, r0_.created AS created_3, r0_.modified AS modified_4,
r0_.resource_type AS resource_type_5, r0_.owner_id AS owner_id_6, r0_.resource_class_id AS resource_class_id_7,
r0_.resource_template_id AS resource_template_id_8, r0_.thumbnail_id AS thumbnail_id_9, i1_.primary_media_id AS primary_media_id_10
FROM
item i1_
INNER JOIN
resource r0_ USE INDEX (idx_resource_created_id) ON i1_.id = r0_.id
# resource r0_ ON i1_.id = r0_.id
#GROUP BY
# r0_.id
ORDER BY
r0_.created DESC, r0_.id DESC
LIMIT 20 OFFSET 80;
I hope this kind of insight helps in optimizing the ORM PHP to increase performance. I’d appreciate advice on how to give insights or to directly improve performance.