CSV Import gives ORMInvalidArgumentException

Hello,

This question follows on from a post of mine from a year ago that I haven't been able to resolve yet (BatchEdit and ORMInvalidArgumentException).

I am now trying to use CSV Import to revise a single metadata field on 3,000+ items in the system. The items are of many different types and classes. However, I start getting the error below after the first default batch of 20 completes successfully ("Number of rows to process by batch" in the Advanced Settings tab):

Doctrine\ORM\ORMInvalidArgumentException: A new entity was found through the relationship 'Omeka\Entity\Resource#resourceClass' that was not configured to cascade persist operations for entity: DoctrineProxies\__CG__\Omeka\Entity\ResourceClass@00000000000005ef0000000000000000. To solve this issue: Either explicitly call EntityManager#persist() on this unknown entity or configure cascade persist this association in the mapping for example @ManyToOne(..,cascade={"persist"}). If you cannot find out which entity causes the problem implement 'Omeka\Entity\ResourceClass#__toString()' to get a clue. in /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/ORMInvalidArgumentException.php:114
Stack trace:
#0 /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(3474): Doctrine\ORM\ORMInvalidArgumentException::newEntitiesFoundThroughRelationships()
#1 /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(385): Doctrine\ORM\UnitOfWork->assertThatThereAreNoUnintentionallyNonPersistedAssociations()
#2 /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/EntityManager.php(376): Doctrine\ORM\UnitOfWork->commit()
#3 /data/ibali/omeka-s/application/src/Api/Adapter/AbstractEntityAdapter.php(442): Doctrine\ORM\EntityManager->flush()
#4 /data/ibali/omeka-s/application/src/Api/Manager.php(233): Omeka\Api\Adapter\AbstractEntityAdapter->update()
#5 /data/ibali/omeka-s/application/src/Api/Manager.php(136): Omeka\Api\Manager->execute()
#6 /data/ibali/omeka-s/modules/CSVImport/src/Job/Import.php(796): Omeka\Api\Manager->update()
#7 /data/ibali/omeka-s/modules/CSVImport/src/Job/Import.php(424): CSVImport\Job\Import->updateRevise()
#8 /data/ibali/omeka-s/modules/CSVImport/src/Job/Import.php(296): CSVImport\Job\Import->update()
#9 /data/ibali/omeka-s/modules/CSVImport/src/Job/Import.php(194): CSVImport\Job\Import->processBatchData()
#10 /data/ibali/omeka-s/application/src/Job/DispatchStrategy/Synchronous.php(34): CSVImport\Job\Import->perform()
#11 /data/ibali/omeka-s/modules/Log/src/Job/Dispatcher.php(32): Omeka\Job\DispatchStrategy\Synchronous->send()
#12 /data/ibali/omeka-s/application/data/scripts/perform-job.php(66): Log\Job\Dispatcher->send()
#13 {main}
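In case it helps, here is my rough understanding of what the message means (a guess on my part, not confirmed against the actual CSV Import code): if the entity manager is cleared between batches, everything it has loaded becomes detached, including proxies like the ResourceClass in the trace, and a detached proxy attached back onto a managed resource is treated as a "new" entity on the next flush. A minimal sketch, assuming Doctrine ORM 2.x as bundled with Omeka S 3.x:

```php
<?php
// My guess at the mechanism, not the actual CSV Import code path.
/** @var \Doctrine\ORM\EntityManager $em */
$item = $em->find(\Omeka\Entity\Item::class, 123);
$class = $item->getResourceClass();       // managed ResourceClass proxy

$em->clear();                             // batch boundary: detaches $class

$item = $em->find(\Omeka\Entity\Item::class, 123); // managed again
$item->setResourceClass($class);          // $class is now detached ("new")
$em->flush();                             // -> ORMInvalidArgumentException

// Re-fetching a managed reference after clear() avoids the problem:
$class = $em->getReference(\Omeka\Entity\ResourceClass::class, $class->getId());
$item->setResourceClass($class);
$em->flush();                             // OK
```

If that is the mechanism, it would also explain why the first batch always succeeds: nothing has been detached yet at that point.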

If I increase the batch size to 200, the first 200 import fine before the error appears again.

I have tried disabling some modules as suggested in my previous post (Numeric Data Type, Bulk Edit), but that has had no effect. I can keep switching them off one by one, but are there any tips on what kind of module might be causing this conflict, if a module is indeed the cause? (There are a lot of modules…)

I also followed this link (Batch updating resources with a resource template may throw a UniqueConstraintViolationException · Issue #1690 · omeka/omeka-s · GitHub), but I think the version we have should already include that fix. I have reproduced the problem on both our prod and dev instances:

Omeka S
  Version: 3.2.1
PHP
  Version: 8.1.18
  SAPI: apache2handler
  Memory Limit: 1G
  POST Size Limit: 2G
  File Upload Limit: 1G
  Garbage Collection: Yes
  Extensions: apache2handler, bcmath, bz2, calendar, Core, ctype, curl, date, dom, exif, FFI, fileinfo, filter, ftp, gd, gettext, hash, iconv, intl, json, ldap, libxml, mbstring, mysqli, mysqlnd, openssl, pcre, PDO, pdo_mysql, pdo_pgsql, pgsql, Phar, posix, readline, Reflection, session, shmop, SimpleXML, soap, sockets, sodium, SPL, standard, sysvmsg, sysvsem, sysvshm, tokenizer, xml, xmlreader, xmlwriter, xsl, Zend OPcache, zip, zlib
MySQL
  Server Version: 5.7.42-0ubuntu0.18.04.1
  Client Version: mysqlnd 8.1.18
  Mode: ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, NO_ENGINE_SUBSTITUTION
OS
  Version: Linux 4.15.0-213-generic x86_64

and

Omeka S
  Version: 3.2.3
PHP
  Version: 8.1.2-1ubuntu2.15
  SAPI: apache2handler
  Memory Limit: 1G
  POST Size Limit: 2G
  File Upload Limit: 1G
  Garbage Collection: Yes
  Extensions: apache2handler, calendar, Core, ctype, date, dom, exif, FFI, fileinfo, filter, ftp, gd, gettext, hash, iconv, imagick, json, libxml, mbstring, mysqli, mysqlnd, openssl, pcre, PDO, pdo_mysql, Phar, posix, readline, Reflection, session, shmop, SimpleXML, sockets, sodium, SPL, standard, sysvmsg, sysvsem, sysvshm, tokenizer, xml, xmlreader, xmlwriter, xsl, Zend OPcache, zlib
MySQL
  Server Version: 8.0.36-0ubuntu0.22.04.1
  Client Version: mysqlnd 8.1.2-1ubuntu2.17
  Mode: ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_ENGINE_SUBSTITUTION
OS
  Version: Linux 5.15.0-105-generic x86_64

Otherwise, what would happen if I set "Number of rows to process by batch" to something ambitious like 3000? Though it would be better to solve the problem more permanently.

Many thanks, as always,

Sanjin

What version of CSV Import are you using?

Hi @jflatnes.

We have CSV Import 2.3.2, which I think is the latest version compatible with Omeka S 3. We will probably only move to Omeka S 4+ at the end of the year…

Sanjin

In CSV Import version 2.6.0 we made some fixes intended to solve the kind of problem you're describing here. Of course, as you say, that requires Omeka S 4.

Without upgrading, increasing the batch size is an option, but going all the way to thousands could be tricky. The reason we have the limit in place is that the library Omeka S uses to work with the database (Doctrine) starts to use a lot of memory and gets increasingly slow when working with lots of records at once. You probably wouldn't have big problems going into the hundreds for the batch size, but thousands could be an issue.
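For context, this kind of batching usually follows the standard Doctrine pattern of flushing and clearing every N records, which is why the batch size bounds memory use. A simplified sketch (not the actual Omeka S code):

```php
<?php
// Simplified sketch of the standard Doctrine batch pattern, not the
// actual Omeka S code. Flushing and clearing every $batchSize records
// keeps the unit of work from tracking thousands of entities at once.
/** @var \Doctrine\ORM\EntityManager $em */
$batchSize = 200;
foreach ($rowsToUpdate as $i => $row) {   // $rowsToUpdate: parsed CSV rows
    $item = $em->find(\Omeka\Entity\Item::class, $row['id']);
    // ... apply the metadata revision to $item ...
    if (($i + 1) % $batchSize === 0) {
        $em->flush();
        $em->clear(); // frees tracked entities, but detaches everything
    }
}
$em->flush(); // flush the final partial batch
```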

An option that would work is to split your input sheet into smaller pieces that each fit in a single batch at a larger but not total size, like 100 or 200, and do several imports; see the sketch below. You could also just test how your server handles increasing batch sizes if you want to push higher (slowness is going to be somewhat unavoidable at really large sizes, but the practical maximum will depend mostly on your server's configured PHP memory limit).
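If you go the splitting route, something quick like this would do it (a rough sketch; the input file name and the chunk size of 200 are placeholders to adjust):

```php
<?php
// Rough sketch: split a large CSV into batch-sized files, repeating the
// header row in each chunk. "items.csv" and the chunk size of 200 are
// placeholders; adjust them to your sheet and chosen batch size.
$chunkSize = 200;
$in = fopen('items.csv', 'r');
$header = fgetcsv($in);
$out = null;
$fileNo = 0;
for ($row = 0; ($fields = fgetcsv($in)) !== false; $row++) {
    if ($row % $chunkSize === 0) {
        if ($out !== null) {
            fclose($out);
        }
        $out = fopen(sprintf('items-%03d.csv', ++$fileNo), 'w');
        fputcsv($out, $header);
    }
    fputcsv($out, $fields);
}
if ($out !== null) {
    fclose($out);
}
fclose($in);
```

Each output file (items-001.csv, items-002.csv, …) can then be imported as its own job.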

Thanks so much, @jflatnes, for that explanation and the suggested steps. I will definitely give the splitting method a go.

Just out of curiosity: this seems to happen only with the revise, update, and append options. We haven't encountered this issue when simply adding items in a large batch…

See Import media and replace files for more information.

Thanks @ploc for the link to your issue, which seems to be persisting for me as well in relation to batch jobs…

The error is now also appearing in the Bulk Edit module (version 3.4.18 on Omeka S 3.2.1). I tried to run a bulk edit on all items in a particular item set (over 1,000) to change the items' owner, and I immediately get an error that reads very similarly to the one above:

Doctrine\ORM\ORMInvalidArgumentException: A new entity was found through the relationship 'Omeka\Entity\Resource#resourceClass' that was not configured to cascade persist operations for entity: DoctrineProxies\__CG__\Omeka\Entity\ResourceClass@00000000000006990000000000000000. To solve this issue: Either explicitly call EntityManager#persist() on this unknown entity or configure cascade persist this association in the mapping for example @ManyToOne(..,cascade={"persist"}). If you cannot find out which entity causes the problem implement 'Omeka\Entity\ResourceClass#__toString()' to get a clue. in /data/www/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/ORMInvalidArgumentException.php:114
Stack trace:
#0 /data/www/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(3474): Doctrine\ORM\ORMInvalidArgumentException::newEntitiesFoundThroughRelationships()
#1 /data/www/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(385): Doctrine\ORM\UnitOfWork->assertThatThereAreNoUnintentionallyNonPersistedAssociations()
#2 /data/www/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/EntityManager.php(376): Doctrine\ORM\UnitOfWork->commit()
#3 /data/www/omeka-s/application/src/Api/Adapter/AbstractEntityAdapter.php(487): Doctrine\ORM\EntityManager->flush()
#4 /data/www/omeka-s/application/src/Api/Manager.php(236): Omeka\Api\Adapter\AbstractEntityAdapter->batchUpdate()
#5 /data/www/omeka-s/application/src/Api/Manager.php(146): Omeka\Api\Manager->execute()
#6 /data/www/omeka-s/application/src/Job/BatchUpdate.php(30): Omeka\Api\Manager->batchUpdate()
#7 /data/www/omeka-s/application/src/Job/DispatchStrategy/Synchronous.php(34): Omeka\Job\BatchUpdate->perform()
#8 /data/www/omeka-s/modules/Log/src/Job/Dispatcher.php(32): Omeka\Job\DispatchStrategy\Synchronous->send()
#9 /data/www/omeka-s/application/data/scripts/perform-job.php(66): Log\Job\Dispatcher->send()
#10 {main}

I ran the same job on the copy of our Omeka S instance that runs 3.2.3, and there the error was:

Doctrine\ORM\ORMInvalidArgumentException: A new entity was found through the relationship 'Omeka\Entity\ResourceClass#vocabulary' that was not configured to cascade persist operations for entity: DoctrineProxies\__CG__\Omeka\Entity\Vocabulary@00000000000005d20000000000000000. To solve this issue: Either explicitly call EntityManager#persist() on this unknown entity or configure cascade persist this association in the mapping for example @ManyToOne(..,cascade={"persist"}). If you cannot find out which entity causes the problem implement 'Omeka\Entity\Vocabulary#__toString()' to get a clue. in /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/ORMInvalidArgumentException.php:114
Stack trace:
#0 /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(3474): Doctrine\ORM\ORMInvalidArgumentException::newEntitiesFoundThroughRelationships()
#1 /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php(385): Doctrine\ORM\UnitOfWork->assertThatThereAreNoUnintentionallyNonPersistedAssociations()
#2 /data/ibali/omeka-s/vendor/doctrine/orm/lib/Doctrine/ORM/EntityManager.php(376): Doctrine\ORM\UnitOfWork->commit()
#3 /data/ibali/omeka-s/application/src/Api/Adapter/AbstractEntityAdapter.php(487): Doctrine\ORM\EntityManager->flush()
#4 /data/ibali/omeka-s/application/src/Api/Manager.php(236): Omeka\Api\Adapter\AbstractEntityAdapter->batchUpdate()
#5 /data/ibali/omeka-s/application/src/Api/Manager.php(146): Omeka\Api\Manager->execute()
#6 /data/ibali/omeka-s/application/src/Job/BatchUpdate.php(30): Omeka\Api\Manager->batchUpdate()
#7 /data/ibali/omeka-s/application/src/Job/DispatchStrategy/Synchronous.php(34): Omeka\Job\BatchUpdate->perform()
#8 /data/ibali/omeka-s/modules/Log/src/Job/Dispatcher.php(32): Omeka\Job\DispatchStrategy\Synchronous->send()
#9 /data/ibali/omeka-s/application/data/scripts/perform-job.php(66): Log\Job\Dispatcher->send()
#10 {main}

So it can be either the resource class or the vocabulary?

However, if I do it on 50 items at a time, I don’t get the same error.

Is this something else that was fixed in later versions of the Bulk Edit module, or do I have something set up incorrectly in my Omeka S installation that I could change in the meantime (the MySQL mode, perhaps)? Or is there a chance that some other rogue module is causing this issue and I must toggle modules one by one to find it?

Any further guidance @jflatnes would be appreciated…

Sanjin

This same error can be caused by different things, so there's not necessarily one common culprit. The variant ploc points to, for example, relates specifically to CSV updates that replace all of an item's media with a different set.

If you're looking to rule out modules as a culprit, I'd suggest starting by deactivating them all at once: if that doesn't change anything, then you know the problem isn't module-related (except, of course, for the module you're actually using, where applicable). There's no need for a one-by-one investigation if the problem still happens with no modules at all.

This is unlikely to be related to the SQL mode or anything like that. Some of these problems are indeed things we've fixed; the CSV Import ones are a good example, where we have definitely made changes and fixed some of these situations since the versions you're working with.

Dear @jflatnes

Thanks so much for that advice and clarity. I took the plunge and switched off all modules, and at that point both the bulk edits and the CSV imports worked fine. I then switched modules back on one by one until I hit the problem again. I still have a couple more tests to do, but it seems that to avoid the ORMInvalidArgumentException error I had to do the following:

When using the Bulk Edit 3.4.18 module, I had to deactivate:

When using CSV Import 2.3.2, I had to deactivate:

I still need to confirm this, but I wanted to thank you for that problem-solving route…

I also notice that the modules to be switched off are very similar to those in @felix's post here: Doctrine throws ORMInvalidArgumentException on batch edit items