Import a lot of media into existing items

Hello,
I’m using Omeka S version 3.2.3.
I would like to complete some item records, which another team created a long time ago, with PDF media that I have on my workstation.
My first intention was to create a php script to modify the tables in the Omeka database directly, but on reflection, this doesn’t seem to be a good idea.
I’ve seen that you can batch import media into Omeka S. I’m a real novice and I’m still having a bit of trouble with the Omeka vocabulary. That’s why I’m having trouble understanding the requests that deal with this problem in the forum.
I could make a CSV file with one column containing the item ID (the internal ID of the database), another with the title of the document and another with the path of the file on the server (there is no other information to provide on PDFs).
Is this enough to carry out the import, and how do I go about it?
Thank you in advance for trying to help me.

Hi @vince2corte,

I think you’re on the right track.

I would recommend using the CSV Import module together with the File Sideload module (both are documented in the Omeka S User Manual).

Together these modules will let you upload the PDFs to your server, reference them in your CSV, and then import them as Media attached to the associated Item.

At the risk of being that guy, have you looked at the manual for the module? It walks you through the steps reasonably well.
https://omeka.org/s/docs/user-manual/modules/csvimport/
(Specifically the Import Media section, but the whole thing is very good.)

The short version of the important parts is: set the Import Type to Media on the first screen, set the Action to Append in the Advanced Settings tab of the second screen, and pay attention to the choice about what to do if a resource isn’t found (“Action on unidentified resources”).
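If it helps, here is a minimal Python sketch for building the three-column CSV described in the question (item ID, title, file). The column names `item_id`, `title` and `file` are my own placeholders, not names the module requires; you map each column yourself on the second screen of CSV Import. I am also assuming that the `file` column holds the file name relative to the File Sideload directory, so please check that against your own configuration. The script skips any record whose PDF is not actually present in the folder and returns those separately, so they can be fixed before the import.

```python
import csv
from pathlib import Path


def build_media_rows(records, pdf_dir):
    """Split records into rows ready for the CSV and records whose file is missing.

    records: iterable of (item_id, title, filename) tuples (hypothetical input;
             item_id is the internal Omeka S ID of the existing item).
    pdf_dir: local folder holding the PDFs that will be copied to the
             sideload directory.
    """
    rows, missing = [], []
    for item_id, title, filename in records:
        if (Path(pdf_dir) / filename).is_file():
            rows.append({"item_id": item_id, "title": title, "file": filename})
        else:
            missing.append((item_id, title, filename))
    return rows, missing


def write_media_csv(rows, csv_path):
    """Write the rows with a header line; the column names are placeholders."""
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["item_id", "title", "file"])
        writer.writeheader()
        writer.writerows(rows)
```

You would call `build_media_rows(...)` with your own list of records, look at the `missing` list, and then pass the rows to `write_media_csv(...)` to get the file to upload in CSV Import.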


Thank you very much. I will try to use CSV Import and File Sideload.
I’ll report back as soon as possible.

Thank you for your advice. Yes, I went through this documentation, but as I’m a total novice and the vocabulary isn’t easy (English isn’t my native language ;) ), I preferred to ask the forum for more precise advice.
What’s more, I have very limited time, so I’m aiming for maximum efficiency.
But I’m certainly going to need to take a serious look at it anyway.

Heard! I tried to word the recommendation of the manual gently, because I hate it when people just respond with RTFM. And yes, reading in a second or other language can be hard. I do think, though, that in this case anything I wrote to describe the steps would just be restating the manual. It’s quite good for this module.

Hello everyone! A little feedback and a question:
I’ve managed to work out how to import my 3500 PDF files! It wasn’t very complicated, but there were a lot of tiny details to sort out so that things went smoothly. Thanks for your advice.
My question is this: the files should only be accessible to a certain group of users. Is it possible to set this at import time? If not, can it be done in batch? If I have to retrieve the IDs of the media from the database, I should be able to do that. (3500 files to modify! It would be terrible not to be able to do it automatically.)
Thank you in advance for any suggestions you may have.

Just one final bit of feedback: I was finally able to do everything I wanted:

  • importing the PDF files with the CSV Import module. It’s actually quite simple once you get past the vocabulary barrier
  • automatic generation of thumbnails for the PDF files with the “Create missing thumbnails” module (note: the “biblibre” version, not the official one). In version 3.23, I had to correct a bug in the “vips” call: replace the “[255 255 255]” parameter with “[255 255 255 255]” in the “Vips.php” file. Not an easy one to find!
  • assigning groups to set media viewing rights: there’s a way of doing this by group in the Omeka S interface. All I had to do was fill an otherwise unused Dublin Core field with a value that serves only to find these media in a search (otherwise it wouldn’t be easy to isolate 3,500 media records from 60,000).
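For anyone who hits the same thumbnail bug, the “Vips.php” correction from the list above could be applied with a small script rather than by hand. This is only a sketch: the function name is made up for illustration, the location of “Vips.php” depends on your installation (so the path is left as a parameter), and it simply swaps the one literal string mentioned above after saving a `.bak` copy next to the file.

```python
from pathlib import Path

OLD_PARAM = "[255 255 255]"      # value reported above as causing the bug
NEW_PARAM = "[255 255 255 255]"  # corrected value reported above


def patch_vips_background(php_file):
    """Replace OLD_PARAM with NEW_PARAM in php_file and return the match count.

    A backup named <file>.bak is written before the file is changed.
    Running the function again finds nothing to replace and leaves the
    file untouched.
    """
    path = Path(php_file)
    text = path.read_text(encoding="utf-8")
    count = text.count(OLD_PARAM)
    if count:
        path.with_name(path.name + ".bak").write_text(text, encoding="utf-8")
        path.write_text(text.replace(OLD_PARAM, NEW_PARAM), encoding="utf-8")
    return count
```

A return value of 0 means the string was not found, either because the file is already corrected or because your version of the module is different.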

Thanks to those who helped me.