Thumbnails not auto-generating

I’ve loaded a couple of items (all PDFs) recently where the thumbnails have not automatically generated. Instead of creating images of the first page, there is a blank white square. This has never been a problem before, but seems to be happening more frequently in the last couple of weeks. Has anyone else encountered this problem?

Does anyone have any ideas on how to resolve this issue? I have been saving individual images and overwriting the items that do not have thumbnails, but now even that is not working properly…

If you re-upload an “old” PDF that didn’t have the problem, do the thumbnails work or not?

If previously-working files now don’t, it’s likely something changed on your server (with ImageMagick, most likely).

If the previously-working files do still work, then it’s more likely that there’s something different about your newer PDFs. A type of problem people sometimes run into is that they have PDF-reading software installed on their server (Ghostscript), but the version they have doesn’t support the kinds of images their scanned PDFs contain. Usually the problem is JPEG2000 images, and the solution is updating Ghostscript.

I just re-uploaded an “old” PDF, and the thumbnails are not working for an older file. The server that hosts our Omeka instance was updated about a month ago, so that must be the culprit. I’ll contact them to see what needs to be changed on the server. (By the way, I tried the “Test” button on the ImageMagick Directory Path in Settings, and it does say the ImageMagick directory path works…)

Since this thread relates to our environment I figured I’d update here with some additional information. The error log is showing:

2018-07-10T19:14:46+00:00 WARN (4): Error output from ImageMagick:
**** ERROR: Unable to process JPXDecode data. Page will be missing data.

For what it’s worth we tested with other files and they all work so it’s specific to the way certain ones are created. That does sound like an encoding issue @jflatnes but we are running a very standard default CentOS 7 server which has the following:

GPL Ghostscript 9.07 (2013-02-14)
ImageMagick 6.7.8-9

Are those out of date and CentOS just isn’t pushing newer versions upstream?

It certainly looks like it’s a JPEG2000 issue with your Ghostscript version, judging by the “JPXDecode” part of the error output.

I’m not sure it’s even really a case of being out of date exactly, but more of a disagreement between the Ghostscript maintainers and the Red Hat maintainers (and therefore CentOS). “Normal” Ghostscript ships its own bundled versions of libraries like the JPEG2000 decoder, and Red Hat takes a pretty hard line against using bundled libraries.

As best I can tell, the result seems to be that the RHEL/CentOS Ghostscript package uses the older/abandoned Jasper library for JPEG2000 support, rather than openjpeg2, which is what the current “stock” Ghostscript uses.

Where this puts you, I’m not sure… “Stock” Ghostscript is actually pretty simple to build since it bundles its dependencies, so that’s an option… they also offer pre-built binaries.

Such an odd issue. I went as far as downloading a pre-built binary of the latest Ghostscript, and that doesn’t work either. Then I tried ImageMagick 7 just to see if that would make a difference, and now I still get no thumbnail but also nothing in the error log to go on. It’s gotta be something about how the PDF was created that either Ghostscript or ImageMagick just balks at.

Do you happen to have a copy of the/a problem PDF that can be shared? I’m wondering if I can reproduce the problem.

It’s very likely to be Ghostscript that’s the “interesting” piece here, not ImageMagick. For PDF reading ImageMagick more or less just runs Ghostscript on the command line.
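Since ImageMagick mostly hands PDFs off to Ghostscript, one way to narrow this down is to run Ghostscript by hand on the problem file and see whether the JPXDecode error shows up with neither ImageMagick nor Omeka involved. A minimal sketch, assuming a POSIX shell: render_first_page and GS_BIN are names made up for this example, and the output device and resolution are arbitrary choices, not necessarily what ImageMagick’s delegate passes.

```shell
# Which Ghostscript binary to test. GS_BIN is an illustrative variable,
# not an Omeka or ImageMagick setting; it defaults to whatever "gs" is on PATH.
GS_BIN="${GS_BIN:-gs}"

# render_first_page PDF OUT.png
# Renders only page 1 of the PDF to a PNG using Ghostscript directly.
render_first_page() {
    pdf="$1"
    out="$2"
    "$GS_BIN" -dBATCH -dNOPAUSE -dSAFER \
        -sDEVICE=png16m -r72 \
        -dFirstPage=1 -dLastPage=1 \
        -sOutputFile="$out" "$pdf"
}
```

For example, GS_BIN=/usr/local/bin/gs followed by render_first_page problem.pdf page1.png would test one specific install. If the output image is blank or Ghostscript prints the same JPXDecode error, the problem is in that Ghostscript build rather than in ImageMagick.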

One thing to keep in mind if you’re trying to use a different Ghostscript for ImageMagick: if you’re installing it to a nonstandard path like /usr/local/bin or under /opt, you may need to update ImageMagick’s delegates.xml file to force it to use the “good” Ghostscript by, for example, updating calls to gs to /usr/local/bin/gs. Making sure the appropriate folder is first in the PATH environment variable probably also works fine, but that can be annoying to set in the web context sometimes.
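To see which Ghostscript would win on PATH when more than one is installed, something like the following may help. It is only a sketch: list_gs_on_path is a hypothetical helper name, and the PATH seen by your login shell may differ from the one the web server process uses.

```shell
# list_gs_on_path: print every executable named "gs" found on PATH,
# in lookup order. The first line printed is the one a bare "gs" call runs.
list_gs_on_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -x "$dir/gs" ]; then
            printf '%s\n' "$dir/gs"
        fi
    done
    IFS=$old_ifs
}
```

Running gs --version afterwards shows the version of the first match, and convert -list delegate (magick -list delegate on ImageMagick 7) prints the delegate commands ImageMagick is configured with, including the ones that call gs.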

I just made the PDF file that I re-uploaded available on our website. Here’s a link - http://digitalarchives.sjc.edu/. It is the first item listed under “Recently Added Items” and the item name is [Untitled]. This was a file that I had loaded successfully before. I have not changed anything in my process of creating PDFs and loading them to Omeka.

Okay, that PDF does indeed have its images encoded as JPEG2000.

I was able to upload it and create thumbnails with no problem, so it’s almost definitely an issue with the Ghostscript install on your server.
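For anyone wanting to check their own files, a rough way to tell whether a PDF has JPEG2000-encoded images is to look for the JPXDecode filter name in the file. This is a heuristic sketch, not a real PDF parser: pdf_has_jpx is a made-up helper name, and it assumes a grep that supports -a (GNU and BSD grep both do).

```shell
# pdf_has_jpx FILE
# Exit status 0 if the PDF mentions the /JPXDecode filter (JPEG2000 images),
# non-zero otherwise. Heuristic only: it scans raw bytes, it does not parse the PDF.
pdf_has_jpx() {
    # -a: treat the binary file as text; -q: no output, just the exit status.
    grep -a -q '/JPXDecode' "$1"
}
```

Usage would look like: pdf_has_jpx problem.pdf && echo "has JPEG2000 images". A match means the thumbnail depends on Ghostscript’s JPEG2000 support working.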

This has me stumped. I had kept the updated Ghostscript at /usr/bin so no path changes should be necessary. Out of curiosity what is your server environment in terms of distro, PHP/MySQL versions, and ghostscript/ImageMagick versions?

My personal server I linked to is Gentoo, so it’s not exactly a template to work from for you I’m sure. It has Ghostscript 9.21 and ImageMagick 7.0.7-19, both from the package manager.

That said, I also tried it out on another, more normal server, which also had no problem and is just running CentOS 6.9 (with the Ghostscript 9.22 binary from ghostscript.com, though). The ImageMagick is actually an older one than yours (6.7.2-7), but it’s the CentOS stock package.

PHP and MySQL really shouldn’t matter at all, but they’re also significantly different between various places where this is working fine.

I’m still fairly confident in my diagnosis that it’s just Ghostscript that’s the relevant piece of the puzzle here. I probably can’t say much more without being on the server in question.

This was the issue. Something is wrong with both the stock version of gs on CentOS 7 and the pre-built binary I replaced it with, but if you build from source it works (I suspect it may have something to do with openjpeg2).

@kupke You’re good to go, tested in your account and it’s working.

That’s interesting… I know I’ve been able to use their binary in the past on CentOS servers, but perhaps something has changed…

At any rate, glad to hear you tracked it down. There are a lot of layers involved for something seemingly so simple, before you even get to Omeka.

I’m glad this problem was resolved. I just did my own test as well, and it does indeed generate the thumbnails correctly now.

We’re having the same issue, I believe. We’re working on a VPS running CentOS 7 and only started to run into problems recently. The thumbnails for the PDFs are generating the OCR’ed text rather than a JPG image. We have Ghostscript 9.07 running. I don’t really understand how to fix this based on this thread. Can you explain what/where/how to “build from source”? Must this be done at the command line, or can I use the Plesk panel I typically use?

It does require the use of the command line. Install instructions are at https://www.ghostscript.com/doc/9.21/Install.htm#Install_Unix but essentially building from source means downloading the copy you need, navigating to the folder you downloaded/extracted it to, and running:

./configure
make
make install

I don’t use Plesk but it’s unlikely they would have a more automated way of doing this.

Thanks for the response and help! Just a quick question - which ghostscript would I install exactly? I see a list here:

https://www.ghostscript.com/download.html

Not sure which one I need to choose. So I guess I’d download it, upload it to the server, then log in at the command line and do as you said above. I have no idea where it resides on the server currently, so I’ll have to dig around.

It would be the first one, vanilla Ghostscript. Here’s the direct link to the latest source version https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs925/ghostscript-9.25.tar.gz
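Put together with the three commands from earlier, the whole thing could look roughly like the sketch below. The helper names (run, build_ghostscript) and the DRY_RUN switch are made up for this example, and it assumes curl and tar are available and that the tarball unpacks into a folder named after the file.

```shell
# URL of the 9.25 source tarball, as linked above.
GS_URL="https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs925/ghostscript-9.25.tar.gz"

# run: hypothetical helper. With DRY_RUN set it only prints the command;
# otherwise it executes it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_ghostscript [URL]: download, unpack, and build Ghostscript from source.
# The body is in parentheses so it runs in a subshell and the cd
# does not change the caller's working directory.
build_ghostscript() (
    url="${1:-$GS_URL}"
    tarball="${url##*/}"          # e.g. ghostscript-9.25.tar.gz
    srcdir="${tarball%.tar.gz}"   # assumed unpack folder, e.g. ghostscript-9.25
    run curl -L -O "$url" &&
    run tar -xzf "$tarball" &&
    run cd "$srcdir" &&
    run ./configure &&
    run make &&
    run make install
)
```

Setting DRY_RUN=1 before calling build_ghostscript prints the commands without running anything, which is a safe way to see what would happen first. The final make install step usually needs root privileges.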

Thanks for the quick response, really appreciate it!