Some PDF thumbnails generating incorrectly

Would it make sense that some thumbnails are rendering fine and others not? Because this appears to be happening and I cannot explain why. All the PDFs are produced the same way in ABBYY Reader, as searchable PDFs.

This one is fine:

[image: ok]

These two are not:

[image: garbage]

They should all look kind of like the first image.

Any ideas, anyone? Beyond the ghostscript stuff above… is this an ImageMagick issue? What should I check?

I’d have to defer to the Omeka team, it doesn’t sound like the same issue as the original topic given some thumbnails do render.

Presumably it’s the red text that’s the problem in those 2 incorrect ones? If you have the ability to share them, the actual source PDFs would probably be helpful in determining what’s happening, or at least what should be happening.

(Oh, and I split this into a separate topic since it is a different kind of issue.)

@jflatnes I can either share them with you individually, or you can create a login on our site (maoistlegacy.de) and have a look. For legal reasons I can’t share them publicly anywhere.

The two thumbnails with the red text aren’t rendering at all. They should have the same texture / look as the first thumbnail with the brown pages etc. The red text is in the original PDFs, so that’s irrelevant; it’s a document title that in the original is also red.

Oh, that does sound like it might be the same basic issue then… since the first one is clearly an embedded image in the PDF and the others look like they’re just text being rendered.

You can look at some things on your own related to those PDFs: you can run pdfimages -list to see what format the images embedded in the PDF are in (that command comes from a package usually called pdf-utils or poppler-utils). Typically we’ve seen these kinds of errors with JPEG2000 images only (I believe pdfimages reports that format as jpx). You can also do a test conversion “offline”:

convert "test.pdf[0]" test.jpg

to see if you’re getting any error output from ImageMagick. If you are, the same error may already be in your Omeka log, if you have logging enabled. Running those two commands on a “good” and a “bad” PDF and comparing the results should give you an idea of what’s happening, at least if it’s the kind of problem others have experienced.
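
To make the good-versus-bad comparison a bit quicker, the encoding column of pdfimages -list can be tallied per file. This is only a sketch: it assumes the poppler-utils output layout (two header lines, with the encoding in the ninth column), and enc_summary, good.pdf and bad.pdf are made-up names for illustration.

```shell
# Tally the "enc" column of `pdfimages -list` output read from stdin.
# Assumption: poppler-utils layout, i.e. two header lines and "enc" as field 9.
enc_summary() {
    awk 'NR > 2 && NF >= 9 { count[$9]++ }
         END { for (e in count) print e, count[e] }' | sort
}

# Usage (good.pdf and bad.pdf are placeholder file names):
#   pdfimages -list good.pdf | enc_summary
#   pdfimages -list bad.pdf  | enc_summary
```

If the “bad” PDF shows jpx entries and the “good” one does not, that would point toward the JPEG2000 problem described above.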

Hi John - thanks for the quick response. When I run pdfimages that does seem to be the difference. Most PDFs have all jpegs, but these include several jpx images. Odd as we created them apparently the same way as all the others. Not sure what the heck happened.

In any case: to do the test conversion offline, do you mean I have to extract the JPGs from the PDF and then just convert them all in a command prompt/terminal? How does that work given that they are searchable PDFs?

Or really any solution to this issue would be helpful.

I just meant trying to run convert to see how/if it complains on the “bad” vs “good” PDFs.

A solution would probably be akin to what was described in the other thread: updating to a better version of Ghostscript. That’s basically always been the culprit when there are issues with JPEG2000 in PDFs, in my experience. Depending on your system/distribution it’s possible that there’s simply an update available in the package manager… otherwise we’ve had several people build from source to get a working version. Ghostscript happens to be a relatively easy project to build because they bundle basically all the dependencies.

Otherwise… you can extract the images themselves with pdfimages and convert those… you could manually replace the thumbnails, or upload the images as separate files… neither is really a better option in my opinion than fixing Ghostscript/ImageMagick.

So I’m trying to do this and after doing the Ghostscript make install suddenly I’m getting some weird errors on the Omeka side. I don’t know if it’s connected, I’m lost as to what’s happened!

Omeka has encountered an error

Zend_Http_Client_Adapter_Exception

Unable to Connect to ssl://maoistlegacy.de:443. Error #0: php_network_getaddresses: getaddrinfo failed: System error

exception 'Zend_Http_Client_Adapter_Exception' with message 'Unable to Connect to ssl://maoistlegacy.de:443. Error #0: php_network_getaddresses: getaddrinfo failed: System error' in /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Http/Client/Adapter/Socket.php:235
Stack trace:
#0 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Http/Client.php(1073): Zend_Http_Client_Adapter_Socket->connect('maoistlegacy.de', 443, true)

… etc

This happens when trying to open an item on the admin side, but I can still thankfully view all the items on the public side.

Help! @jflatnes @Daniel_KM @timmmmyboy – any ideas?!

Also, am I supposed to overwrite the old ghostscript files with the new ones? Right now there are two folders: ghostscript/gs and the latest version I just installed. Maybe there’s some weird conflict going on there?

I’m also getting this error when trying to connect to Scripto…

That error doesn’t really look like it’s related… it’s about your server not being able to connect over HTTPS to itself… presumably to connect to MediaWiki for Scripto purposes. I’m not sure what the issue is but I don’t think it’s really happening at the Omeka level… the error indicates some kind of problem with DNS resolution on your system.

As for Ghostscript… it really depends what you did when installing it. A “default” install should have gone to /usr/local so you’d have /usr/local/bin/gs . Generally I’d suggest keeping things that way (installing separately rather than overwriting). If you do install to a location like that, you need to edit ImageMagick’s delegates.xml file to tell it where to find your copy of gs (basically, replacing instances of just gs in the file with the full path to your desired version is the simplest option).
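
The delegates.xml edit could be sketched roughly as below. Treat it as an illustration only: rewrite_gs_path is a made-up helper, the paths in the usage comments are examples, and it assumes the Ghostscript commands in your delegates.xml are written as &quot;gs&quot;, which differs between ImageMagick versions. It prints the modified file to standard output rather than changing anything in place, so the result can be checked first.

```shell
# Print a copy of a delegates.xml file with the bare "gs" command replaced
# by a full path. Nothing is modified in place.
# Assumption: the commands are written as &quot;gs&quot; in the file.
rewrite_gs_path() {
    delegates_file=$1   # path to delegates.xml
    gs_path=$2          # full path to the Ghostscript binary to use
    # \& keeps sed from treating & as "the matched text"
    sed "s|&quot;gs&quot;|\&quot;${gs_path}\&quot;|g" "$delegates_file"
}

# Usage (example paths, adjust to your system):
#   convert -list delegate | head -n 3    # typically shows a "Path:" line for the file in use
#   rewrite_gs_path /etc/ImageMagick-6/delegates.xml /usr/local/bin/gs > delegates.new.xml
#   diff /etc/ImageMagick-6/delegates.xml delegates.new.xml
```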

I didn’t think it would be related either, but I can’t figure out why it would have occurred right after I installed the latest version of Ghostscript. It doesn’t make much sense; everything was fine this morning when I went in to look at the items. I cannot access them anymore from within the admin panel (any of them, not just some). I don’t know why the DNS issue would appear when I try to choose an item (not just Scripto/MediaWiki).

I’m considering just reverting to the backup I made 20 minutes before trying to install ghostscript.

You should have more to that error message which should tell you where the call is originating from, i.e. the rest of the traceback. But, I’d imagine it’s still Scripto… unless you have something else set up to make connections to your own server like that. It may just be checking status or something similar. Of course the problem probably doesn’t have anything to do with Scripto per se, it’s just that it happens to be something that will do a DNS lookup.

It’s also possible that simply restarting/reloading Apache/PHP/the server will resolve the problem.

This is the rest:

exception 'Zend_Http_Client_Adapter_Exception' with message 'Unable to Connect to ssl://maoistlegacy.de:443. Error #0: php_network_getaddresses: getaddrinfo failed: System error' in /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Http/Client/Adapter/Socket.php:235
Stack trace:
#0 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Http/Client.php(1073): Zend_Http_Client_Adapter_Socket->connect('maoistlegacy.de', 443, true)
#1 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/libraries/Scripto/Service/MediaWiki.php(666): Zend_Http_Client->request('POST')
#2 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/libraries/Scripto/Service/MediaWiki.php(465): Scripto_Service_MediaWiki->_request('query', Array)
#3 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/libraries/Scripto/Service/MediaWiki.php(160): Scripto_Service_MediaWiki->query(Array)
#4 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/libraries/Scripto.php(219): Scripto_Service_MediaWiki->getUserInfo('groups|rights')
#5 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/libraries/Scripto.php(88): Scripto->setUserInfo()
#6 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/ScriptoPlugin.php(371): Scripto->__construct(Object(ScriptoAdapterOmeka), Array)
#7 /var/www/vhosts/maoistlegacy.de/httpdocs/db/plugins/Scripto/controllers/IndexController.php(39): ScriptoPlugin::getScripto()
#8 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Controller/Action.php(516): Scripto_IndexController->indexAction()
#9 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Controller/Dispatcher/Standard.php(308): Zend_Controller_Action->dispatch('indexAction')
#10 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Controller/Front.php(954): Zend_Controller_Dispatcher_Standard->dispatch(Object(Zend_Controller_Request_Http), Object(Zend_Controller_Response_Http))
#11 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Application/Bootstrap/Bootstrap.php(105): Zend_Controller_Front->dispatch()
#12 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Zend/Application.php(384): Zend_Application_Bootstrap_Bootstrap->run()
#13 /var/www/vhosts/maoistlegacy.de/httpdocs/db/application/libraries/Omeka/Application.php(73): Zend_Application->run()
#14 /var/www/vhosts/maoistlegacy.de/httpdocs/db/admin/index.php(28): Omeka_Application->run()
#15 {main}

@jflatnes you were right: restarting PHP did the trick. How odd…

Re: Ghostscript. I installed it where all the other libraries are, under bin, but didn’t rename it to gs (it’s the default ghostscript with version number). Where would the delegates.xml file be located, do you know?

I honestly can’t find this file at all… it’s definitely installed, I checked the version number and found convert in the /usr/bin location. I did a grep in /usr/bin and on /etc and a few other places, can’t find delegates.xml. Searching via Google not helping. Any insight?

EDIT: found it. Under /etc not /usr/bin

EDIT2: changed the delegates.xml file to replace “gs” with “ghostscript-9.25” (name of the new version) and for some reason it didn’t immediately work. Perhaps I need to restart ImageMagick?

Can you run ghostscript-9.25 on the command line? Like, ghostscript-9.25 -v ?
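
A quick way to check that, sketched here with a made-up helper name (check_cmd), is to ask the shell whether the name resolves on the PATH before trying to run it. The names in the usage comments are the ones mentioned in this thread and may not match what is actually installed.

```shell
# Report whether a command name resolves on the current PATH.
# Returns non-zero when the name cannot be found.
check_cmd() {
    if path=$(command -v "$1"); then
        echo "$1 -> $path"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Usage:
#   check_cmd gs
#   check_cmd ghostscript-9.25
```

If the versioned name is not found, putting the full path to the new binary in delegates.xml instead of the bare name would be the thing to try.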

No, clearly I’ve done something wrong. I followed these instructions but I don’t really know where I’m supposed to put what files.