PDFs no longer uploading

I just got a Plustek scanner that uses ABBYY FineReader 12 Sprint for OCR and now my pdfs no longer upload. When I try to upload them, the server times out and the file does not attach to the item.

I previously scanned with a Kodak scanner and applied OCR with Adobe Acrobat Pro, and I never had a problem uploading. I ran a test again today: a file scanned with the Kodak uploaded without a problem, while the same item scanned with the Plustek didn’t.

I can go back to the Kodak, but it involves two additional steps and 2 or 3x the time.

Has anyone else found this and have you come up with a solution?

Any advice is appreciated,
Nancy

It might be that the files being produced are too large for the browser upload threshold set by your server administrator. If that’s the case, you can use the Dropbox plugin: FTP the files to the server and then attach them to the items.

Hi Sharon,

Thanks for getting back so quickly. That’s the strange part; the pdfs made with ABBYY FineReader 12 Sprint are smaller than the pdfs made with Adobe Acrobat Pro.

I also always use the Dropbox plugin and upload my files with FileZilla, so that’s not the issue here.

Just now, I uploaded a file that was 13 MB (Adobe), which uploaded just fine, yet the file that was 3.5 MB (ABBYY) wouldn’t upload and timed out.

I’ve checked the settings on the Plustek and there isn’t anything that I can find there to be causing this. I’m really leaning toward thinking that there is something with the ABBYY encoding that is the problem.

Thoughts?
Nancy

I’m going to include a link to the ABBYY file if you would like to give it a try. http://berkshireschoolarchives.org/plugins/Dropbox/files/Bulletin_1991-Winter-Spring.pdf

Well, I might have exhausted my ability to be helpful here. But, I was able to successfully attach the sample file you shared to an item in my current test installation (2.4.1) using the browser upload.

I do have my file upload verification disabled, so you might try that. It’s located under Security in the Settings page.

That file worked for me too, both with and without the file upload verification disabled. That’s just with the direct upload, though. It sounds like you’ve been consistently using Dropbox – have you tried doing the direct item creation? That’s the only immediate possible difference I can guess at.

The next step might be to turn on the error messages and see if there’s anything in the logs that sheds light on it.

Hi Sharon and Patrick,

Interesting that both of you could upload it. I have tried a direct upload as well as through the Dropbox plugin. What happens is it will grind away, attempting to create the item for roughly 3 minutes, then this screen appears in the same window:

Patrick - I did follow the directions to turn on the error logs and will email it as it’s quite extensive. Is there an email address I might use to send it along?

Thanks,
Nancy

You can send it to patrickmjchnm@gmail.com, or post it somewhere like gist. It sounds like the log might hold a lot of entries covering a long stretch of time. It might be worth deleting that file, recreating it as an empty file, and then trying the process again to get a smaller file to look at.

Hi Patrick,

I emailed the file to your Gmail account.

Thanks,
Nancy

So, when I try to upload PDFs OCR’d with ABBYY FineReader 12 Sprint, my site times out and doesn’t add the file to the item. My site will upload large PDF files OCR’d with Adobe but not ABBYY.

Patrick and Sharon were both able to upload the ABBYY file which makes me think there is something encoded on the ABBYY PDF that my server or instance of Omeka isn’t able to handle.

Here is the error log; I’m wondering if anyone else has seen a problem like this and, if so, do you use DreamHost for server space? The error log seems to indicate that my site is trying to delete the file either during the timeout or as it’s timing out.

2016-07-08T08:13:14-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/2f6f203e8089a5c5c008bac9910c8165.pdf’.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/d90788e404d4d5338d729493f3a8f59e.pdf’.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/69565974672ef8b1d09ad7396b3cfc34.pdf’.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/ceff62b3f04391b235c5579aa5890b22.pdf’.
2016-07-08T08:19:45-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/11c0d5923649468c429727fee09cd561.pdf’.

Any help is appreciated,
Nancy
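
For anyone else reading a log like this, here is a rough sketch (not part of Omeka) that pulls the timestamps and file paths out of these "missing file" warnings and counts them per minute. The function names `missing_files` and `warnings_per_minute` are made up for illustration, and the pattern assumes the log lines look exactly like the five quoted above.

```python
import re
from collections import Counter

# Matches WARN lines shaped like the ones quoted in this thread.
# Accepts either curly or straight single quotes around the path.
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+\(\d+\):\s+"
    r"Omeka_Storage_Adapter_Filesystem: Tried to delete missing file\s+"
    r"[‘'](?P<path>[^’']+)[’']"
)


def missing_files(log_text):
    """Return (timestamp, path) pairs for each 'missing file' warning."""
    results = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            results.append((match.group("ts"), match.group("path")))
    return results


def warnings_per_minute(entries):
    """Count warnings per minute (timestamp truncated to YYYY-MM-DDTHH:MM)."""
    return Counter(ts[:16] for ts, _ in entries)


sample = """\
2016-07-08T08:13:14-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/2f6f203e8089a5c5c008bac9910c8165.pdf’.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/d90788e404d4d5338d729493f3a8f59e.pdf’.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/69565974672ef8b1d09ad7396b3cfc34.pdf’.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/ceff62b3f04391b235c5579aa5890b22.pdf’.
2016-07-08T08:19:45-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file ‘original/11c0d5923649468c429727fee09cd561.pdf’.
"""

entries = missing_files(sample)
per_minute = warnings_per_minute(entries)
for minute, count in sorted(per_minute.items()):
    print(minute, count)
```

Grouping by minute makes it easier to line the warnings up with individual upload attempts, which may help when comparing against PHP or server logs.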

Another possible clue to solve this problem. When I attempted to upload one of the ABBYY PDFs, this error message appeared, but only once. Is there a possible solution there?

Nancy

That message is not surprising, given what we know so far. The surprising part is that it only appears once. I would have expected the same message to appear for all the failed files. It’s basically saying that there’s a problem with how Omeka tries to process the file, which is a corollary to what we’ve seen in the logs (duplicated below for reference):

2016-07-08T08:13:14-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file 'original/2f6f203e8089a5c5c008bac9910c8165.pdf'.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file 'original/d90788e404d4d5338d729493f3a8f59e.pdf'.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file 'original/69565974672ef8b1d09ad7396b3cfc34.pdf'.
2016-07-08T08:19:33-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file 'original/ceff62b3f04391b235c5579aa5890b22.pdf'.
2016-07-08T08:19:45-07:00 WARN (4): Omeka_Storage_Adapter_Filesystem: Tried to delete missing file 'original/11c0d5923649468c429727fee09cd561.pdf'.

Sharon and Patrick,
This issue is really bugging me because there doesn’t seem to be any logical explanation for why it won’t upload. I have looked at the properties of both the Adobe and ABBYY files and the only difference is the version (1.6 Adobe vs 1.5 ABBYY), but that didn’t make a difference because both of you were able to upload the file.

Is it possible that there is a bug in the Seasons theme? It’s an off chance, but is your test instance using Seasons as the theme?

Just a thought,
Nancy
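
For anyone who wants to compare the PDF versions mentioned above without opening each file's properties dialog, here is a minimal sketch that reads the version declared in the file header. The helper names `pdf_header_version` and `pdf_file_version` are hypothetical. It only looks at the `%PDF-x.y` header near the start of the file; a PDF can also declare a newer version in its document catalog, which this sketch does not check.

```python
import re


def pdf_header_version(data):
    """Return the version declared in a PDF header (e.g. '1.5'), or None."""
    # The header normally sits at the very start of the file; searching the
    # first 1 KB tolerates a few stray leading bytes.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def pdf_file_version(path):
    """Read the first 1 KB of the file at `path` and return its header version."""
    with open(path, "rb") as handle:
        return pdf_header_version(handle.read(1024))


# Example with in-memory bytes standing in for two scanned files.
print(pdf_header_version(b"%PDF-1.5\n%\xe2\xe3\xcf\xd3\n"))
print(pdf_header_version(b"%PDF-1.6\n%\xe2\xe3\xcf\xd3\n"))
```

To check real files, something like `pdf_file_version("Bulletin_1991-Winter-Spring.pdf")` should work, assuming the file sits in the current directory.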

Hi Nancy,

There is no computational reason why a theme would prevent a PDF from uploading. They are two completely different systems – one in the processing of content for the content management system, and one for styling and rendering of the pages as they are called by an end user.

–Sharon

I definitely empathize with the frustration of there not seeming to be a logical explanation, but there has to be one somewhere for us to find. So far, what it looks like is somewhere on the server either a file isn’t created when uploaded, or it can’t be transferred to Omeka from the temp storage.

We’ll likely have to call in your server admin, and ask them if there are temporary files being created from those pdfs when you try to upload them. There might be PHP or server logs that they have access to that could provide more info if there is nothing more than the WARNings in the Omeka logs you sent.

From the info so far, it seems like there is something about those pdfs that the server doesn’t like. At least that’s the most promising angle I see so far. Hopefully server and/or PHP logs in addition to the Omeka logs will get us closer.

Thanks for the ongoing support on this. I again contacted the server admin support folks at DreamHost and passed along the request for info on how to access PHP or server logs.

Next installment coming soon!

Nancy

Here is the current error message I’m getting:

exception 'Dropbox_Exception' with message 'The given path is invalid.' in /home/nanflo2/berkshireschoolarchives.org/plugins/Dropbox/helpers/DropboxFunctions.php:73
Stack trace:
#0 /home/nanflo2/berkshireschoolarchives.org/plugins/Dropbox/DropboxPlugin.php(95): dropbox_validate_file('Bulletin_1992-W...')
#1 [internal function]: DropboxPlugin->hookAfterSaveItem(Array)
#2 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Omeka/Plugin/Broker.php(157): call_user_func(Array, Array)
#3 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Omeka/Record/AbstractRecord.php(298): Omeka_Plugin_Broker->callHook('after_save_item', Array)
#4 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Omeka/Record/AbstractRecord.php(550): Omeka_Record_AbstractRecord->runCallbacks('afterSave', Array)
#5 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Omeka/Controller/AbstractActionController.php(229): Omeka_Record_AbstractRecord->save(false)
#6 /home/nanflo2/berkshireschoolarchives.org/application/controllers/ItemsController.php(91): Omeka_Controller_AbstractActionController->editAction()
#7 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Zend/Controller/Action.php(516): ItemsController->editAction()
#8 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Zend/Controller/Dispatcher/Standard.php(308): Zend_Controller_Action->dispatch('editAction')
#9 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Zend/Controller/Front.php(954): Zend_Controller_Dispatcher_Standard->dispatch(Object(Zend_Controller_Request_Http), Object(Zend_Controller_Response_Http))
#10 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Zend/Application/Bootstrap/Bootstrap.php(105): Zend_Controller_Front->dispatch()
#11 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Zend/Application.php(384): Zend_Application_Bootstrap_Bootstrap->run()
#12 /home/nanflo2/berkshireschoolarchives.org/application/libraries/Omeka/Application.php(79): Zend_Application->run()
#13 /home/nanflo2/berkshireschoolarchives.org/admin/index.php(28): Omeka_Application->run()
#14 {main}

Might the answer be in here somewhere? Sorry for the formatting - I couldn’t get the text smaller.

Nancy

This looks like progress, at least as measured by the fact that I’m more confused. :wink:

Somehow it thinks that the path to the dropbox files is invalid, which is definitely weird if you’ve been able to go through the same process with different files.

Could I ask for another copy-paste – this one from the system info page on Omeka’s admin side. There’s a link to it at the bottom right of the admin pages.
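
While waiting on that, one thing that could be checked from the server side is whether the file really sits where the plugin expects it. The sketch below is not the Dropbox plugin's code, and I'm only guessing at the kinds of things a path check might care about: whether the resolved path stays inside the expected folder, whether the file exists, and whether it is readable. The function name `describe_path` is hypothetical, and the demo uses a temporary folder in place of the site's `plugins/Dropbox/files` directory.

```python
import os
import tempfile


def describe_path(base_dir, file_name):
    """Report basic facts about a file expected to live inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, file_name))
    return {
        "resolved": target,
        # True only if the resolved path is still under base_dir.
        "inside_base": os.path.commonpath([base, target]) == base,
        "exists": os.path.isfile(target),
        "readable": os.access(target, os.R_OK),
    }


# Demo in a throwaway directory standing in for the plugin's files folder.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "sample.pdf"), "wb") as handle:
        handle.write(b"%PDF-1.5\n")
    ok_report = describe_path(demo_dir, "sample.pdf")
    missing_report = describe_path(demo_dir, "absent.pdf")
    escape_report = describe_path(demo_dir, "../outside.pdf")

print(ok_report)
print(missing_report)
print(escape_report)
```

If a report for one of the failing PDFs shows it missing or unreadable, that would point toward file permissions or the transfer step rather than the PDF contents.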

No problem. It’s a new system for everybody.

When you’re posting big chunks of log text or code or something like that and you want it to just be shown as text, you can select it in the editor and click the </> (preformatted text) button in the toolbar.

Good to know - thanks!

Okay, I still don’t know why this worked, but after reading about conflicts with certain plugins, I deleted a number of plugins to see if they were causing the upload issues with certain pdfs. I also uninstalled, deleted and re-installed the Dropbox plugin and it works!

I will probably try to add plugins back, one-by-one, to see if the difficulties return. Until then, problem solved (fingers crossed).

Thanks to everyone, esp. Patrick, for all your help and input.

On to the next issue [smile]!

Nancy