PDF Text plugin - No Search Results

Hi all,

We have installed the PDF Text plugin and the pdftotext utility it requires. After telling the plugin to process existing files, I can see the extracted text of every existing PDF on the PDF Text:Text tab of each file attached to an item.

Searching for text contained in that text field yields no results, however. My understanding from reviewing all the previous PDF text posts is that the general search box will include the PDF Text:Text field, while Advanced Search will not.

I have tried switching the Basic Search parameter under Settings between Keyword, Boolean, and Exact Match. No results.
I have even tried turning off Advanced Search, with no success.

What else am I missing?
Thank you for any insight.

David

Hi - did you try re-indexing the records?

Hi Allana, yes I did. Three times.

I will have IT reset the Omeka server today to see if that helps.

Resetting the server did not help. We’re baffled.

I’m sorry this is frustrating. Can you pull up your System Information (link at the bottom of the admin pages) and paste all that information here? There may be another plugin interfering with how search works.

Can you also make a new item, add a file, then manually enter some unique text into the PDF Text:Text field on that file? Then let me know whether that text comes up in your basic searches.

Hi Allana,
No problem, I expect quirks now and then.

I added another new item and attached a small PDF to it. PDF Text extracted the text automatically, of course. I then added a unique phrase of my own and searched for it. No results.

I added a second test item and attached a jpg to it. I then added a unique word to the PDF Text tab, which was empty. Still no results in search.

Both test items appear in search when I execute a basic search for a word found in the item descriptions, just not for words in the PDF Text:Text field.

If you want to see the website itself, go to www.BarryCountyHistoryPortal.org. My test items all have Test in their descriptions.

Thank you for your assistance.


Thanks! What happens when you disable AvantSearch and AvantCommon and try everything again?

According to this:
https://digitalarchive.us/docs/plugins/avantsearch/#differences-from-omeka

AvantSearch doesn’t search file metadata, which is where the pdftext output is stored.

That did it!
I was not really using Avant anyway, I just hadn’t disabled it. And this functionality trumps it anyway.

Thank you for the help. Take care,
David

Allana, can you spot any likely bad actors in my list (attached)?

I have the same problem as the OP. PDF text is being indexed successfully, but I can’t seem to find it by searching. I reindexed as you suggested.

Thanks for any ideas.

Nothing is jumping out at me. Try temporarily disabling as many plugins as you can (other than PDF Text), then run the searches again. Also be sure your site settings are as above, with “Files” checked in the “search record types”.
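
In case it helps to picture why the “Files” checkbox matters, here is a toy sketch in Python. It is not Omeka's actual code: the table layout, the `build_index` and `search` helpers, and the sample rows are all made up for illustration. It only assumes that each indexed record carries a record type, and that a search looks at the enabled types only. Since the PDF text is stored on the file rather than the item, it cannot be found while files are excluded.

```python
import sqlite3


def build_index():
    """Create an in-memory stand-in for a search index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE search_texts (record_type TEXT, record_id INTEGER, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO search_texts VALUES (?, ?, ?)",
        [
            # The item row holds the item description only.
            ("Item", 1, "Test item description"),
            # The file row holds the text extracted from the PDF.
            ("File", 10, "unique phrase extracted from the PDF"),
        ],
    )
    return conn


def search(conn, term, record_types):
    """Return (record_type, record_id) rows containing term, limited to enabled types."""
    if not record_types:
        return []
    placeholders = ", ".join("?" for _ in record_types)
    sql = (
        "SELECT record_type, record_id FROM search_texts "
        f"WHERE record_type IN ({placeholders}) AND text LIKE ? "
        "ORDER BY record_id"
    )
    return conn.execute(sql, [*record_types, f"%{term}%"]).fetchall()


conn = build_index()
print(search(conn, "unique phrase", ["Item"]))          # files not enabled
print(search(conn, "unique phrase", ["Item", "File"]))  # files enabled
```

With only items enabled, the first search comes back empty; once files are included, the same phrase matches the file record. That is the same symptom as a search for PDF text returning nothing while item descriptions are still found.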

Thank you! Listen hard and you will hear me slapping my head. I could have sworn “Files” was checked, but perhaps I forgot to save the settings. Anyway it works now. Sorry to have troubled you.

No, it turns out that the File and Exhibit checkboxes aren’t sticky! That is to say, the Search settings reverted to their defaults between yesterday and today. Today, however, after I set them to what I wanted, they were preserved across a logout/login.

Ideas, anyone?