Universal Viewer + IIIF Search

I’ve been trying to get this working and obviously I’m doing something wrong because it’s just not working!

I have documents with multiple pages. To reduce file sizes, I split the document’s .pdf into multiple files: 1 pdf file for each page of the document.

I’m using Universal Viewer to view the files. I’d like to be able to page through the files like I can through a series of .jpg files, but the paginator icons in UV are greyed out and the pagination bar at the top of the viewer is not visible. Can I page through multiple .pdf files in the Universal Viewer?

I want to be able to search the text of the .pdf files like in this example: Universal Viewer Content Search :: IIIF Technical Workshop

I have fiddled with various configurations and I can’t get the search bar to show up. Please help.

Here’s what I’m using right now:
Omeka S 3.2.0
Universal Viewer 3.6.4.5
Image Server 3.6.10.3
IIIF Server 3.6.6.7
Extract OCR 3.3.2.1
IIIF Search 3.3.2.1

I’m attaching screenshots of my Image Server and IIIF Server config pages, in case they’re useful.

(edited to add IIIF search module version)

@Daniel_KM do you have any recommendations?

The IIIF specifications don’t cover PDF. They handle only three kinds of content: image, audio and video (with 3D planned for the next version). Universal Viewer goes beyond the specifications so that it can display other formats, including PDF. So a PDF is not displayed according to the specifications, but with a PDF viewer, and a search inside a PDF is not done the IIIF way, but as in any PDF.

To be able to search with IIIF, there must be one image for each page. Then the module IIIF Search will let you search inside the viewer. But this module needs the text in a specific and simple format, pdf2xml, which the module ExtractOcr extracts from a PDF attached to the item.
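One way to check whether the setup is producing something searchable is to look at the item's manifest. As far as I know, a IIIF viewer only offers a search box when the manifest declares a Content Search service. The sketch below is only an illustration: the function name and the example.org URLs are made up, and the sample manifest assumes the Presentation API 2 layout, so adapt it to whatever your server actually returns.

```python
# Profile prefix used by the IIIF Content Search API (versions 0 and 1).
SEARCH_PROFILE_PREFIX = "http://iiif.io/api/search/"


def find_search_services(manifest):
    """Return the URLs of Content Search services declared on a manifest dict.

    Hypothetical helper: handles a single service object or a list of them.
    """
    services = manifest.get("service", [])
    if isinstance(services, dict):
        services = [services]
    urls = []
    for service in services:
        if not isinstance(service, dict):
            continue
        profile = str(service.get("profile", ""))
        service_type = str(service.get("type", service.get("@type", "")))
        if profile.startswith(SEARCH_PROFILE_PREFIX) or service_type.startswith("SearchService"):
            url = service.get("@id") or service.get("id")
            if url:
                urls.append(url)
    return urls


# Made-up manifest fragment (assumption: Presentation API 2 style keys).
sample_manifest = {
    "@id": "https://example.org/iiif/123/manifest",
    "@type": "sc:Manifest",
    "service": {
        "@context": "http://iiif.io/api/search/1/context.json",
        "@id": "https://example.org/iiif-search/123",
        "profile": "http://iiif.io/api/search/1/search",
    },
}

print(find_search_services(sample_manifest))
```

If the list comes back empty for your own manifest, the viewer has nothing to search against, which would match a missing search bar.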

So to have search in Universal Viewer, in Mirador or in any other IIIF viewer, you have to:

  • install the modules IiifSearch and ExtractOcr
  • create an item with all the images and the PDF attached
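Once both steps are done, the search itself is a plain HTTP request to the search service with a `q` parameter, and the answer lists annotations that point at regions of the page images. Here is a rough sketch of that exchange, assuming a Content Search API 1.0 style response. The helper names, the example.org URLs and the sample response are invented for illustration and are not taken from the module.

```python
from urllib.parse import urlencode


def build_search_url(service_url, term):
    """Build a Content Search request URL for a search term."""
    return f"{service_url}?{urlencode({'q': term})}"


def hit_regions(annotation_list):
    """Return (canvas_uri, (x, y, w, h)) pairs from a search response dict.

    The box is None when the annotation targets a whole canvas.
    """
    regions = []
    for annotation in annotation_list.get("resources", []):
        target = annotation.get("on", "")
        if not isinstance(target, str):
            continue  # this sketch only handles simple string targets
        canvas, _, fragment = target.partition("#xywh=")
        box = tuple(int(v) for v in fragment.split(",")) if fragment else None
        regions.append((canvas, box))
    return regions


# Made-up response in the shape of a Content Search 1.0 AnnotationList.
sample_response = {
    "@context": "http://iiif.io/api/search/1/context.json",
    "@type": "sc:AnnotationList",
    "resources": [
        {
            "@type": "oa:Annotation",
            "motivation": "sc:painting",
            "resource": {"@type": "cnt:ContentAsText", "chars": "mill"},
            "on": "https://example.org/iiif/123/canvas/p2#xywh=120,340,210,48",
        }
    ],
}

print(build_search_url("https://example.org/iiif-search/123", "water mill"))
print(hit_regions(sample_response))
```

Each hit refers to a canvas, that is, to one page image, which is why the item needs an image for every page and not only the PDF.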

I have improved the module so that it can use ALTO XML as the OCR source for IIIF search, which avoids having to extract the text from a PDF. It will be released soon.