Diacritics not being matched when searching in full text through my pages

Hello everybody,
I’m having problems finding text on the pages of my site when using the search bar for plain text. The diacritics seem to be the problem (the text is in French). For example: Encyclopédie doesn’t match, but when searching with “Encyclopédie” I get the list of pages where the word appears.
What am I doing wrong?

Thank you in advance

Are you saying that it works when you search with quotes around the term, but not without?

I’m not aware of any search issues with the Omeka S search and diacritics and accents. The basic way that text is searched and compared in the database is accent-insensitive, so your term should match with or without the accent on the e.
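
To illustrate what accent-insensitive matching means, here is a minimal Python sketch. It is only an approximation of what a database collation does, not the actual algorithm the database uses, and the helper names `fold_accents` and `accent_insensitive_equal` are made up for this example.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with combining accents removed and case folded.

    Hypothetical helper: it approximates an accent-insensitive,
    case-insensitive comparison key.
    """
    # NFD splits "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def accent_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring accents and case."""
    return fold_accents(a) == fold_accents(b)


print(accent_insensitive_equal("Encyclopédie", "encyclopedie"))
```

Under this kind of comparison, a search term matches whether or not the accent is typed, as long as the stored text contains the real accented character.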

Hello,
I think the problem is the HTML escape characters for diacritics. I checked the fulltext_search table, and the text stored in the text column has all the special characters encoded as HTML escape sequences; for instance, “mise en œuvre” appears as "mise en &oelig;uvre". So I corrected the text in the fulltext_search table, and now the search bar is working.
Nevertheless, I think the problem will come back when I re-run the indexing once I have more pages, and I’ll have to correct the fulltext_search table every time the indexing is done.

Ah, okay, that would make sense.

I think I’ve pinpointed what’s happening here and there’s something we can change in Omeka S to avoid this problem in the future. Can you tell me, do you have HTML Purifier enabled under Settings → Security, or is it disabled?

It is disabled because I was having problems with some tags.

OK. That’s what I expected. There’s a difference in behavior that can cause these escaped characters to be saved in the fulltext search index, and that happens when HTML Purifier is disabled.

We’ll put in a fix to un-escape those characters when saving them to the fulltext table, and also turn off some auto-escaping that the HTML editor is currently doing.
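
As a rough illustration of the un-escaping step (not the actual Omeka S change, which would be written in PHP), here is a Python sketch using the standard library's `html.unescape`. The function name `unescape_for_index` is hypothetical.

```python
import html


def unescape_for_index(text: str) -> str:
    """Decode HTML character references before text is indexed.

    Hypothetical helper: turns entities such as "&oelig;" or "&eacute;"
    back into the characters they stand for, so that the stored text
    can match what a user types in the search bar.
    """
    return html.unescape(text)


escaped = "mise en &oelig;uvre de l'Encyclop&eacute;die"
print(unescape_for_index(escaped))
```

Text that contains no escaped characters passes through unchanged, so applying this step to every page before indexing should be harmless for content that was saved with HTML Purifier enabled.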

That’s great!
Thank you for your help.