Preserving HTML in Omeka Classic import

I’m trying to migrate an Omeka Classic site into Omeka S, and our metadata is full of HTML that needs to be preserved. I’ve tried the Omeka 2 Importer, but I’m now settling on CSV Import in conjunction with the Data Type RDF module instead.

I’ve created my resource templates, using the HTML data type for the fields that will have HTML in them. But when I import, the values get the default text/literal data type instead. If I create a new item in the UI, I can confirm that the HTML data type is the only option for the relevant fields, but the CSV Import module doesn’t seem to respect that.

Is there a way to import items from an Omeka Classic site and preserve the HTML in metadata fields?

After checking the database, I can see that the HTML is being imported correctly. For example, I can find <br> in the values table rather than &lt;br/&gt;, which is what appears on the website. So it seems the data is being stored as-is, but Omeka converts those characters to HTML entities when it’s rendering them for the page.
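
To make the stored-versus-displayed difference concrete, here is a minimal Python sketch. It uses Python’s html.escape as a stand-in for PHP-style escaping at render time; that is an assumption for illustration only, not Omeka’s actual code, and the stored value is made up.

```python
import html

# Hypothetical metadata value as it sits in the database: raw markup.
stored = "First line<br>Second line"

# Escaping at render time turns markup characters into entities,
# so the browser shows the tag as literal text instead of a line break.
rendered = html.escape(stored)

print(stored)
print(rendered)
```

The database row is untouched in this scenario; only the output step changes what the visitor sees.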

I’ve searched the code base for both PHP’s htmlentities() and htmlspecialchars(), and based on what I’ve found it seems more likely that htmlspecialchars() is being used. Maybe this is something a custom module or theme can override?

Progress update (in case anyone needs this in the future): I’m able to get the metadata values to render as HTML by changing the calls on line 53 of application/view/common/resource-values.phtml from $value->asHtml() to $value->__toString().

The proper way to do this is probably in a custom theme where view/common/resource-values.phtml includes this change.

The basic issue here on the import is that the CSV Import module doesn’t look at the settings for templates; you have to pick the data type of an imported value yourself in the importer (it’s one of the options available using the “wrench” icon). However, I’m not sure the data type module you’re using adds its types to those selectable in the importer, so this wouldn’t necessarily help you all that much.

The change you made could cause some issues going forward. For one, you won’t be escaping any values, even those that are not intended to be HTML. For another, any data types that do something different with their HTML display won’t work properly.

If you’re comfortable in the database, you might consider the alternative path of editing the data type stored for the values to use the “HTML” type from the module. Then you wouldn’t have to edit the core.
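
As a rough sketch of that kind of database edit, the Python snippet below runs the update against a toy in-memory SQLite table rather than a real Omeka S database. The table name (value), the column names (property_id, type, value), the type identifier in HTML_TYPE, and the ids in HTML_PROPERTY_IDS are all assumptions for illustration; check them against your own installation, for example by saving one item with the HTML data type in the UI and looking at how its row is stored.

```python
import sqlite3

# Toy stand-in for the real database; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value ("
    "id INTEGER PRIMARY KEY, property_id INTEGER, type TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO value (property_id, type, value) VALUES (?, ?, ?)",
    [
        (4, "literal", "Line one<br>Line two"),  # field that should hold HTML
        (1, "literal", "A plain title"),         # field that should stay literal
    ],
)

HTML_TYPE = "html"        # hypothetical identifier; verify what the module stores
HTML_PROPERTY_IDS = [4]   # hypothetical ids of the properties that hold HTML

# Switch only literal values of the chosen properties to the HTML data type.
placeholders = ",".join("?" for _ in HTML_PROPERTY_IDS)
cur = conn.execute(
    f"UPDATE value SET type = ? "
    f"WHERE type = 'literal' AND property_id IN ({placeholders})",
    [HTML_TYPE, *HTML_PROPERTY_IDS],
)
conn.commit()

changed = cur.rowcount
rows = conn.execute("SELECT property_id, type FROM value ORDER BY id").fetchall()
print(changed, rows)
```

Restricting the update to specific properties leaves every other value escaped as before. Against a real site, it would be wise to back up the database first and run the equivalent UPDATE on a copy.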

Thanks John! I think editing the database is probably the best way forward.
