The Transcript module supports this. Unfortunately, some people are running into a problem with it. The fault may lie in the module itself, or the module may depend on a PHP component that is not installed on every server (the Locale class named in the second thread belongs to PHP's intl extension, which would fit the latter). See these threads:
- Encountered an error while uploading a video to an existing item - Omeka S - Omeka Forum
- Transcript module Class ‘Locale’ error - Omeka S / Modules - Omeka Forum
The module looks good, though: I think it also lets users interact with the WebVTT transcript on screen. An example can be seen here: 2023 - Video de 5ta Caminata ante el Cambio Climático · AREPR