Tuesday, July 16, 2024

EPUB format thoughts

How hard can EPUB files be?

"EPUB is just HTML"! Hah!

I've got a fun EPUB ebook reader in the store; it has two nifty features that, IMHO, all ebook readers should have: it can search Project Gutenberg offline, and it has a two-screen mode, so you can see a critical image and the text that discusses it at the same time.

Over the years, I've had to work around a lot of EPUB failures. Today's failure is thanks to the newer EPUB books that Project Gutenberg publishes.

Notably, each book seems to include a (pointless) <pre/> tag. The problem with that tag is that HTML does not support self-closing syntax on non-void elements: the trailing slash is simply ignored, so <pre/> means exactly the same thing as <pre>. The Mozilla pages are super clear about that.

So my HTML renderer (which is just a WebView element) takes the <pre/> tag and parses it as HTML, that is, as a <pre> tag that is never closed. The entire rest of the book, which is most of it, is then displayed as pre-formatted text. Because pre-formatted lines don't wrap (that's the point of the <pre> tag), the rest of the reading experience is mostly ruined.
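
One possible workaround is to rewrite each chapter's markup before handing it to the WebView, expanding self-closing non-void tags into explicit open/close pairs. Here is a minimal Python sketch of that idea. The function name `expand_self_closing` and the regex approach are illustrative assumptions, not what the reader app actually does, and a regex like this can be fooled by attribute values containing `>` or by CDATA sections, so a real XML parser would be the more robust choice.

```python
import re

# Elements that HTML defines as void; <br/> and friends are harmless as-is.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}

# Matches a self-closing tag such as <pre/> or <div class="x" />.
# Group 1 is the tag name, group 2 is the attribute text (possibly empty).
SELF_CLOSING = re.compile(r"<([A-Za-z][A-Za-z0-9:-]*)((?:\s[^<>]*?)?)\s*/>")


def expand_self_closing(xhtml: str) -> str:
    """Turn <pre/> into <pre></pre>, leaving void elements untouched."""

    def fix(match: re.Match) -> str:
        name, attrs = match.group(1), match.group(2)
        if name.lower() in VOID_ELEMENTS:
            return match.group(0)  # e.g. <br/> already behaves correctly
        return f"<{name}{attrs}></{name}>"

    return SELF_CLOSING.sub(fix, xhtml)


if __name__ == "__main__":
    print(expand_self_closing("<p>Before</p><pre/><p>After</p>"))
```

With this in place, an HTML parser sees an empty, properly closed <pre> element, and the paragraphs after it wrap normally again.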

SIGH.