I meant to write part two of this story last Thursday already, but as so often, other more important things got in the way, delaying it until today. Part three, with the final release of the software, is still on too, and planned for the coming days.
In Part one I described the technical endeavours necessary to turn the Kindle into a software platform that could be tweaked to fit our need for a device running in kiosk mode, to be used as an information device in the exhibition Design Real at the Serpentine Gallery. Once the technical feasibility of this was ensured, many design questions had to be solved.
The materials about the 43 objects in the show to be displayed on the devices were collected by around 40 students from 4 different art schools, and simultaneously edited on design-real.com by multiple editors. The editing was ongoing, meaning there was no deadline after which all the materials could be taken from the website once and turned into a form suited to the display on the Kindle. A solution had to be found that allowed this conversion to happen repeatedly over the duration of the show, so the devices in the show would be more or less up to date with the content online.
It is a well-known fact that the Kindle is a device that emulates the printed book. Its eInk display suits this purpose very well, as its low refresh rate is not a big disadvantage when flipping pages, and the very low energy consumption outweighs the slightly slow browsing (in our tests, the Kindle was still running after 24 hours of page flips every 5 seconds). But it is not designed to display long, scrolling webpages such as the pages for each object on design-real.com. The Kindle is a page-based medium, the web is not.
The challenge was to find a simple, automated way to turn each of these 43 webpages into a sequence of nicely laid out single pages. We faced additional issues because we had opted for a two-column layout on the site that was automatically produced by a script.
The first idea of using the browser support for printing failed in oh so many ways, despite the promising-sounding print and page features in CSS. Unfortunately it seems that the support for standard CSS features like page-break-after, page-break-before, page-break-inside, orphans and widows is mediocre at best in all major browsers. With increasing support for web fonts, the idea of using browsers, HTML and CSS as a layout mechanism for print did seem very promising. Too bad that the topic appears to receive so little attention from the browser developers.
After this route was given up, we decided to give Adobe InDesign's XML support a go. The chapter about XML in the book Real World Adobe InDesign CS4 gives a very good introduction to the topic and is available for viewing online. InDesign allows the creation of template files in which different content types can be marked with different tags through a special tool that is part of the InDesign UI. An XML file then needs to be produced that matches this tag structure. Content can be duplicated and repeated, and unused parts of the template can be removed.
What sounds rather simple and straightforward unfortunately was not. After a lot of trial and error we found out that InDesign is very particular about white space, line breaks and paragraph separators. For example, what appears to be a normal line break in the XML file initially exported from InDesign is often in fact a so-called paragraph separator (U+2029), and if the XML used to produce a layout does not strictly follow this, things can get pretty messed up. Another issue was that the images used on the website came from all kinds of sources and had very different DPI settings. There was no way to control at what size the images imported through XML would appear inside their template containers in the document, although these had a defined bounding box. Unfortunately InDesign does not offer a set of options to control content fitting.
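To illustrate the white space issue, here is a minimal sketch in Python of how such an XML file could be generated with paragraphs joined by U+2029 instead of line breaks. This is not the code from our web app: the tag names (object, title, body) and the function build_story_xml are made up for the example, and a real file has to match the tags defined in the InDesign template.

```python
import xml.etree.ElementTree as ET

PARAGRAPH_SEPARATOR = "\u2029"  # Unicode PARAGRAPH SEPARATOR


def build_story_xml(title, paragraphs):
    """Build a small XML document whose body paragraphs are joined by U+2029.

    The tag names used here are hypothetical placeholders.
    """
    root = ET.Element("object")
    ET.SubElement(root, "title").text = " ".join(title.split())
    body = ET.SubElement(root, "body")
    # Collapse stray white space and line breaks inside each paragraph,
    # and drop paragraphs that are empty after cleaning.
    cleaned = [" ".join(p.split()) for p in paragraphs if p.strip()]
    # Join with the paragraph separator rather than "\n".
    body.text = PARAGRAPH_SEPARATOR.join(cleaned)
    # encoding="unicode" returns a str and leaves U+2029 as a literal character.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_story_xml(
        "Broom",
        ["First  paragraph\nwith a line break.", "Second paragraph."],
    )
    print(repr(xml_text))
```

The point of the sketch is only that all white space is normalised in one place before the XML is written, so that no ordinary line break ends up where InDesign expects a paragraph separator.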
To make matters worse, images often were imported too big to fit on the page at all and therefore did not let the content continue to flow. Such images seemed to break the whole document, as nothing after them would flow into pages, and they could not even be selected with the mouse and resized, as they were not visible: they were part of the so-called overset content. A logical solution here appeared to be Adobe's ExtendScript environment, which offers access to objects in the document through scripting. But as these images were not visible, they also did not have bounds that could be modified. Even finding them in the DOM was difficult, for the same reason. In the end, scaling them multiple times using horizontalScale / verticalScale until they eventually appeared in the layout did the trick, as this luckily did not require them to have any dimensions. When they finally did appear, InDesign needed some time to pick up the reflowing of the document, until the next image broke the flow again and the script had to be executed once more.
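The logic of this workaround can be modelled in a few lines of Python. This is only a sketch of the idea, not the ExtendScript we ran: the function name scale_until_fits and the step factor are made up, and the size comparison stands in for the real condition, which was simply whether the image had appeared in the layout (the overset images had no bounds to compare against).

```python
def scale_until_fits(image_size, frame_size, scale_percent=100.0,
                     step=0.9, max_rounds=50):
    """Shrink a scale percentage step by step until the image fits the frame.

    image_size and frame_size are (width, height) in the same unit.
    Returns (final scale in percent, number of shrinking rounds).
    """
    img_w, img_h = image_size
    frame_w, frame_h = frame_size
    rounds = 0
    while rounds < max_rounds:
        shown_w = img_w * scale_percent / 100.0
        shown_h = img_h * scale_percent / 100.0
        # Stand-in for "the image is now visible in the layout".
        if shown_w <= frame_w and shown_h <= frame_h:
            break
        # Apply the same factor to both axes to keep the proportions,
        # as one would with horizontalScale and verticalScale.
        scale_percent *= step
        rounds += 1
    return scale_percent, rounds


if __name__ == "__main__":
    # A 1000 x 500 image in a 400 x 400 frame, halving the scale each round.
    print(scale_until_fits((1000, 500), (400, 400), step=0.5))  # (25.0, 2)
```

With known dimensions the scale could of course be computed directly. The loop only mirrors the situation we were in, where nothing but the scale itself could be changed.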
I mention these details here as I would like to spare anybody else from having to fiddle around with the same problems as we did. Adobe's suite of tools is powerful and some of these open features are very promising, but I have yet to come across one of them that just works out of the box without a major quest for ugly workarounds. The whole Creative Suite seems terribly buggy for such expensive software, and each time I am faced with such a problem, the need for good open source software that can match Adobe's offerings in the design sector becomes more apparent.
So in the end, all oddities of this approach could be worked around, and an extension to the web app was written that formatted each page in the required XML format. The resulting workflow was pretty smooth, except for the repeated execution of the bug-fix script mentioned above. The big advantage of this approach was that it allowed manual tweaks of each layout before the InDesign document was turned into a sequence of JPEG images for the Kindle. Images could be scaled down to allow the layout to better fit a page, or page breaks could be forced to move a title to a new page, etc. For 43 topics / objects and an average of about 10 pages per topic, this process took about 5 hours to complete. This means that a book of about 430 pages was formatted in a hand-assisted but largely automated process. If the problems in the underlying software were fixed, the process could be optimised a lot.
Attached are two examples of the resulting PDFs, showing the pages about the Shipping Container and the Broom. The pages of these PDFs were then turned into sequences of JPEGs at the native resolution of the Kindle DX screen. This format appeared to load much faster than, for example, PNGs. But more about this in the upcoming Part three.
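As a rough illustration of what "native resolution" means for the export, the following Python sketch computes the rasterisation DPI at which a PDF page fills the screen. The screen size of 824 x 1200 pixels is an assumption on my side and is not taken from our export settings, and the function names and the 412 x 600 pt page size are made up for the example.

```python
KINDLE_DX_SCREEN = (824, 1200)  # assumed screen size in pixels, portrait
POINTS_PER_INCH = 72.0          # PDF page sizes are given in points


def raster_dpi(page_size_pt, screen_px=KINDLE_DX_SCREEN):
    """Return the largest DPI at which the whole page still fits the screen."""
    page_w, page_h = page_size_pt
    screen_w, screen_h = screen_px
    return min(screen_w * POINTS_PER_INCH / page_w,
               screen_h * POINTS_PER_INCH / page_h)


def raster_size(page_size_pt, screen_px=KINDLE_DX_SCREEN):
    """Return the pixel size of the page when rendered at raster_dpi()."""
    dpi = raster_dpi(page_size_pt, screen_px)
    return tuple(round(dim * dpi / POINTS_PER_INCH) for dim in page_size_pt)


if __name__ == "__main__":
    page = (412, 600)  # hypothetical page size in points
    print(raster_dpi(page))   # 144.0
    print(raster_size(page))  # (824, 1200)
```

If the page in the layout has the same proportions as the screen, the exported JPEGs match the display pixel for pixel and the device does not need to rescale them.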