[ed. note — I'm posting some lightly edited posts I started a looong while ago but never finished/published. The first revision of this document had a `"last_modified": "2015-04-24T20:15:21.249Z"` and I may have tried to fix up the still-missing conclusion around `2016-08-27T06:52:33.923Z`?]
Is it possible to design excellent cross-platform applications? Is that a reasonable goal?
Let's take a step back, though. Forget cross-platform. The web is its own platform — surely building excellent web applications is a reasonable goal?
What makes the design of web applications challenging? (The design, that is, not the implementation.)
From a design perspective, it's not the CSS box model, it's not browser bugs or lagging features, it's not DOM performance, it's not scrolling or fastclick or offline or anything like that.
Designing the user interaction of an excellent web app is hard, because the web is a cross-device platform. Its own platform, with its own idioms and conventions and user expectations, but accessible on everything from a TTY to the HoloLens.
This is the challenge of designing an excellent web app.
Best practices
More soberly, we should say that designing an excellent web app, one that works excellently whether a device fifty years old or fifty years away happens to browse upon it, is "the ideal". It is still hard today, with sketchpads and whiteboards and Photoshop, to really "mock up" an app while keeping in mind its alternate states and animated transitions and the like. If it's so hard to design an excellent app for a modern browser on a great tablet, now add "also a smooth workflow when loaded by the Line Mode Browser on a typewriter with a serial port", and every browser in between, to the requirements…?!
It is theoretically possible. A tremendously fun exercise. Would do The Web proud.
Yet utterly unrealistic for some apps. It's fairly easy to imagine a feedreader that works well across a century of different browsers on different devices. It's harder to imagine a photo library that works well on a device whose human interface is 35 lights and 25 toggle switches. (But if you have an Altair 8800 you're getting rid of, I'd sure love to try ;-)
Whether the Web encourages it or not, designing a new app for compatibility with a bygone user base is quixotic. Sure, imagine what your site looks like to text mode browsers if it helps you improve your app's machine (and therefore human) accessibility, semantic markup, yadda yadda… but at a certain point you have to stop. Your app cannot cater to every conceivable user, just as it can't cater to every conceivable use case.
So what should you consider when designing a web app?
Input devices
For as many times as I've brought up teletype-era devices, the fact is the web was invented on a NeXT Computer, which had a "1120×832 pixel resolution, four-level grayscale" display. That's not bad! The web has always been about content, and has rarely been expected to deal particularly well with screens much worse than that. So let's assume the output devices through which users experience our app's interface are relatively homogeneous — there's big screens and little screens but they're pretty much… just… screens.
What about input devices? What sorts of input devices are there?
Countless, I'm sure. Let's ignore keyboards because BORING, and ignore sensors because they sort of get used for both human input and ambient-status stuff, which is CONFUSING. What sorts of pointer input should our web app handle?
This is interesting, because pointer interactions are inherently visual [in typical use]. So the input method doesn't just matter from an implementation perspective; it shapes the overall look and feel of our app. App design is of course more than just the visible surface, but that's sure a large part… The interface layout needs to reveal and support all the manipulation opportunities available.
Even just among the pointing devices there's a huge potential range that sees at least occasional modern use. You've got everything from old standbys like the Space Navigator to newer arrivals like Wiimotes and Kinects and the Leap Motion. I appreciate new input devices like these, and enjoy testing them out as I have opportunity, but at the end of the day there's only a few "standards" that end up doing most of the work.
(There's a whole other set of input devices that we can't neglect either, though: those known as "assistive technologies" for those with particular physical needs. Even amongst mouse or stylus users, there's a range of ability that I've glossed over in the summaries below. But generally these devices still fall into the same pattern from your app's perspective.)
Against all the variety, consider these two questions:
- does the pointing device enable direct or indirect manipulation?
- is this manipulation precise or imprecise?
Put in those terms, the following four device families could reasonably represent all pointing devices. If our design supports all of them properly, and via the conventions users expect on the web, our app is well on its way TO EXCELLENCE AND BEYOND. Are you ready?
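(Before we dive in, an implementation aside: browsers eventually grew a rough version of exactly these two questions, via the `pointer` and `hover` interaction media features. A minimal sketch, assuming a browser that implements those Media Queries Level 4 features; treating hover capability as a proxy for indirectness is my own loose framing, not anything the spec promises:)

```ts
// Sketch: ask the browser about the *primary* pointing device.
// "(pointer: fine)" maps onto our precise-vs-imprecise question;
// "(hover: hover)" is a loose proxy for indirect devices, since
// mice and touchpads can hover while fingers on glass cannot.
const precise = window.matchMedia("(pointer: fine)").matches;
const canHover = window.matchMedia("(hover: hover)").matches;

console.log(precise ? "precise pointer" : "imprecise (or no) pointer");
console.log(canHover ? "can hover" : "cannot hover");

// These can change at runtime (say, plugging a mouse into a tablet),
// so listen for changes rather than sniffing once at load:
window.matchMedia("(pointer: fine)").addEventListener("change", (ev) => {
  console.log("primary pointer precision changed:", ev.matches);
});
```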
Mouse
A mouse, trackpad, trackball, eraser head, etc. is an indirect form of manipulation. That is, these kinds of input devices control a pointer that is not only physically separate, but also reacts abstractly to physical motion. Even disregarding the office prank made possible by any regrettable symmetry, the sensor mechanisms and acceleration algorithms allow a divide between the motion of the device, and the motion of the pointer it represents.
I say "allow", because this abstraction is useful. Because of it, the mouse is a precise form of manipulation. A relatively large motion of the mouse may manipulate its cursor around a tiny pixel. Or consider dragging an item on screen, but reaching the edge of the mousepad halfway through — not a problem if you can pick up the mouse and move it with the button still held and the cursor remaining in place. This is a form of precision too. Even the ability to "hover" with the mouse over a target, distinct from clicking it, lends itself to precise manipulation.
Touchscreen
Better known these days as: "multitouch".
Touchscreens offer direct manipulation. Your finger is the pointer for all practical purposes. There's not much abstraction: either your finger is pressing/moving somewhere right above the display, or it's not.
But this manipulation is very imprecise. Being a pointer implies being located at a "point", but the contact between your finger and the screen is actually an "area", roughly elliptical and covering — covering! — a relatively large amount of your target. A touchscreen may have pixel accuracy at a technical level, but it can be difficult to use it effectively (precisely).
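(You can even observe this imprecision from script. A sketch, with the caveat that contact geometry reporting varies by hardware, and plenty of devices only ever report the default:)

```ts
// Sketch: peeking at the contact "area" of a touch.
// PointerEvent.width/height describe the contact geometry in CSS
// pixels, though much hardware reports only the 1×1 default.
document.addEventListener("pointerdown", (ev) => {
  if (ev.pointerType === "touch") {
    console.log(
      `finger covers roughly ${ev.width}×${ev.height}px ` +
      `around (${ev.clientX}, ${ev.clientY})`
    );
  }
});
```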
Stylus
Now a stylus (or pen) is interesting. Though the stylus famously fell out of favor as multitouch became the new mobile idiom, it shines in terms of input characteristics.
With a digitizing screen, the pen's form of manipulation is both direct and precise. It does calligraphy while the touchscreen is daubing on war paint, yet gives a firm handshake where the mouse can only send telegrams. The tip of a stylus covers very little of the screen (if any, given parallax) but leverages the same counter-balanced motor control as writing and using chopsticks and picking out splinters.
There's a broad spectrum of stylus/pen input, from graphics tablets (which of course are indirect) to the old PDA-style resistive screens (which were hardly precise). But a good implementation even allows "mouseover" (hovering) and distinct clicking options with as much control as a mouse, and combines that with tilt and pressure sensing, which brings in the nuances of real writing as well.
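(The Pointer Events API surfaces most of those nuances. A sketch, assuming a browser and digitizer that actually report pen input:)

```ts
// Sketch: reading stylus nuances via Pointer Events.
// pressure is normalized 0..1; tiltX/tiltY are in degrees.
// A pen hovering in range still fires pointermove, with buttons === 0.
document.addEventListener("pointermove", (ev) => {
  if (ev.pointerType !== "pen") return;
  const hovering = ev.buttons === 0; // in range but not touching
  console.log(
    `pen ${hovering ? "hovering" : "down"}`,
    `pressure=${ev.pressure.toFixed(2)}`,
    `tilt=(${ev.tiltX}, ${ev.tiltY})`
  );
});
```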
In terms of direct/indirect and precise/imprecise the stylus would be the clear choice — if only I could remember where I left mine!?
Touchpad
Is there an input device for our final quadrant? One that is indirect but not precise? Indeed there is!
Many trackpads these days are perhaps better called "touchpads", not to be confused with those TouchPads. They are the "graphics tablet" analog to the touchscreen, primarily controlling a traditional "mouse" cursor but also capable of sending any sort of multitouch gesture you can squeeze within their perimeter. Gesturing on a trackpad is about as abstract as could be, but becomes quite natural in its own realm — which in my experience seems to gravitate towards system-level gestures (manipulating window state) more so than multitouch manipulation of visible targets.
Unfortunately, even though there are plenty of use cases, as of this writing these indirect multitouch events are not exposed (as such, in their raw form) to web apps. All the other manipulations above are — through the mouse, touch, and/or pointer event APIs.
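(The closest escape hatch I can offer is a de-facto convention rather than a standard, so take this sketch as an assumption about particular browsers: some of them deliver trackpad pinch gestures as wheel events with ctrlKey set:)

```ts
// Sketch of a non-standard, browser-specific workaround: some
// browsers report trackpad pinch gestures as wheel events with
// ctrlKey set, letting deltaY stand in for a zoom delta.
document.addEventListener("wheel", (ev) => {
  if (!ev.ctrlKey) return;      // plain scrolling, not a pinch
  ev.preventDefault();          // keep the browser from zooming the page
  const zoomDelta = -ev.deltaY; // positive roughly means fingers spreading
  console.log(`pinch-ish zoom: ${zoomDelta}`);
}, { passive: false });         // passive: false so preventDefault works
```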
Semantic input
How, then, do we design for the various forms of input?
First, the bad news: for a web app there is no simple "mouse" vs. "multitouch" distinction to detect and branch on.
But that usually doesn't matter…
Think `keypress`, or `mousedown` vs. `click`: the browser already translates raw, device-specific input into higher-level semantic events.
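(A sketch of what "semantic" buys you, with a hypothetical `#save` button: one `click` handler covers every input device, while the raw events each tie you to one device family at a time:)

```ts
// Sketch: semantic vs. raw input handling.
// The browser synthesizes `click` whether the user clicked a mouse,
// tapped a finger, pressed a pen down, or hit Enter on a focused
// button, so one handler covers them all.
const button = document.querySelector<HTMLButtonElement>("#save")!; // hypothetical

button.addEventListener("click", () => {
  console.log("activated by mouse, touch, pen, or keyboard");
});

// By contrast, the raw events are device-specific:
button.addEventListener("mousedown", () => { /* mouse-ish input only */ });
button.addEventListener("touchstart", () => { /* touchscreens only */ });
button.addEventListener("keypress", () => { /* keyboards only */ });
```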