natevw proudly presents:

a glob of nerdishness

powered by work over time.

New-ities

Nerdly status report commenceth.

Nexus 4

Probably the biggest recent change was selling my iPhone 4 and replacing it with a Nexus 4.

Nexus 4 self-portrait via Pandigital tablet's front facing camera

No, not the cheap old off-brand tablet which it is taking a picture of.

As I found out when I got my Nexus 7 last year, modern Android is an excellent operating system. Seriously. The browsers [yes you have options], the extendable platform, the multitasking and notifications, even the visual design (especially the visual design?!) all put iOS to shame. It came down to: Google creeps me out, Apple pisses me off, and both big evil corporations are snug in bed with our benevolent shadow government (ALL HAIL). So I might as well enjoy a surveillance device with the superior user experience.

Unlike Apple, Google's not afraid of shipping products that do useful things with all the data they collect for the NSA, but I'm still trying to avoid most of those. Plus, many of the core pieces (calendar, contacts, photos) at least in theory will let me connect them to my own services via my own plugins. I'm eager to dabble in that when I find some time.

Wristmap

Speaking of time…

Wristmap showing my current location

To celebrate finally having a pocketable Pebble proxy (my larger Nexus 7 doesn't get cell data and iOS pretty much sucked at all things Pebble), I got my Wristmap watchapp far enough along to share with others. Plenty more improvements to make, but it's already gotten some really encouraging feedback on the Pebble forums. You can read more details and follow along there.

net.ipcalf.com

Thanks to an impromptu lunch discussion about SIP with Lance Stout, I figured out how to extend my barebones public IP address site with a very handy new subdomain!

Local IP address being displayed by net.ipcalf.com

The new net.ipcalf.com subdomain uses WebRTC to find not your public IP address, but your network IP address — i.e. the LAN address you need for doing local testing. So it works great in Chrome and Firefox. Which both work great on my AWESOME NEW PHONE by the way :-P

Apparently it's already been making the rounds within the web security community due to how simply it shows the information disclosure, but in my opinion being able to set up peer-to-peer connections inside a LAN is a really cool feature of WebRTC worth the tradeoff. And not just as a handy web-development tool. (PeerPouch is coming.)
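For the curious, here is a rough sketch of the general WebRTC trick, not the site's actual source. The names candidateAddresses and findLocalAddresses are made up for illustration. The idea is to open a throwaway peer connection and read the address field out of the "host" ICE candidates it gathers. Newer browsers may hand back an obfuscated .local hostname instead of a raw LAN address, so treat the result as best-effort.

```javascript
// Pull the address out of every "host" ICE candidate in an SDP blob or a
// single candidate string. Candidate grammar (ICE):
//   candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function candidateAddresses(sdp) {
  const found = new Set();
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^(?:a=)?candidate:\S+ \d+ \S+ \d+ (\S+) \d+ typ (\S+)/.exec(line);
    if (match && match[2] === "host") found.add(match[1]);
  }
  return [...found];
}

// Browser-only wiring (hypothetical; assumes RTCPeerConnection is available).
// Calls onAddress once per address seen in a gathered candidate.
function findLocalAddresses(onAddress) {
  const pc = new RTCPeerConnection({ iceServers: [] });
  pc.createDataChannel("probe"); // something to negotiate, so candidates get gathered
  pc.onicecandidate = (event) => {
    if (!event.candidate) return pc.close(); // null candidate: gathering finished
    candidateAddresses(event.candidate.candidate).forEach(onAddress);
  };
  pc.createOffer().then((offer) => pc.setLocalDescription(offer));
}
```

In a page you would call something like findLocalAddresses(addr => console.log(addr)). The parsing half is plain string handling and can be exercised anywhere.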

Greenhouse progress

It's not there yet, but the fish-feeding/greenhouse-monitoring equipment I mentioned in my last post is coming along fairly well. Over the weekend I got many of the additional sensors coded up and an improved food spout attached.

Demand feeder spout part amongst electronics prototype

I've got to re-architect the Arduino side of things to avoid some long-blocking sensor reads, and shore up the Raspberry Pi side which has not been terribly reliable since an unfortunate wire-wrap accident involving a toddler. And then get it on the web.

Upcoming: Solar PV install!

This month will be busy though, as the next two weeks' schedule involves making this happen:

Diagram of my upcoming solar system (design by my dad)

There may be a floating webcam involved.


On vacation

On vacation, I…

Worked

I'm very grateful for clients that I can help from wherever I have a bit of occasional internet connectivity.

Bought a cool domain but just redirected it to some other domain for now

'nuff said

Submitted some talks to Cascadia.js

Though it looks like I wasn't the only one. The conference is in conjunction with a CouchDB conference too, so maybe I should keep submitting proposals!

Got "wristmap" mostly working

wristmap showing a B&W map of Richland on the Pebble wristwatch

My "Hello World" app for my Pebble watch is called wristmap and it pretty much just shows a map of where you are thanks to Stamen's Toner tileset. You can zoom with the buttons and that's about it for now. Something small I should be able to ship, though.
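As a sketch of the math behind "show a map of where you are": web map tilesets are commonly addressed by zoom/x/y indices in the usual Web Mercator "slippy map" scheme. Assuming that scheme applies here (I'm not quoting wristmap's code, and tileForLocation is a name invented for this example), finding the tile under a given location looks roughly like this:

```javascript
// Convert a WGS84 longitude/latitude (degrees) to slippy-map tile indices.
// At zoom z the world is a 2^z by 2^z grid of tiles, with y growing southward.
function tileForLocation(lonDeg, latDeg, zoom) {
  const n = 2 ** zoom;
  const latRad = (latDeg * Math.PI) / 180;
  const rawX = Math.floor(((lonDeg + 180) / 360) * n);
  const rawY = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Clamp so edge cases (e.g. longitude exactly 180) stay inside the grid.
  const clamp = (v) => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(rawX), y: clamp(rawY), zoom };
}
```

Zooming in one level doubles both indices, which is why each button press means fetching a fresh set of tiles.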

Also contributed some patches to a libpebble fork that's helping me test httpebble usage via my Mac (instead of my Nexus 7) during development.

patches to libpebble

I suspect in practice there's some optimization that will be needed before deploying this. Right now the watch has to make about 50 HTTP requests per screen refresh which Isn't Ideal™. I've been studying Huffman coding and might like to try my hand at an implementation ;-)

Continued progress on Microstates

…specifically, I'm starting to see signs of life with the node.js module I'm putting together to wrap the Raspberry Pi RF24 library.

Didn't forget Fermata

I have an http client named Fermata that I use all the time but haven't urgently needed to improve much lately. Finally found a bit of time to think through the design of some lingering "nice-to-haves" before calling it a stable 1.0 release.

Fed the fish

Prototype fish feeder in place (photo by Hjon)

Before we left, by which I mean still going at it about 10 minutes "before we left", I was testing sketches and flashing firmware and forwarding ports so that I could start using my greenhouse controller remotely. Thanks to some coding on the road and a willing mechanical engineer friend back in town, we were able to re-connect the parts in place, find RF reception and FEED THE FISH from a thousand miles away. Yay internet!

Relaxed

I got a lot of time in with both immediate and extended family too — whoohoo! In case you were wondering.


Status update

It's been hard work to work hard lately; my timesheet says I still haven't done as much of it as I should have. Freelancing ain't free.

Room to Think

Room to Think's coworking space is now closed, and the formal organization is dissolving. We hadn't found enough demand for full time membership, and the time and energy available for solving our sustainability problem had run out. I spent most of June overthinking the situation, talking with each of the members, helping figure out plans, etc.

The Meetup group is still going strong though, and I'm in a good new [although not to me!] office with several of the same friends I had at the coworking space.

Dust starting to settle in my new office

D3.js book

I quit the book project towards the end of June. I expected it would take a significant amount of time, but I didn't account for just how. Staying up late working on the book was mentally and even emotionally draining, and in conjunction with the Room to Think happenings pretty much destroyed my ability to get paid work done during the day. Still sad to not have my name in print on a topic like D3, but I've been making too many other investments this year and could not afford this one.

I'd written the first third of the first three chapters, let it rest/rot for a month while worrying about more pressing concerns, and am now starting to consider the best way to share something useful out of that effort.

"Projects"

Have been delayed by at least two months, especially the ones that were already several years behind. Expect further delays until I am independently wealthy or irreparably destitute or get settled into some semblance of a productive daily/weekly routine which doesn't include book deadlines.

Life

Is a blessing. The kids have been growing, the garden/fish/list-of-exciting-household-projects has been growing, I have been growing. To what end I don't know, but here's to the flow.


Sandboxing JavaScript in the browser

I don't always run untrusted code in my webpages, but when I do, I prefer it to be sandboxed.

Unfortunately, many people have asked "how is it possible?" but this and this and this and this and this and this and this and this and this … don't have any solid "yes" answers; it seems really hard to sandbox JavaScript code unless it can be run asynchronously. Even then, code may have access to unwanted features.

However, I might have found a way!

Background

There are a number of reasons it is unsafe to run untrusted JavaScript code in the browser. Any code you eval will have:

Most of these have to do with state and scope. A sandbox isolates the state and limits the scope that can be accessed. By combining a few tricks, it seems we can sandbox JavaScript code in the browser to the point where the most evil it can do would be to halt things completely.

evel, a safer eval

Evel Knievel jumping the Snake River in an EVEL.js rocket bike

(Figure 1: mad thanks to Mike West for the rad mascot!)

I've shared evel on GitHub as a drop-in JavaScript library that provides evel() and evel.Function(), which can be used in place of eval() and new Function() respectively. In theory, code run via evel has no access to the DOM, no access to the outside world beyond JavaScript builtins, and no access to your code's variables except those intentionally passed in.

It works by:

  1. Sanitizing the provided source against e.g. escape characters (immediately, which also serves to flag syntax errors at the expected time)
  2. Wrapping source in a "use strict"; environment to eliminate global access via this tricks
  3. Shadowing all non-ES5 globals (each time called!) to eliminate direct access via name
  4. …doing the last two steps using a clean iframe's JS environment to isolate {}.__proto__ effects

Basically instead of returning the provided code directly, we wrap it like this:

function ({{g1}}, {{g2}}, …, {{gN}}) {          // imagine {g1:'document', g2:'XMLHttpRequest', g3:'d3', … }
    "use strict";
    var fn = {{sanitizedSource}};
    return fn.apply(non_window_ctx, original_args);
}
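To make the template above a bit more concrete, here is a stripped-down sketch of just the shadowing and strict-mode steps. This is not evel's real source: compileShadowed is a hypothetical helper, it skips the sanitizing and iframe steps entirely, and on its own it is not a safe sandbox. For instance, any function's .constructor property still leads back to Function, which compiles code in the global scope.

```javascript
// Hypothetical helper: compile `source` (a function expression) so that every
// name listed in `shadowedNames` resolves to undefined inside it.
function compileShadowed(source, shadowedNames) {
  const body = '"use strict"; return (' + source + ');';
  // Each shadowed name becomes a parameter of the wrapper. Calling the
  // wrapper with no arguments leaves all of them undefined, hiding the
  // real globals of the same name from the wrapped code.
  const makeFn = new Function(...shadowedNames, body);
  const fn = makeFn();
  // Strict mode plus an undefined receiver: `this` is not the global object.
  return (...args) => fn.apply(undefined, args);
}
```

Usage would be along the lines of compileShadowed("function (a, b) { return a + b; }", ["document", "XMLHttpRequest"])(2, 3). Note that eval and arguments cannot be shadowed this way, since strict mode rejects them as parameter names.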

Note that all bets are off if the browser doesn't support strict mode, so we check for that and refuse to proceed if support is unavailable.
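One common way to feature-test strict mode is to check what this is inside a plainly-called function. The sketch below shows that test and a guard built on it. Both function names are invented here, and this is not necessarily the exact check evel performs.

```javascript
// In strict mode, a function called without a receiver sees `this` as
// undefined. Engines without strict mode support ignore the directive and
// supply the global object instead.
function supportsStrictMode() {
  "use strict";
  return (function () { return this === undefined; })();
}

// Hypothetical guard: refuse to go any further without strict mode.
function requireStrictMode() {
  if (!supportsStrictMode()) {
    throw new Error("strict mode unavailable; refusing to run untrusted code");
  }
}
```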

There are certainly some caveats.

The biggest known issue is that untrusted code can still do the equivalent of while (true) ; and lock the page up in an infinite loop. I don't see any particular way around this except to arrange your code to call evel from a separate iframe/worker execution context that won't block the main page.

Oh, and because the provided code necessarily runs under strict mode, it may break or throw an exception unexpectedly. Most code should be fine though and of course this doesn't make the sandbox ineffective. So long as you surround the call with a try/catch block, your code is safe.

Or is it?

Everything above is unproven!

I can't think of a way out of its sandbox, but maybe someone else can. Like its namesake, evel is attempting a pretty audacious feat. Will it successfully jump the canyon?

Unlike Evel Knievel's famous jump attempt, which was across an impressively beautiful part of the Snake River, the way evel works is kind of ridiculously ugly. Like the original jump, it may end up parachuting down into the river if any more serious flaws are found in its execution.

Snake River canyon near the attempted derring-do

So before we go and trust our life support/nuclear launch codes/baby seals to it I thought it'd be fun to get some more eyes on it. I've built a "challenge page" that serves as both an example playground and as a call to more thorough investigation of potential vulnerabilities. Please share the demo with anyone you think might be interested.


CouchDB is kinda postmodern

There is a hierarchy of use cases CouchDB fulfills:

The first is handy, but boring. It's great that with CouchDB I don't have to work through a bunch of boilerplate every time I want to spin up an API on top of a datastore. CouchDB is a datastore with a great API already included, but RESTful web middleware is nothing a little code generation or some developer typing practice couldn't already provide above any database. The only thing exciting about giving third-parties access to your database over HTTP is how easy it is to accidentally share too much of said data. (More on this in the context of CouchDB in another post, perhaps?)

The second starts to get interesting. As promoted, the "Syncable Lightweight Event Emitting Persistence" aspect of CouchDB's _changes feed allows clients to maintain an up-to-date mirror of an authoritative data set. This is powerful! I've often wanted "a CouchDB" of offline IETF RFCs, up-to-date Wikipedia articles, realtime Open Street Map edits, weather forecasts, you name it. Having "my own" copy of valuable information and being able to efficiently process secondary views of its contents would be liberating. I'm excited to see how far dat can take this idea.
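As a small illustration of the mirroring idea, here is roughly what a client does with each response from the _changes feed when it is requested with include_docs=true. This is only a sketch: applyChanges and the in-memory Map are stand-ins invented here for whatever local store a real client would use.

```javascript
// Apply one _changes response to a local mirror (a Map of id -> doc).
// Returns the sequence marker to send as ?since= on the next request.
function applyChanges(mirror, response) {
  for (const change of response.results) {
    if (change.deleted) {
      mirror.delete(change.id); // the authoritative copy removed this doc
    } else {
      mirror.set(change.id, change.doc); // new or updated doc
    }
  }
  return response.last_seq;
}
```

Repeating this with the returned sequence value keeps the mirror current without re-downloading the whole dataset.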

Only the last use case makes CouchDB a truly masterless database, however.

What makes CouchDB [almost] unique among databases is not its _changes feed, but its revision trees! Let's call this feature DOZE (a stretch upon DecentraliZed ObjEcts), or perhaps NAP (for Normative-Agnostic Packratification), to be like the other two.

For every document, CouchDB also builds up metadata concerning its history, namely all known revisions ever. This revision metadata is not a simple ordered list, but a tree allowing multiple branches. When CouchDB replicates it uses this revtree lineage to decide whether an incoming change implies a direct modification to the local document, or a divergent "alternate universe" coming of age for said doc. For the "alternate universe" case, the revtree branches end up leading to multiple documents which co-exist simultaneously. DON'T WORRY "DOZE" GETS EVEN MORE PHILOSOPHICAL
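A toy model may help here. Suppose each revision just records its parent (this is an illustrative structure, not CouchDB's on-disk format, and leafRevisions is a name made up for the example). The co-existing "alternate universe" versions are then simply the leaves of the tree:

```javascript
// Given revisions as { rev, parent } records, return the leaf revisions,
// i.e. those that no other revision names as its parent.
function leafRevisions(revisions) {
  const parents = new Set(revisions.map((r) => r.parent).filter(Boolean));
  return revisions.map((r) => r.rev).filter((rev) => !parents.has(rev));
}

// One document whose history forked after its first revision.
const exampleHistory = [
  { rev: "1-a", parent: null },
  { rev: "2-b", parent: "1-a" },
  { rev: "2-c", parent: "1-a" }, // divergent edit made on another node
  { rev: "3-d", parent: "2-b" }
];
```

Here leafRevisions(exampleHistory) yields "2-c" and "3-d": two current versions of the same document id.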

So anyway, most of the documentation, and even CouchDB's API itself, glosses over this important aspect of a given document's existence. When it must be explained the community calls it "conflict resolution", rather than the more accurate "branch management". What you need to understand is that the primary index CouchDB offers is not a key-value store. It is inherently a key-values store.

In most multi-node data architectures, "the truth" must be decided before a particular portion of the system is considered synchronized. The most common arrangement for keeping multiple "client" devices in sync is to treat a "master" server somewhere as the Single Source of Truth, propagating all changes therethru. You could eliminate the central point of failure, but frequently all parties in such an arrangement still agree-to-agree that, in essence, there is One True Dataset; sync is a matter of eliminating deviations from this absolute.

Apple SyncServices requiring conflict resolution

With DOZE, there is no database. Sync becomes a matter of sharing state, not determining absolute truth. There is still an ideal; within a CouchDB "social network" a given key represents the identity of one particular document. All databases will eventually agree that doc._id exists/existed even if they end up with different contents. In practice, usually the document's type and usefulness within an app also remain consistent between all nodes. And with "standard" CouchDB there is even a veneer of complete system consistency — a polite lie of consensus that (based on its generation number or falling back to comparing random hash output) one revision will "win" against the others by default.
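The "polite lie" can be sketched in a few lines. This is a simplification of the default pick as described above: it ignores deleted revisions, and it assumes that the lexicographically greater hash wins a tie, which you should treat as an assumption rather than gospel. Revision ids are taken to look like "3-d" (generation, dash, hash), and both function names are invented for the example.

```javascript
// Split a revision id such as "3-d" into its generation number and hash.
function parseRev(rev) {
  const dash = rev.indexOf("-");
  return { generation: Number(rev.slice(0, dash)), hash: rev.slice(dash + 1) };
}

// Pick a deterministic "winner" among the leaf revisions (needs at least one):
// highest generation first, then greatest hash string as the tie-breaker.
function defaultWinner(leafRevs) {
  return leafRevs.reduce((best, candidate) => {
    const a = parseRev(best);
    const b = parseRev(candidate);
    if (b.generation !== a.generation) {
      return b.generation > a.generation ? candidate : best;
    }
    return b.hash > a.hash ? candidate : best;
  });
}
```

The point is that every node can compute the same answer without talking to any other node, which is all the "consensus" amounts to.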

DOZE doesn't "resolve conflicts". CouchDB's actual replication algorithm simply compares and shares revision trees and the most recent known versions of each known document. This is a different model than people imagine based on what gets indexed locally or returned by a naïve GET request — even though on the surface CouchDB pretends like the conflicting versions don't exist, it is pushing and pulling all the options when it replicates. And they are options: you can continue to update a "losing" revision's branch just the same as a "winning" document's if you care to. Once you realize how winningly CouchDB's replicator treats all current versions of a given document you start to get extra annoyed at those 409 errors an individual database so adamantly throws to its own users.
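If you want to peek behind the curtain, a document GET accepts a conflicts=true query parameter, and the response then lists the non-winning leaf revisions in a _conflicts array. A minimal sketch follows; the helper names, database name, and server address are placeholders of mine.

```javascript
// Build the URL for a GET that also reports conflicting revisions.
function conflictAwareUrl(baseUrl, dbName, docId) {
  return `${baseUrl}/${encodeURIComponent(dbName)}/${encodeURIComponent(docId)}?conflicts=true`;
}

// Given the JSON body of such a response, list every current revision:
// the default "winner" first, then any conflicting leaves.
function currentRevisions(doc) {
  return [doc._rev, ...(doc._conflicts || [])];
}
```

For example, currentRevisions({ _id: "x", _rev: "3-d", _conflicts: ["2-c"] }) gives both "3-d" and "2-c", and each can be fetched individually with the rev query parameter.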

Conflicts happen. They're harder to design an interface around, but this is the only reason CouchDB papers over them. If you want to "do masterless replication" you need to embrace conflicts. The difference between SLEEP and DOZE is the difference between publishing source code and accepting pull requests. SLEEP makes it easy to maintain a mirror, while DOZE makes it easy to maintain a fork. With DOZE you don't need to assume a neutral point of view exists; each database simply accepts only changes it agrees with. In theory, a future version of CouchDB could actually let in and replicate out document revisions that validate_doc_update rejects, without allowing them to "win" locally as far as the current style of indexes and app interfaces go.
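For reference, "accepts only changes it agrees with" is expressed today through a validate_doc_update function stored in a design document. Here is a minimal sketch of one. The title rule is a made-up example of a local definition of "acceptable", and in a real design document the function is stored as an anonymous function string rather than a named declaration.

```javascript
// Reject any incoming revision (local write or replicated) that lacks a
// non-empty string title. Throwing { forbidden: ... } refuses the update.
function validate_doc_update(newDoc, oldDoc, userCtx, secObj) {
  if (newDoc._deleted) {
    return; // let deletions through
  }
  if (typeof newDoc.title !== "string" || newDoc.title === "") {
    throw { forbidden: "documents need a non-empty title" };
  }
}
```

Because the same function runs for replicated revisions, it acts as this database's gatekeeper: another node's idea of a valid document does not automatically become yours.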

In practice, I'm not arguing for an eminently tolerant database. CouchDB tends to assume disk is cheap, but I don't want my SSD wasted on your version of RFC 2616 replacing it with a copy of httpd's SVN repository. My disk, my definition of "normative" whenever possible. It's an interesting idea for allowing dissent "within" a higher-level application, e.g. Wikipedia itself (Ward Cunningham is building that, by the way), but in practice we're probably mostly interested in using CouchDB for offline apps (where the conflicts should indeed be resolved as soon as possible) and the more general DOZE for maintaining corrective patches to otherwise authoritative-but-changing datasets (where the "main server" branch should be left open and updates folded into the locally-patched revision).

In summary, if you're avoiding conflicts, you're probably not using CouchDB to its full potential. I'm as guilty of this as anyone, because developing on top of a single CouchDB database makes conflict resolution easy to "put off". And indeed, you don't have to resolve conflicts in CouchDB and when you do, you don't have to use the arbitrary "winning" revision to do so. Certainly if you're trying to join CouchDB's "masterless replication" ecosystem you'll need to donate some disk space to the cause — to wish there was only one current revision allowed is to wish the data model weren't really decentralized. Don't go back to throwing away data just because it looks old. Track your revtrees, deal with life's inevitable conflicts, and keep postmodernity weird!


