Monday, June 22, 2015

Some random thoughts on webapp performance

A recent discussion got me thinking back in terms of website development, so I thought I'd jot down a few ideas that I'd like to try to implement when I get some free tuits. Maybe they'll spur some creativity in others, or at least you can tell me why I'm insane and this couldn't possibly work.

Server-side vs client-side rendering

Back when node.js was springing up in popularity, a lot of the focus around it had to do with how it uses an asynchronous event loop to manage a lot of connections per process (since Javascript is single-threaded, concurrency is managed cooperatively, much like multitasking worked back in the Windows 3.1 days).  I was confused by this for a few reasons.  Why not use process pooling to run multiple copies of the node server then?  You'd be able to peg each core with a separate process, scale the number of requests you could serve simultaneously, and maximize your hardware use.  Have each listen on a different port, throw a loadbalancer in front, voila!  Maybe they're doing that now, but there was no talk of it at the time.

Anyway, I've gone on a tangent.  For me, the killer use-case for having Javascript on the server is sharing libraries between server and client.  On the server side, your data source would be a database or an API or whatever you're using; on the client side, the server's REST interface would be the data source (or, with CORS set up, the original REST API could be the client's source as well).  This would buy you a few things.

Pre-render the whole page server-side.  

One of the biggest problems with SPAs (Single Page Apps) is that they completely offload rendering to the client.  You send down a minimal set of Javascript code and templates, then the Javascript fires off a bunch of AJAX requests to pull down the data and renders the templates.  This works well on modern laptops and desktops, but on phones, resources are much more constrained.  Not only are you doing all of the rendering on their CPU, you're also hammering the network with all of the AJAX requests needed to pull in data.  Rather than do that, you could use the exact same rendering path as the client side to pre-render the page's full initial state on the server and send only that over the network.  Initial page load times will be much faster, and no more "the page is here, but wait while we load the data" frustration.
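As a minimal sketch of that shared rendering path (the template, data shape, and page skeleton here are invented for illustration):

```javascript
// One template function, shipped verbatim to both server and client.
function renderTodoList(todos) {
    return '<ul>' + todos.map(function (todo) {
        return '<li>' + todo.title + '</li>';
    }).join('') + '</ul>';
}

// Server side only: wrap the pre-rendered component in the full page and
// send the finished HTML down in one response.  Here the data would come
// from your database or backing API rather than a literal.
function renderPage(todos) {
    return '<!DOCTYPE html><html><body>' +
        renderTodoList(todos) +
        '<script src="/app.js"></script></body></html>';
}
```

The client pulls in the same renderTodoList via /app.js and reuses it for the incremental updates described next.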

Render updates client-side.  

Now, here's the rub.  Once the page is downloaded, you don't want to hit the server to re-render the whole thing every time something changes.  That would just be silly.  Since the same rendering paths are available client-side, everything being Javascript and all, you could simply re-render the section of the page affected by the new data received from whatever AJAX request you sent.  Smaller, incremental updates done with a minimum of overhead.
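A minimal sketch of that incremental path, with a plain object standing in for the DOM (all names here are invented; in the browser you'd assign the rendered markup to a real element's innerHTML):

```javascript
var containers = {};  // componentId -> rendered markup (stand-in for DOM nodes)

// The same template function the server used for the initial page render.
function renderTodoCount(data) {
    return '<span>' + data.count + ' items left</span>';
}

// Called when an AJAX response (or a pushed update) arrives: re-render only
// the affected component's subtree, leaving the rest of the page alone.
function onDataReceived(componentId, render, data) {
    containers[componentId] = render(data);
}

onDataReceived('todo-count', renderTodoCount, {count: 3});
// containers['todo-count'] is now '<span>3 items left</span>'
```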

It seems that my ideas were a bit ahead of their time.  I had a conversation with a coworker about this back in 2012 or so, and a recent poll suggests that this sort of dual-rendering path is finally becoming popular.

https://medium.com/javascript-scene/javascript-scene-tech-survey-d2449a529ed

The only advantage left for SPAs is that you can push out all of your content to the edges by using a CDN, but you can't push your REST interface to the edge, so all of those AJAX requests to grab data are still hitting the same limitation.  You could still put all of your Javascript and templates in a CDN and get some of the benefit there, I suppose.

Queued AJAX requests/multipart REST

Maybe something like this exists, but a quick googling hasn't turned up any results for the keywords I thought to look for.  I was thinking more about how to reduce the number of requests required by a website.

For context, at my previous job, we had a UI that was broken down into a bunch of components, and each component was responsible for updating itself via AJAX.  Some did this based on a timer, others based on a publish/subscribe framework that would alert them when their data source had potentially changed.  I wanted to expand this to use websockets to listen for updates coming from the server side as well, in case multiple users were accessing the same account at the same time, but that use-case wasn't deemed common enough to justify the development effort, and the idea sat.  In any case, we were very often firing off a bunch of AJAX GET requests to pull the latest data and re-render sections of the page.

So, given a framework like that, which I imagine isn't all that uncommon, I was thinking: what if REST had a multipart GET request to grab a bunch of resources in a single request?  I haven't fully fleshed the idea out, but similar to how you upload files using multipart/form-data, you could do a multipart GET request.  The request would look something like:
GET /
Content-Type: multipart/rest; boundary=----------------------------12345

------------------------------12345
GET /resource/A
------------------------------12345
GET /resource/B
------------------------------12345--

And the response would be something like:

Content-Type: multipart/rest; boundary=----------------------------12345

------------------------------12345
Status: 200 OK
Content-Type: application/json
Content-Length: 10
Location: /resource/A

{"a": "b"}
------------------------------12345
Status: 404 Not Found
Location: /resource/B
------------------------------12345--
This would avoid the per-request network and HTTP protocol overhead of all the additional requests and let you get a bunch of resources in a single round trip.  Since this would require extending the HTTP spec, something a little easier would be to just have an API endpoint that does it for you:
GET /multiple?resource=/resource/A&resource=/resource/B
And the response would just encapsulate all of the resources in a single body:
Status: 200 OK
Content-Type: application/json

{ "/resource/A": {"a": "b"}, "/resource/B": null }
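A server-side sketch of that aggregating endpoint (getMultiple and lookup are invented names; lookup stands in for however a single resource path gets resolved, returning null on a miss):

```javascript
function getMultiple(paths, lookup) {
    var body = {};
    paths.forEach(function (path) {
        body[path] = lookup(path);  // null when the resource doesn't exist
    });
    return body;
}

// Example with an in-memory store standing in for the real data source:
var store = {'/resource/A': {a: 'b'}};
getMultiple(['/resource/A', '/resource/B'], function (path) {
    return store.hasOwnProperty(path) ? store[path] : null;
});
// → { '/resource/A': {a: 'b'}, '/resource/B': null }
```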
To take advantage of something like this and still keep your development sane, you'd need a construct to queue up your AJAX requests rather than firing them immediately, then fire off a single "multiple" request on a predefined interval, say 100ms.  Something like this (a rough sketch, assuming jQuery and some pub/sub mechanism behind event.fire):

var queue = [];

function get(url) {
    queue.push(url);
}

function parse_multiple(response) {
    for (var url in response) {
        // Hand each resource to whichever component subscribed to it;
        // event.fire is a stand-in for your pub/sub framework.
        event.fire('data-received', url, response[url]);
    }
}

function multiget() {
    if (queue.length === 0) return;  // nothing to fetch this tick
    var query_params = queue.map(function (url) {
        return 'resource=' + encodeURIComponent(url);
    }).join('&');
    queue = [];
    $.get('/multiple?' + query_params, parse_multiple);
}

setInterval(multiget, 100);

Now, every 100ms you fire off a single AJAX request to get everything that needs to be updated rather than firing off a ton of ad-hoc AJAX requests for every individual item on the page.  Seems like that could work, but I haven't yet put it to the test.

On a similar note, to reduce the amount of redrawing the browser does, you could queue up DOM changes and apply them all at once on a timer as well.  That's a little more difficult to pull off, but you could build the new DOM nodes in memory, keep a reference to the node each one is replacing, and cache the pending swaps in an object.  Because of Javascript's single-threaded nature, the browser only repaints between function calls (it has to wait until user code exits before it can run browser code, unless something has changed recently), so you could replace all of the pending nodes in a single function and let the browser redraw the whole thing once after your function exits.  One redraw every 100ms is better than 100 redraws at random intervals, and fast enough that the user wouldn't notice the lag.  I dunno, maybe I'm taking crazy pills, since I haven't seen anyone attempt something like this in the major frameworks (or maybe they did while I wasn't looking).
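A sketch of that coalescing queue, with apply standing in for the real DOM swap (e.g. document.querySelector(selector).outerHTML = html; all names here are made up):

```javascript
var pending = {};  // selector -> newest replacement markup

function queueUpdate(selector, html) {
    pending[selector] = html;  // later updates to the same node win
}

function flushUpdates(apply) {
    var batch = pending;
    pending = {};
    // Every swap happens inside this one function call, so the browser
    // repaints once after it returns instead of once per change.
    Object.keys(batch).forEach(function (selector) {
        apply(selector, batch[selector]);
    });
}

// In the browser: setInterval(function () { flushUpdates(domApply); }, 100);
```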

Ok, good, I wrote those down so I can stop thinking about them and get back to what I'm supposed to be working on.  Hopefully some day I'll find out if they're feasible ideas or not.
