Archive for the ‘backbone.js’ Category

A funny story – featuring UTC offset

March 28, 2013

I don’t normally write about mistakes I make.

You know mistakes! You’re supposed to accept them, learn from them and never talk about them, especially online where your customers can read about them. No! You’re supposed to project an image of self-confidence, invulnerability and super human coding abilities.

Well, this mistake is funny and relatively harmless. I just have to write about it.

I’ve written before about the implementation of our CRM sync service. With every refresh, that sync service sends two things to the server: the timestamp of the last sync (so that only the delta since then is returned) and the UTC offset of the client (provided by the browser). This UTC offset is only reliable as the offset for that session; it should not be reused as the user’s UTC offset in general, because:

  1. The browser’s UTC offset is a naive offset, not a timezone. In particular, it doesn’t know about daylight saving rules (although it does reflect daylight saving if it is in effect at the time of the request)
  2. The user might be travelling and using the service from a hotel in a totally different timezone

Anyways. We only use it in order to associate the dates and times with words like Today, Yesterday, etc. in the current session.
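
The post doesn’t show the client side, so here is a minimal sketch of how such an offset could be obtained in the browser, assuming it comes from the standard Date.prototype.getTimezoneOffset(). The helper names offsetMinutesToHours and currentUtcOffsetHours are made up for illustration.

```javascript
// getTimezoneOffset() returns minutes *behind* UTC, so a UTC+5:30 client reports -330.
// Hypothetical helper: convert that value to a signed offset in hours.
function offsetMinutesToHours(offsetMinutes) {
  // Flip the sign and convert to hours; "|| 0" turns -0 into 0 for UTC itself.
  return (-offsetMinutes / 60) || 0;
}

// Hypothetical helper: the offset in effect right now, for this session only.
function currentUtcOffsetHours() {
  return offsetMinutesToHours(new Date().getTimezoneOffset());
}

console.log(offsetMinutesToHours(-330)); // 5.5  (UTC+5:30)
console.log(offsetMinutesToHours(210));  // -3.5 (UTC-3:30)
console.log(offsetMinutesToHours(0));    // 0    (UTC)
```

Note that the result can be fractional, which matters for the rest of this story.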

On the server, which uses Python by the way, I used to have this naive handling of the UTC offset:


    try:
        utcoffset = int( request.GET.get('utcoffset', 0) )
    except:
        utcoffset = 0

The defaulting to 0 should never happen, as this parameter is always sent and the UTC offset, populated in javascript, should always be available in the browser. The try/except was more of a “just in case”.

So one day, I decided that the try/except didn’t make sense, for the reasons highlighted above. Furthermore, I didn’t want the exception to be swallowed; I wanted to know about it. Django has a nice feature where you get an email every time an exception is thrown and not handled.

So I took that out, and my code now looked like this:


    utcoffset = int( request.GET.get('utcoffset', 0) )

Great, I thought. However, in my ignorance, I totally forgot about half-hour timezones. India, for example, has a UTC offset of 5.5 hours (UTC+5:30), and there are half-hour timezones in Canada and Australia.

I caught this one pretty quickly when someone from India used the website and passed in a 5.5 offset. The parameter arrives as a string, and int() raises a ValueError on a string like '5.5', so that line started throwing and flooding me with emails for every failure. Now this sync service actually polls for updates, so you can imagine I got quite a few emails.

Luckily for me, the fix was straightforward:


    utcoffset = float( request.GET.get('utcoffset', 0) )

Using a float instead of an int.

I suppose the moral of the story is twofold:

  1. Never swallow exceptions: the code might end up doing something you did not intend (using UTC offset 0 for half-hour timezones), and you will not know about it in order to fix it
  2. Learn about timezones
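
To illustrate the first point: the server code above is Python, but the same idea can be sketched in JavaScript. The helper parseUtcOffset is hypothetical, and the range check is my own addition (real-world offsets currently fall between UTC-12 and UTC+14). Rather than defaulting to 0, it fails loudly on anything it cannot interpret.

```javascript
// Hypothetical helper: parse a UTC offset given in hours, failing loudly on bad input
// instead of silently falling back to 0.
function parseUtcOffset(raw) {
  // Number('') and Number(null) are both 0, so reject missing input explicitly.
  if (raw === null || raw === undefined || String(raw).trim() === '') {
    throw new RangeError('Missing UTC offset');
  }
  var hours = Number(raw);
  if (!Number.isFinite(hours)) {
    throw new RangeError('Invalid UTC offset: ' + raw);
  }
  // Assumption: accept only offsets between UTC-12 and UTC+14.
  if (hours < -12 || hours > 14) {
    throw new RangeError('UTC offset out of range: ' + raw);
  }
  return hours;
}

console.log(parseUtcOffset('5.5'));  // 5.5
console.log(parseUtcOffset('-3.5')); // -3.5
```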

Paginate a Backbone.js collection

January 28, 2013

When you have too many results, you have to paginate; we all know that. With backbone.js there are different approaches, depending on whether you hold all your data in the collection or paginate “server-side”, that is, by calling .fetch() on the collection every time you move to a new page, so that essentially only one page is stored at a time.
With the server-side approach you could also hold multiple pages, for example page 1, currentPage - 1, currentPage, currentPage + 1, and the last page, in order to optimize the most common operations: move first, move previous, move next, move last.

In this article, I’m going to tackle a simpler scenario, when all the data is in the collection (in memory). No server round trips will be needed. In a subsequent article, I will build on this, to implement something more advanced. So let’s get started.

My first step was to enhance the collection so it can iterate over the selected page. I’ve added a so-called “partialEach”, which works like .each but only iterates over the given page.

Backbone.Collection.prototype.partialEach = function(offset, maxItemsPerPage, iterator, context) {
	for (var l = this.length; maxItemsPerPage !== 0 && offset < l; offset++) {
		var model = this.at(offset);
		if( model ) {
			iterator.call(context, model, offset, this);
			maxItemsPerPage--;
		}
	}
};

where:
   offset          - the offset within the collection of the element to start from (index of the first element on the page)
   maxItemsPerPage - the number of items per page
   iterator        - the callback, the function to call for each item (this will be used to render the element, or build the DOM or the html string for rendering)
   context         - a context to be passed back to the callback
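
As a quick usage sketch, here is the same loop run against a plain-array stand-in, so that it works without Backbone loaded. The makeCollection helper and the free-function form of partialEach are assumptions for this example only.

```javascript
// Stand-in for a Backbone collection: only `length` and `at()` are needed here.
function makeCollection(items) {
  return {
    length: items.length,
    at: function (index) { return items[index]; }
  };
}

// Same loop as partialEach above, written as a free function for the demo.
function partialEach(collection, offset, maxItemsPerPage, iterator, context) {
  for (var l = collection.length; maxItemsPerPage !== 0 && offset < l; offset++) {
    var model = collection.at(offset);
    if (model) {
      iterator.call(context, model, offset, collection);
      maxItemsPerPage--;
    }
  }
}

var letters = makeCollection(['a', 'b', 'c', 'd', 'e', 'f', 'g']);
var page = [];

// The second page with 3 items per page starts at offset 3.
partialEach(letters, 3, 3, function (model) { page.push(model); });
console.log(page); // [ 'd', 'e', 'f' ]
```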

Now, why do this, instead of a simple for loop, from offset to offset + maxItemsPerPage?
Because simple pagination is generally not good enough. What if the user wants to filter the results and you have to paginate the filtered results? In that case, the for loop (offset to offset + maxItemsPerPage) doesn’t work anymore, as not all the items within that range will match the filter.

To support filters, I have modified the function above like this:

Backbone.Collection.prototype.partialEach = function(offset, maxItemsPerPage, iterator, context) {
	for (var l = this.length; maxItemsPerPage !== 0 && offset < l; offset++) {
		var model = this.at(offset);
		if( model && this.filterFunc( model, offset, this ) ) {
			iterator.call(context, model, offset, this);
			maxItemsPerPage--;
		}
	}
};

A filterFunc is just a function that takes a model (plus its index and the collection) and returns true or false. It has to be set on the Backbone.Collection.prototype in a similar way, and it can then use a filterObject, which you set on each individual collection, holding the details of the actual search.
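
The post doesn’t show filterFunc or the shape of filterObject, so the following is only one possible sketch. It assumes a filterObject of the form { field, contains } and uses plain stand-ins for the collection and its models (with a Backbone-style get(attribute) accessor) so it runs on its own.

```javascript
// Stand-in for a Backbone model: only get(attribute) is needed here.
function makeModel(attributes) {
  return { get: function (name) { return attributes[name]; } };
}

var models = [
  makeModel({ name: 'Anna' }),
  makeModel({ name: 'Bob' }),
  makeModel({ name: 'Hannah' }),
  makeModel({ name: 'Dan' })
];

// Stand-in collection; with Backbone these would live on Backbone.Collection.prototype.
var collection = {
  length: models.length,
  at: function (index) { return models[index]; },
  filterObject: null, // hypothetical shape: { field: 'name', contains: 'an' }
  hasFilter: function () { return this.filterObject !== null; },
  filterFunc: function (model) {
    var filter = this.filterObject;
    if (!filter) { return true; } // no filter set: every model matches
    var value = String(model.get(filter.field) || '').toLowerCase();
    return value.indexOf(filter.contains.toLowerCase()) !== -1;
  }
};

collection.filterObject = { field: 'name', contains: 'an' };

var matches = [];
for (var i = 0; i < collection.length; i++) {
  var model = collection.at(i);
  if (collection.filterFunc(model, i, collection)) { matches.push(model.get('name')); }
}
console.log(matches); // [ 'Anna', 'Hannah', 'Dan' ]
```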

Now, a view that wants the items for a particular page needs to calculate the offset.
So how can a view calculate it?

For the scenario where there are no filters, it’s quite easy:

    offset = pageNumber * itemsPerPage;

with pageNumber ranging from 0 to totalPages - 1.
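
A small worked example of this arithmetic, with hypothetical helper names; totalPages is simply the item count divided by the page size, rounded up.

```javascript
// Offset of the first item on a zero-based page (no filter).
function pageOffset(pageNumber, itemsPerPage) {
  return pageNumber * itemsPerPage;
}

// Number of pages needed to show every item.
function totalPages(totalItems, itemsPerPage) {
  return Math.ceil(totalItems / itemsPerPage);
}

console.log(totalPages(25, 10)); // 3  (pages 0, 1 and 2)
console.log(pageOffset(2, 10));  // 20 (the last page holds items 20..24)
```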

But when you have a filter, it is not that straightforward. For this scenario, I will introduce the concept of a pageCache.
A pageCache stores, for each page, the index at which that page starts. This is the index where the search (filtering) should start from; it doesn’t mean that the item at that index will pass the filter.

So, for the first page, pageCache will have:

   pageCache = { 0 : 0 }

The first page (page number 0) starts from index 0. This will be true for all filters.
Rather than calculate all the others up front, we will be lazy here, for performance reasons, and only populate the pageCache as the user moves through the pages.
Once we have the pageCache, we can calculate the offset as follows:

	getPageOffset: function(){
		if(!this.collection.hasFilter()){
			return this.pageNumber * this.itemsPerPage;
		} else {
			if(!this.pageCache){
				this.pageCache = { 0 : 0 };
				return 0;
			} else {
				return this.pageCache[this.pageNumber];
			}
		}
	}

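This function is optimized for the scenario where the view/collection has no filter: the offset is computed directly.
When there is a filter, it gets the offset from the pageCache, building the pageCache first if it doesn’t exist. Note that the entry for a page only exists once the previous page has been displayed.
With this offset, the view can then call into partialEach to get the items it wants.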

As for the lazy update, as the view iterates over the items, it keeps track of the lastIndex for each page, and then updates the pageCache:

	updatePageCache: function(pageNo, lastIndex){
		if(this.collection.hasFilter()){
			this.pageCache[pageNo + 1] = lastIndex + 1;
		}
	}

where:
        lastIndex is the last index on the page that has just been displayed, therefore the iteration on next page (pageNo + 1) will start from lastIndex + 1
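
To see how these pieces fit together, here is a rough end-to-end sketch using plain stand-ins instead of Backbone. The collection holds the numbers 1 to 10 with a filter that keeps the even ones; showPage is a hypothetical helper, not part of the code above.

```javascript
// Plain-array stand-in for a filtered Backbone collection (assumption for this sketch).
var items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
var collection = {
  length: items.length,
  at: function (index) { return items[index]; },
  hasFilter: function () { return true; },
  filterFunc: function (model) { return model % 2 === 0; }, // keep even numbers
  partialEach: function (offset, maxItemsPerPage, iterator, context) {
    for (var l = this.length; maxItemsPerPage !== 0 && offset < l; offset++) {
      var model = this.at(offset);
      if (model && this.filterFunc(model, offset, this)) {
        iterator.call(context, model, offset, this);
        maxItemsPerPage--;
      }
    }
  }
};

var view = {
  collection: collection,
  pageNumber: 0,
  itemsPerPage: 2,
  pageCache: null,
  getPageOffset: function () {
    if (!this.collection.hasFilter()) { return this.pageNumber * this.itemsPerPage; }
    if (!this.pageCache) { this.pageCache = { 0: 0 }; return 0; }
    return this.pageCache[this.pageNumber];
  },
  updatePageCache: function (pageNo, lastIndex) {
    if (this.collection.hasFilter()) { this.pageCache[pageNo + 1] = lastIndex + 1; }
  },
  // Hypothetical helper: collect one page and record where the next page starts.
  showPage: function (pageNo) {
    this.pageNumber = pageNo;
    var shown = [];
    var lastIndex = -1;
    this.collection.partialEach(this.getPageOffset(), this.itemsPerPage, function (model, index) {
      shown.push(model);
      lastIndex = index;
    });
    if (lastIndex !== -1) { this.updatePageCache(pageNo, lastIndex); }
    return shown;
  }
};

// Pages have to be visited in order so that the cache is filled lazily.
console.log(view.showPage(0)); // [ 2, 4 ]
console.log(view.showPage(1)); // [ 6, 8 ]
console.log(view.showPage(2)); // [ 10 ]
console.log(view.pageCache);   // { '0': 0, '1': 4, '2': 8, '3': 10 }
```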

This is just a rough implementation to give you an idea. There are more exercises left for the reader:

  • pageCache needs to be reset when the filter changes
  • determining whether there are more items that match the filter is not implemented (this needs to be done to know whether to show the Next button or not)
  • how do you deal with new items being inserted into the collection? The pagination technique above recovers on the second pass only (this might be sufficient)
  • how do you optimize operations like Move Last, which would require iterating over the whole list if there is a filter on and the pageCache is not populated?
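
As a starting point for the first two exercises, here is one possible sketch, again on plain stand-ins. The helper names hasMoreMatches and resetPageCache are hypothetical, and scanning ahead for the next match is just one way to decide whether to show the Next button.

```javascript
// Plain-array stand-in for a filtered collection (assumption for this sketch).
var items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
var collection = {
  length: items.length,
  at: function (index) { return items[index]; },
  filterFunc: function (model) { return model % 2 === 0; } // keep even numbers
};

// Hypothetical helper: is there at least one matching item at or after startIndex?
// It stops at the first match, so it rarely scans the whole collection.
function hasMoreMatches(collection, startIndex) {
  for (var i = startIndex; i < collection.length; i++) {
    var model = collection.at(i);
    if (model && collection.filterFunc(model, i, collection)) { return true; }
  }
  return false;
}

// Hypothetical helper: call this whenever the filter changes, because the cached
// page starts are only valid for the filter they were computed with.
function resetPageCache(view) {
  view.pageNumber = 0;
  view.pageCache = { 0: 0 };
}

// With 2 items per page, page 1 ends at index 7, so the next page starts at index 8.
console.log(hasMoreMatches(collection, 8));  // true  (10 is still to come)
console.log(hasMoreMatches(collection, 10)); // false (end of the collection)

var view = { pageNumber: 2, pageCache: { 0: 0, 1: 4, 2: 8 } };
resetPageCache(view);
console.log(view.pageCache); // { '0': 0 }
```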

backbone.js

October 19, 2012

I’ve played a bit (a lot!) with backbone.js recently and it’s a great little framework, I love it. It is so easy to write JavaScript and end up with spaghetti that backbone.js, although very simple, helps quite a bit. It helps by giving a (sort of) MVC structure to your code (a backbone!) and a RESTful API to persist the app to the server.

While I love it, again, it’s very simplistic, and recently I discovered one nasty bug. I tried to convince the developers of backbone.js that it’s a problem, without much success: https://github.com/documentcloud/backbone/issues/1640

The issue is that collections keep a key/value object (a dictionary) mapping ids to models. It is called _byId. It is used by .get(id) on a collection, but it’s also used for internal features, like the detection of duplicate models when you do a collection.add(). It’s fair to say it’s there to optimize the lookup of a model by id in the collection, a very common operation. Conceptually, it can be argued that duplicating the id-to-model relationship data is a recipe for disaster (it is duplicated in _byId and within the model), but I am not a purist myself, so I don’t have a problem with that.

In backbone.js, the id is supposed to represent the id of the object on the server. So an id is a unique identifier across all sessions and across clients and server, and having one means your model is persisted on the server. Contrast this with a cid, which is a client id: it is always populated on models, and is just there for the convenience of being able to refer to objects while they’re not persisted (while they don’t have an id).

Now the problem with _byId is the way it gets updated. When a model is saved, the request goes to the server (via ajax / REST api) and the server persists the model and returns the id. Upon receiving the id, backbone.js automatically updates the model with the id. It also uses a trigger/event on the model to update the collection’s _byId. This is still not a problem.

What is a problem is that the user can turn off all events, by doing a save with { silent : true }. No events will be triggered and the _byId mapping will not be updated.

Now this is a classic example of an internal private data structure (an optimization in this case), _byId, relying on an external public feature (events) which the user of the API can turn on and off. This is a big problem because it affects the consistency of the internal data, and because the error is not detected early: the point of failure is far removed from the root cause. The failures you get are failures to find models within the collection and failures to detect and prevent duplicates from being added to the collection. Needless to say, it is time-consuming to troubleshoot problems like these, and this is exactly what I found.

In the end, due to this problem not being accepted as a problem, I had to fix it on my side. And what’s worse is that I had to put it on the client side and not in backbone.js – because I wanted to avoid branching off and having problems every time I want to upgrade to a new version. So, I had to update the _byId mapping myself, a very ugly hack and one that is bound to fail if _byId semantics change.
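
To make the failure mode concrete, here is a deliberately simplified simulation. The Model and Collection below are tiny stand-ins written for this example, not Backbone’s source; they only mimic the behaviour described above (the id index is updated through a change event, and { silent: true } suppresses that event). The last lines show a manual patch in the spirit of the workaround I ended up with.

```javascript
// Simplified stand-ins, NOT Backbone's source: just enough to show the failure mode.
function Model(attributes) {
  this.attributes = attributes || {};
  this.id = this.attributes.id;
  this.listeners = [];
}
Model.prototype.set = function (attrs, options) {
  var idChanged = 'id' in attrs && attrs.id !== this.id;
  for (var key in attrs) { this.attributes[key] = attrs[key]; }
  if ('id' in attrs) { this.id = attrs.id; }
  // With { silent: true } no event fires, so nobody hears about the new id.
  if (idChanged && !(options && options.silent)) {
    for (var i = 0; i < this.listeners.length; i++) { this.listeners[i](this); }
  }
};

function Collection() {
  this.models = [];
  this._byId = {};
}
Collection.prototype.add = function (model) {
  var self = this;
  this.models.push(model);
  if (model.id != null) { this._byId[model.id] = model; }
  // The id index is kept up to date only through the model's change event.
  model.listeners.push(function (changed) { self._byId[changed.id] = changed; });
};
Collection.prototype.get = function (id) { return this._byId[id]; };

var collection = new Collection();
var loud = new Model({ name: 'loud' });
var quiet = new Model({ name: 'quiet' });
collection.add(loud);
collection.add(quiet);

// Pretend the server answered each save with a freshly assigned id.
loud.set({ id: 1 });
quiet.set({ id: 2 }, { silent: true });

var lookupBeforeFix = collection.get(2);
console.log(collection.get(1) === loud); // true
console.log(lookupBeforeFix);            // undefined: the model is there, but the index is stale

// Manual patch after the silent save (it relies on a private structure, hence "ugly hack").
collection._byId[quiet.id] = quiet;
console.log(collection.get(2) === quiet); // true
```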

backbone.js folks, if you’re reading this, please reconsider and fix this issue 🙂