Archive for the ‘Python’ Category

A funny story – featuring UTC offset

March 28, 2013

I don’t normally write about mistakes I make.

You know mistakes! You’re supposed to accept them, learn from them and never talk about them, especially online where your customers can read about them. No! You’re supposed to project an image of self-confidence, invulnerability and superhuman coding abilities.

Well, this mistake is funny and relatively harmless. I just have to write about it.

I’ve written before about the implementation of our CRM sync service. With every refresh, that sync service sends two things to the server: the timestamp of the last sync (so that only the delta since then is returned) and the UTC offset of the client (provided by the browser). This UTC offset is only reliable for the current session and should not be reused as the user’s UTC offset in general, because:

  1. The browser’s UTC offset is a naive offset, not a timezone. In particular, it knows nothing about daylight saving rules (although it reflects daylight saving if it is in effect at the time of the request)
  2. The user might be travelling and using the service from a hotel in a totally different timezone

Anyways. We only use it in order to associate the dates and times with words like Today, Yesterday, etc. in the current session.

On the server, which uses Python by the way, I used to have this naive handling of the UTC offset.


    try:
        utcoffset = int( request.GET.get('utcoffset', 0) )
    except:
        utcoffset = 0

The defaulting to 0 should never happen, as this parameter is always sent and the UTC offset, populated in JavaScript, should always be available in the browser. The try/except was more of a “just in case”.

So one day I decided that the try/except didn’t make sense, for the reasons highlighted above. Furthermore, I didn’t want the exception to be swallowed; I wanted to know about it. Django has a nice feature where you get an email every time an exception is thrown and not handled.

So I took that out, and my code now looked like this:


    utcoffset = int( request.GET.get('utcoffset', 0) )

Great, I thought. However, in my ignorance, I totally forgot about half-hour timezones. India, for example, has a UTC offset of 5.5 hours (UTC+05:30), and there are half-hour timezones in Canada and Australia.

I caught this one pretty quickly when someone from India used the website and passed in a 5.5 offset. That line started throwing, because int() rejects a string like '5.5', and flooded me with emails for every failure. Now this sync service actually polls for updates, so you can imagine I got quite a few emails.

Luckily for me, the fix was straightforward:


    utcoffset = float( request.GET.get('utcoffset', 0) )

Using a float instead of an int.
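
As a rough sketch of where this could go next (not what the service actually does), the parsing can be pulled into a small helper that still refuses to swallow bad input. The name parse_utcoffset and the plausibility bounds are my own assumptions for illustration; the only thing taken from the fix above is the switch from int() to float().

```python
from datetime import timedelta


def parse_utcoffset(raw):
    """Hypothetical helper: turn a raw 'utcoffset' query value into a timedelta."""
    # float() accepts '5.5' where int() would raise ValueError.
    hours = float(raw)
    # Real-world offsets lie between UTC-12:00 and UTC+14:00; the symmetric
    # bound is an assumption that allows for either sign convention.
    if not -14.0 <= hours <= 14.0:
        raise ValueError("implausible UTC offset: %r" % (raw,))
    return timedelta(hours=hours)


print(parse_utcoffset("5.5"))    # half-hour offset, e.g. India
print(parse_utcoffset("-3.5"))   # e.g. Newfoundland standard time
```

A missing or malformed value raises (TypeError or ValueError) instead of silently becoming 0, which is in line with the first moral below.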

I suppose the moral of the story is twofold:

  1. Never swallow exceptions, as the code might end up doing something you did not intend (using UTC offset 0 for half-hour timezones) and you will not know about it to fix it
  2. Learn about timezones

django url rules and spaces

February 28, 2013

I’ve been caught out by the copy’n’paste programming style where you just copy snippets of code, test them and off you go, without too much thinking. It’s so appealing this copy’n’paste programming style. You’re essentially encapsulating complexity within that snippet, which you trust because you’re copying and pasting from a reference source or from a place in the code where the snippet “proved itself” to work.

The problem is of course that the new context you’re placing it in can be slightly different from the context where you’re copying from. Thus, bugs appear.

Recently I was caught out by a simple Django URL rule which I had copied from the Django reference website. Here’s the rule:

(r'^operationstatus/(?P<status>\w+)$', 'myapp.backbone_view')

This does not match statuses with spaces in them, because \w only matches letters, digits and underscores, e.g.:

http://HOST/operationstatus/Global%20Status

You get a 404.

My quick solution was to change the regex to:

(r'^operationstatus/(?P<status>.*)$', 'myapp.backbone_view')

which works as expected.
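
The difference between the two rules can be checked outside Django with the re module. This is only a sketch: it assumes the URL resolver sees the decoded path (so %20 arrives as a space), and the variable names and the third, stricter pattern are my own additions rather than anything from the Django docs.

```python
import re

# The rule as copied: \w+ accepts only letters, digits and underscores.
word_only = re.compile(r'^operationstatus/(?P<status>\w+)$')
# The quick fix: .* accepts anything, including an empty status and slashes.
anything = re.compile(r'^operationstatus/(?P<status>.*)$')
# A stricter alternative (an assumption, not the fix used above):
# word characters plus spaces only.
words_and_spaces = re.compile(r'^operationstatus/(?P<status>[\w ]+)$')

path = 'operationstatus/Global Status'

print(word_only.match(path))                        # no match, hence the 404
print(anything.match(path).group('status'))
print(words_and_spaces.match(path).group('status'))

# Where the two working patterns differ:
print(anything.match('operationstatus/') is not None)
print(words_and_spaces.match('operationstatus/') is not None)
```

The .* version is the most permissive choice, so the view has to cope with an empty status or one containing a slash.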

Basic Monte Carlo

August 22, 2008

I’ve started looking at Monte Carlo methods recently, just for fun. Monte Carlo is a method that attempts to give a probabilistic solution to complex problems. Its premise is that by using random numbers, you can get an idea of what the solution looks like.
The most basic example that I’ve tried is to calculate the value of PI. The story goes like this…

You take a circle inscribed in a square. Something like a darts board. Imagine now that you throw darts at it randomly. Some of the darts will land within the circle, some others will land in the square but outside the circle. The probabilities of these 2 events are proportional to the areas of the respective regions. Since the circle’s area is πr² and the square’s is (2r)², the fraction of darts landing inside the circle approaches π/4, so PI is roughly 4 times that fraction.

Here’s a good explanation of the theory and the formula:
http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html

A quick and dirty implementation that I came up with in 10 minutes in Python:

import random
from math import sqrt

radius = 100
hits_total = 0
hits_within = 0
decimals = 4
good_enough_pi = 3.1415926535897932384
approximate_pi = 0

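def calculate():
    # Sample a point uniformly in the square [0, radius] x [0, radius].
    # random.uniform is used rather than random.randint because integer
    # grid points bias the estimate away from PI.
    x = random.uniform(0, radius)
    y = random.uniform(0, radius)
    global hits_total, hits_within
    hits_total += 1
    # Count the point if it falls inside the quarter circle.
    if sqrt(x*x + y*y) <= radius:
        hits_within += 1

# Stop once the estimate is within 10 ** -decimals of the reference value.
while(int((approximate_pi - good_enough_pi)
                  * (10 ** decimals)) != 0):
    calculate()
    approximate_pi = 4 * ( float(hits_within) / float(hits_total) )

print("Calculating PI with %d decimals took %d tries" % (decimals, hits_total))
print("PI~=%s" % str(approximate_pi))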

When running this algo, I’ve seen that perhaps 3 out of 4 times it completes really quickly for 3 or 4 decimals. The remaining time it takes ages and looks as if it will never finish, so in that sense it is not reliable. It does get there eventually, but it can take millions of tries.
Not to mention trying to get more decimals.

Here I’m just comparing the approximate PI with a pre-calculated PI, but imagine applying Monte Carlo to problems where you can’t pre-calculate the answer (real problems, in other words). Well, the problem is that it is difficult to know when your result is good enough and you can stop.
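
One common way around this, which is not what my script above does, is to stop based on the statistical error of the estimate itself instead of a known answer. The sketch below assumes a normal approximation to the hit ratio and uses a 95% confidence half-width as the stopping rule; estimate_pi and its parameters are names made up for this example.

```python
import math
import random


def estimate_pi(tolerance=0.01, batch=10000, z=1.96, seed=None):
    """Sample until the ~95% confidence half-width drops below `tolerance`."""
    rng = random.Random(seed)
    hits = 0
    total = 0
    while True:
        # Throw one batch of darts at the unit square.
        for _ in range(batch):
            x = rng.random()
            y = rng.random()
            if x * x + y * y <= 1.0:
                hits += 1
        total += batch
        p = float(hits) / total
        # Standard error of 4*p, treating the hits as a binomial sample.
        half_width = z * 4.0 * math.sqrt(p * (1.0 - p) / total)
        if half_width < tolerance:
            return 4.0 * p, half_width, total


estimate, half_width, tries = estimate_pi(tolerance=0.01, seed=1)
print("PI~=%.4f +/- %.4f after %d tries" % (estimate, half_width, tries))
```

The half-width shrinks with the square root of the number of tries, so each extra decimal costs about 100 times more samples. That is consistent with how quickly the run times blow up when asking for more decimals.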

Maybe you can use another Monte Carlo method to determine the probability of a Monte Carlo method to be reliable and converge quickly. Applying Monte Carlo methods recursively…

As I said, basic stuff…

Python and accidental function overloading

August 21, 2008

Today I discovered one dark side of Python. If you accidentally define two functions with the same name, the last definition wins and the first one is as if it never existed. There is no overloading based on parameters (I was about to say types :-). Python will just call the second function (second as it appears in the source code). If the number of parameters doesn’t match, you will get a runtime error, but if it matches, the wrong function is executed silently.
My accidental overloading resulted from the following sequence of events:

1. I had a function DoOperation
2. At a later date I realized I needed to break that into 2:
DoOperation
DoOperationAck
where DoOperation still does the main processing, but DoOperationAck sends the result or acknowledgement to whoever is interested in the result.
The only problem is that I forgot to rename the second one, so I ended up with 2 DoOperation(s).

While this was a programming defect I’ve introduced, Python surely didn’t make it easy for me to catch that.
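
A stripped-down reproduction of the situation, with made-up function bodies (my real functions did rather more than this):

```python
def do_operation(payload):
    # The original function: does the main processing.
    return "processed %s" % payload


def do_operation(payload):
    # Meant to be do_operation_ack, but the rename was forgotten.
    # This definition silently replaces the one above.
    return "ack sent for %s" % payload


# Both definitions take one parameter, so nothing fails: the call simply
# runs the second body.
print(do_operation("order-1"))
```

A def statement is just an assignment to a name, so the second def rebinds do_operation exactly as a second "x = ..." would rebind x.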

Nasty.

Python, the beauty and the beast

June 26, 2008

I’ve started developing in Python at work about 2-3 months ago. I like Python as it has a very intuitive syntax and it’s very readable (the forced indentation probably helps a lot). It is really easy to write code using Python and the result is much cleaner than in other languages.

That being said …

One thing that catches me out, however, is my C++ inherited reliance on the compiler to catch trivial errors. With Python, I have to be more careful before I run the code and also test the code properly (and I mean every function/branch, etc.) just to make sure that it doesn’t crash because of some uncaught silliness.

One other beginner’s mistake is to confuse global with local variables, a mistake that the interpreter punishes you silently for. For example:

global1 = ''

def ChangeGlobal():
    # This assignment creates a new local variable named global1.
    global1 = 'Hello globals!'

ChangeGlobal()
print(global1)

If you think the snippet above will greet you with a “Hello globals!”, you’d better read that Python book again. What happens here is that within the ChangeGlobal function, the interpreter silently defines a local variable named global1 and assigns it the value ‘Hello globals!’. At the end of the function call, the local is thrown away as the stack unwinds. The global variable global1 is unaffected, as the name inside the function refers to something else entirely.

The fix is simple once you realize what’s going on. You can change the function to this:

def ChangeGlobal():
    global global1
    global1 = 'Hello globals!'

The line “global global1” tells the interpreter to search for a global named global1 instead of instantiating a local named global1.

All simple stuff that won’t power rockets; the only problem is that everything happens silently and you can fall foul of it without knowing. What makes it a bit confusing is that the interpreter allows reads from globals without having to introduce them with the ‘global’ keyword, which creates a false sense of security.

There are tools like PyChecker that help you catch problems, but they won’t catch all of them. And then there’s the discipline required to run the tool regularly. If you use some combination of C++ and Python, the whole task becomes much more difficult.