Archive for August, 2008

base36 unique IDs with Python

While implementing double opt-in and opt-out for mailing lists, I needed “magic tokens”, i.e., strings that are unique for every email address in our databases. A widely used approach is the MD5 hash of a formatted time string, such as the current date with microseconds. Alternatively, the latter seeds a random number generator whose output is then fed to MD5 or SHA1.

As the result is a long integer, it is classically displayed as hexadecimal and stored as such without any further conversion, which is IMHO a waste of space. Why not use all 36 letters and digits?
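
To put a number on the “waste of space”, here is a minimal sketch that counts how many digits a hash of a given bit length needs in base16 and in base36. The helper name digits_needed is made up for this illustration.

```python
def digits_needed(bits, base):
    """Smallest number of digits in `base` that can hold any `bits`-bit value."""
    # The largest such value is 2**bits - 1; count its digits by repeated division.
    n = (1 << bits) - 1
    count = 0
    while n:
        n //= base
        count += 1
    return count

for name, bits in (("MD5 / UUID", 128), ("SHA1", 160)):
    print(name, "hex:", digits_needed(bits, 16), "base36:", digits_needed(bits, 36))
```

A 128-bit value shrinks from 32 to 25 characters and a 160-bit value from 40 to 31, so base36 saves roughly a fifth of the length.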

In Python you can generate a unique ID with:

import uuid
uuid.uuid1()
# UUID('3208c170-743b-11dd-a60f-000e354e9618')
uuid.uuid4()
# UUID('cb7b64ca-068f-4590-9886-cf375d26f796')

Converting each hyphen-separated part from base16 (hexadecimal) to base10 (decimal) is simple:

int('ff', 16)
# 255

Luckily, Aloysio Figueiredo and Kip Bryan have published a one-liner to convert from base10 to any other radix:

def baseN(num,b):
  return ((num == 0) and  "0" ) or ( baseN(num // b, b).lstrip("0") + "0123456789abcdefghijklmnopqrstuvwxyz"[num % b])
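
The one-liner recurses once per digit and, given a negative number, keeps recursing until Python's recursion limit is hit. If that bothers you, an iterative variant could look like the sketch below. The names to_base and ALPHABET are my own and not part of the published one-liner.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(num, base):
    """Encode a non-negative integer in the given base (2 to 36)."""
    if num < 0 or not 2 <= base <= len(ALPHABET):
        raise ValueError("need num >= 0 and 2 <= base <= 36")
    if num == 0:
        return "0"
    digits = []
    while num:
        # divmod yields the remaining quotient and the next (least significant) digit.
        num, rem = divmod(num, base)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

print(to_base(255, 16))           # ff
print(to_base(255, 36))           # 73
print(int(to_base(255, 36), 36))  # 255
```

The last line shows the way back: the built-in int() accepts any radix from 2 to 36, so decoding needs no extra code.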

Putting it together, I got base36 unique IDs for my tokens with this Python code:

import uuid

def baseN(num,b):
  return ((num == 0) and  "0" ) or ( baseN(num // b, b).lstrip("0") + "0123456789abcdefghijklmnopqrstuvwxyz"[num % b])

def uuid1_base36():
    # Convert each hyphen-separated hex group to base36 and rejoin the groups.
    return '-'.join([baseN(int(p, 16), 36) for p in str(uuid.uuid1()).split('-')])
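
Converting group by group keeps the hyphens, so a token can still be up to 33 characters long. As an alternative sketch (not the code I used for the tokens), you could encode the UUID's whole 128-bit integer at once. The function names below are hypothetical, and padding to 25 characters is my own choice to get tokens of equal length.

```python
import uuid

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def uuid_to_base36(u):
    """Encode a UUID's 128-bit integer value as a 25-character base36 string."""
    num = u.int
    digits = []
    while num:
        num, rem = divmod(num, 36)
        digits.append(ALPHABET[rem])
    # 25 base36 digits suffice for 128 bits; pad so every token has the same length.
    return "".join(reversed(digits)).rjust(25, "0")

def base36_to_uuid(token):
    """Inverse of uuid_to_base36: rebuild the UUID from its base36 token."""
    return uuid.UUID(int=int(token, 36))

u = uuid.uuid4()
token = uuid_to_base36(u)
print(token, len(token))
print(base36_to_uuid(token) == u)
```

The fixed length makes the token easy to store in a CHAR(25) column, and the round trip back to a UUID object still works.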

Do you see any benefit from displaying and storing hashes in hexadecimal?

Hanover Linden – cogeneration plant

Today we visited the cogeneration plant in Hanover Linden. It consists of three buildings that are visible from afar, which is why it is sometimes called “Die Drei Warmen Brüder” (the three warm brothers).

Should you ever look for an opportunity to use every single megapixel of your digicam, this kind of location is the one for you.

naming of Chinese domain names – number magic


You have certainly come across Chinese domain names like XinHua’s or 21CN and wondered why there are so many sites whose names consist only of numbers, such as 163.com, 188.com, 888.cn or 888888.cn.

In Chinese, every digit is pronounced as a single syllable, which in context can remind the reader of another word. So “163” could be understood as “making profit on your way”.

Numbers are not only chosen for their quality as homonyms but are also assessed on their “yin and yang” quality. Odd numbers are considered yang (positive) and even numbers yin (negative). Again, a combination of the two is important.

Yet you don’t need to attend a numerology course: the top Chinese websites have names written in Pinyin, and today you can even register domain names in scripts other than Latin. So, get your “German machines” at 德国机器.com, enjoy some movies at tudou.cn, or search for content on baidu.cn.

FieldJoiner Validator for DateTime

I noticed on the FormEncode mailing list that Matthew Wilson asked about a validator that joins two fields into one, so that he can pass date and time input directly to his database. Indeed, why did such a validator not exist? For your convenience, I’ve written it. Use it in the chained_validators section of your validation schema:

from formencode import FancyValidator

class FieldJoiner(FancyValidator):
    """
    Joins fields into one by using the specified delimiter.

    ::

        >>> fj = FieldJoiner('fdatetime', ' ', ('fdate', 'ftime'))
        >>> fj.to_python({'fdate': '2008-08-08', 'ftime': '08:08:08'})
        {'fdatetime': '2008-08-08 08:08:08', 'fdate': '2008-08-08', 'ftime': '08:08:08'}
        >>> fj.to_python({'country': 'DE', 'zip': '30029'})
        {'country': 'DE', 'zip': '30029'}
    """

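    result_fieldname = 'fdatetime'
    delimiter = ', '
    # A tuple of field names; a bare string would be iterated character by character.
    fields_to_be_joined = ('zip',)
    __unpackargs__ = ('result_fieldname', 'delimiter', 'fields_to_be_joined')

    def validate_python(self, fields_dict, state):
        # Collect the values of the configured fields that are actually present.
        jl = [fields_dict[key] for key in self.fields_to_be_joined
              if key in fields_dict]
        # Only add the result field if there was something to join,
        # so unrelated input passes through unchanged (see the docstring).
        if jl:
            fields_dict[self.result_fieldname] = self.delimiter.join(jl)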
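
If you want to try the joining step without FormEncode installed, a plain-function sketch of the same idea could look like this. The helper join_fields is hypothetical and not part of FormEncode. It only writes the result field when at least one source field is present, which is the behaviour the docstring example describes.

```python
from datetime import datetime

def join_fields(fields_dict, result_fieldname, delimiter, fields_to_be_joined):
    """Join the listed fields into one new field (hypothetical helper)."""
    parts = [fields_dict[key] for key in fields_to_be_joined if key in fields_dict]
    if parts:
        fields_dict[result_fieldname] = delimiter.join(parts)
    return fields_dict

form = join_fields({'fdate': '2008-08-08', 'ftime': '08:08:08'},
                   'fdatetime', ' ', ('fdate', 'ftime'))
# The joined string parses into a single datetime value for the database layer.
stamp = datetime.strptime(form['fdatetime'], '%Y-%m-%d %H:%M:%S')
print(stamp)
```

With a single space as delimiter, the joined value is already in a form that datetime.strptime and most databases understand.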
