Integer ID Encoder

A Python module that encodes (usually sequential) integer IDs as short, non-sequential strings and decodes them back.

Algorithm details

The encoder shuffles bits so that consecutive integers do not produce consecutive, predictable-looking values. The algorithm is nonetheless deterministic and reversible, so two different integers never produce the same encoded value.

The encoding alphabet is fully customizable and may contain any number of characters. By default, digits and lower-case letters are used, with some characters removed to avoid confusion between characters like o, O and 0. The default alphabet is shuffled and has a prime number of characters to further improve the results of the algorithm.
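To illustrate the role of the alphabet, here is a minimal sketch of converting an integer to and from a custom base. The function names to_base and from_base are hypothetical and are not taken from the module. The alphabet is the sample one printed later in this README and is for illustration only.

```python
# Illustrative only: a sample 31-character alphabet (31 is prime).
ALPHABET = "6nkqyxc4eabmvswfz8d9j5rhp27gt3u"


def to_base(n, alphabet=ALPHABET):
    """Write a non-negative integer using the alphabet's characters as digits."""
    base = len(alphabet)
    if n == 0:
        return alphabet[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, base)
        chars.append(alphabet[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def from_base(text, alphabet=ALPHABET):
    """Invert to_base(): read the characters back as digits."""
    base = len(alphabet)
    n = 0
    for ch in text:
        n = n * base + alphabet.index(ch)
    return n
```

Because each character stands for exactly one digit value, the conversion is reversible: from_base(to_base(n)) returns n for any non-negative n.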

The block size specifies how many bits are shuffled: the lowest BLOCK_SIZE bits are reversed, and any bits above them are left as is. A BLOCK_SIZE of 0 leaves all bits unaffected, so the algorithm simply converts your integer to a different base.
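The bit reversal described above can be sketched as follows. This is an assumption-based illustration, not the module's actual code: the function name scramble and the block size of 24 used in the usage note are hypothetical.

```python
def scramble(n, block_size):
    """Reverse the lowest block_size bits of n and keep higher bits as is."""
    mask = (1 << block_size) - 1
    low = n & mask       # the bits that get reversed
    high = n & ~mask     # the bits that stay untouched
    reversed_low = 0
    for _ in range(block_size):
        # Move the lowest remaining bit of `low` onto the end of the result.
        reversed_low = (reversed_low << 1) | (low & 1)
        low >>= 1
    return high | reversed_low
```

With a block size of 24, the consecutive inputs 1, 2 and 3 map to 8388608, 4194304 and 12582912, which no longer look sequential. Reversing the same bits twice restores the input, which is why no two integers can collide, and a block size of 0 returns the input unchanged.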

Common usage

The intended use is to feed incrementing, consecutive integers in as keys and get encoded IDs out. For example, to create a new short URL (à la bit.ly), the unique integer ID assigned by a database could be used to generate the last portion of the URL by using this module. A simple counter works just as well. As long as the same integer is not used twice, the same encoded value will not be generated twice.

The module supports both encoding and decoding of values. The min_length parameter allows you to pad the encoded value if you want it to be a specific length.
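One way padding can work without breaking decoding is to left-pad with the alphabet's first character, which acts as a leading zero. The sketch below assumes that approach; the module may pad differently, and the names pad and from_base are hypothetical.

```python
# Illustrative only: the sample alphabet printed later in this README.
ALPHABET = "6nkqyxc4eabmvswfz8d9j5rhp27gt3u"


def pad(encoded, min_length, alphabet=ALPHABET):
    """Left-pad with the zero digit (the alphabet's first character)."""
    return encoded.rjust(min_length, alphabet[0])


def from_base(text, alphabet=ALPHABET):
    """Read the characters back as digits in base len(alphabet)."""
    n = 0
    for ch in text:
        n = n * len(alphabet) + alphabet.index(ch)
    return n


short = "n6"
padded = pad(short, 5)
```

Here padded is "666n6", and both short and padded decode to the same integer because leading zero digits contribute nothing to the value.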

Sample Usage:

>>> import idencoder
>>> x = idencoder.encode(12)
>>> print(x)
LhKA
>>> key = idencoder.decode(x)
>>> print(key)
12

The functions at the top level of the module use the default encoder. Alternatively, you may create your own IdEncoder object and use its encode() and decode() methods.

WARNING

If you use this library as part of a production system, you must generate your own unique alphabet(s). One alphabet per encoded entity type is recommended. Best practice is to configure the alphabet(s) as environment variables (like you do with credentials, right? ;-)) or to use random alphabets that are re-randomized each time your application is initialized. The latter approach will result in different encoded values for the same ID each time your application is initialized, but this may be acceptable.
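Both options above can be combined in a small loader: read the alphabet from an environment variable if it is set, otherwise fall back to a freshly shuffled one. This is a sketch under stated assumptions. The function load_alphabet, the BASE_CHARS constant and any variable name you pass in are hypothetical and not part of the library.

```python
import os
import random

# Assumption: digits and lower-case letters with 0, 1, i, l and o removed.
BASE_CHARS = "23456789abcdefghjkmnpqrstuvwxyz"


def load_alphabet(var_name):
    """Return the alphabet stored in the environment, or a random one."""
    value = os.environ.get(var_name)
    if value:
        # A repeated character would make decoding ambiguous.
        if len(set(value)) != len(value):
            raise ValueError("alphabet in %s has repeated characters" % var_name)
        return value
    # Fallback: re-randomized on every call, so encoded values will not be
    # stable across application restarts.
    chars = list(BASE_CHARS)
    random.SystemRandom().shuffle(chars)
    return "".join(chars)
```

Using one variable per entity type, for example a hypothetical USER_ID_ALPHABET and ORDER_ID_ALPHABET, keeps the encoded values of different entity types unrelated to each other.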

For convenience, the library includes a random_alphabet() function that you can use to easily generate these unique alphabets. One easy way is to use the -r flag from the command line:

$ python idencoder.py -r
Random alphabet: 6nkqyxc4eabmvswfz8d9j5rhp27gt3u

And you can, of course, generate random alphabets programmatically:

>>> import idencoder
>>> alpha = idencoder.random_alphabet()
>>> print(alpha)
c39htkrg5e7mvfn2uwap8sbj6zqdxy4

Provenance

Original Author: Michael Fogleman
License: MIT
URL: https://github.com/brnt/idencoder

Changelog:

2014-05-09 Eric Wald (@eswald)

condensed duplicate bit-scrambling logic
switched to a native padding function
removed recursion and extra division
removed exponentiation and enumeration
removed excess convenience functions

2014-01-13 Brent Thomson (@brnt)

added random_alphabet() function
replaced main() method with a useful one

2013-10-11 Brent Thomson (@brnt)

minor bug fixes
minor code cleanup
renamed some functions to better reflect functionality
updated documentation to reflect function name changes and to better reflect the true nature of the module (it encodes serial integers, not URLs)
