View Blockhash on GitHub: JavaScript | Python | C | Experimental video code

Welcome to Blockhash.

Blockhash is a set of libraries (currently Python and JavaScript) implementing a variation of the perceptual image hashing algorithm described by Bian Yang, Fan Gu and Xiamu Niu in their paper "Block Mean Value Based Image Perceptual Hashing".

The 256-bit hashes that Blockhash generates are designed to be nearly unique for each image, even after the image has been rescaled. The Hamming distance between two hashes (the number of bits that differ) indicates how far apart two images are: single-digit values are generally a good indication that the images are identical, even if they differ in size.
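
As a rough illustration of the comparison described above, the following Python sketch counts the differing bits between two hashes. It assumes the hashes are written as hexadecimal strings of equal length; the function name `hamming_distance` and the two sample hashes are made up for this example and are not taken from the Blockhash libraries.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hexadecimal hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


# Two made-up 256-bit hashes (64 hex digits) that differ only in the last digit.
original = "0f" * 32
rescaled = "0f" * 31 + "0e"

print(hamming_distance(original, rescaled))  # 1
```

A distance this small would suggest the two hashes come from the same image.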

Experimental video support

Through 2016, we experimented with using the Blockhash algorithm to fingerprint videos. You can read more about the Videorooter project, and view our experimental video support.

Why Blockhash?

While developing the tools, we needed a way to compare images perceptually. Libraries like pHash and imgSeek do the job, but we needed an algorithm that was easier to implement in JavaScript, and the block mean value-based image hash algorithm fits the bill perfectly. We have since made a few tweaks to the algorithm to optimize it for speed and to make it less susceptible to implementation variations.

Use case parameters

We've designed Blockhash with the following use case parameters in mind:

  • Identifying derivative works is less important than identifying verbatim re-use (i.e., the algorithm doesn't need to match images that have been manipulated beyond resizing and format changes).
  • False positives (different images whose blockhashes differ by fewer than 10 bits) should be kept to an absolute minimum and occur for no more than 1 in 10,000 images in a random test set.
  • In-browser execution with JavaScript should be possible.
  • Blockhashes of an image at original size (above 640 pixels wide) and at thumbnail size (100 pixels wide) should differ by no more than 10 bits in 95% of cases.

Known limitations

For images in general, the algorithm generates the same blockhash value for two different images in 1% of the cases (data based on a random sampling of 100,000 images).

For photographs, the algorithm generates practically unique blockhashes, but for icons, clipart, maps and similar images, the blockhashes are less unique. Large areas of a single color in an image, whether as a background or as borders, result in hashes that collide more frequently.
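
To see why uniform areas cause collisions, here is a minimal Python sketch of a block mean value hash. It is a simplified stand-in, not the Blockhash implementation: it assumes a grayscale image given as a list of pixel rows, uses a small 4x4 grid (16 bits) instead of 256 bits, and thresholds every block against a single global median. The function name `block_mean_hash` and both sample images are invented for this example.

```python
from statistics import median


def block_mean_hash(pixels, bits=4):
    """Simplified block mean value hash of a grayscale image.

    pixels is a list of rows of 0-255 integers whose width and height are
    divisible by bits. Returns a string of bits * bits '0'/'1' characters.
    """
    height, width = len(pixels), len(pixels[0])
    if height % bits or width % bits:
        raise ValueError("image dimensions must be divisible by bits")
    block_h, block_w = height // bits, width // bits

    # Mean brightness of each block, row by row.
    means = []
    for by in range(bits):
        for bx in range(bits):
            total = sum(
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            )
            means.append(total / (block_h * block_w))

    # A block contributes a 1 bit only if it is brighter than the median block.
    threshold = median(means)
    return "".join("1" if m > threshold else "0" for m in means)


def white_image(size=8):
    return [[255] * size for _ in range(size)]


# Two different "icons": a white background with one dark pixel each,
# placed in opposite corners.
icon_a = white_image()
icon_a[0][0] = 0
icon_b = white_image()
icon_b[7][7] = 0

print(block_mean_hash(icon_a) == block_mean_hash(icon_b))  # True
```

Because almost every block has the same mean, the median equals the background brightness and the small marks never push a block above it, so both images collapse to the same hash in this sketch.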

Getting started

You can try generating a few hashes by fetching the Python version and running it on your images:

 $ git checkout
 $ blockhash-python/ 


We're working on an RFC that describes in detail the modified algorithm we're using. You can follow the progress on GitHub.


C Python JavaScript


File an issue in the respective GitHub repository or mail