xxHash - Extremely fast hash algorithm
======================================

xxHash is an extremely fast hash algorithm, running at RAM speed limits.
It successfully completes the [SMHasher](http://code.google.com/p/smhasher/wiki/SMHasher) test suite,
which evaluates the collision, dispersion and randomness qualities of hash functions.
The code is highly portable, and hashes are identical on all platforms (little / big endian).

|Branch      |Status   |
|------------|---------|
|master      | [![Build Status](https://travis-ci.org/Cyan4973/xxHash.svg?branch=master)](https://travis-ci.org/Cyan4973/xxHash?branch=master) |
|dev         | [![Build Status](https://travis-ci.org/Cyan4973/xxHash.svg?branch=dev)](https://travis-ci.org/Cyan4973/xxHash?branch=dev) |



Benchmarks
-------------------------

The benchmark uses the SMHasher speed test, compiled with Visual Studio 2010 on a 32-bit Windows 7 machine.
The reference system uses a Core 2 Duo @3GHz.


| Name          | Speed     | Quality | Author            |
|---------------|-----------|:-------:|-------------------|
| [xxHash]      | 5.4 GB/s  |   10    | Y.C.              |
| MurmurHash 3a | 2.7 GB/s  |   10    | Austin Appleby    |
| SBox          | 1.4 GB/s  |    9    | Bret Mulvey       |
| Lookup3       | 1.2 GB/s  |    9    | Bob Jenkins       |
| CityHash64    | 1.05 GB/s |   10    | Pike & Alakuijala |
| FNV           | 0.55 GB/s |    5    | Fowler, Noll, Vo  |
| CRC32         | 0.43 GB/s |    9    |                   |
| MD5-32        | 0.33 GB/s |   10    | Ronald L.Rivest   |
| SHA1-32       | 0.28 GB/s |   10    |                   |

[xxHash]: http://www.xxhash.com

The Quality score measures the quality of the hash function.
It depends on successfully passing the SMHasher test set.
10 is a perfect score.
Algorithms with a score < 5 are not listed in this table.

A more recent version, XXH64, has been created thanks to [Mathias Westerdahl](https://github.com/JCash).
It offers superior speed and dispersion on 64-bit systems.
Note however that 32-bit applications will still run faster using the 32-bit version.
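
As a rough illustration of the note above, a program could choose the hash width from the target's pointer size at compile time. The sketch below is self-contained and does not call xxHash itself. The helper name `preferred_xxh_width` is hypothetical, and treating `UINTPTR_MAX` as a proxy for "64-bit system" is an assumption that may not hold on every target.

```c
#include <stdint.h>   /* UINTPTR_MAX */

/* Hypothetical helper, not part of xxHash: reports which variant the
 * note above would favor for the current compilation target.
 * Assumption: pointer width is a reasonable proxy for "64-bit system". */
static int preferred_xxh_width(void)
{
#if UINTPTR_MAX > 0xFFFFFFFFu
    return 64;   /* 64-bit target: XXH64 is expected to be faster */
#else
    return 32;   /* 32-bit target: XXH32 is expected to be faster */
#endif
}
```

A caller could then dispatch to `XXH32()` or `XXH64()` accordingly. The two functions produce digests of different widths (32 and 64 bits), so their results are not interchangeable.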

SMHasher speed test, compiled using GCC 4.8.2, on Linux Mint 64-bit.
The reference system uses a Core i5-3340M @2.7GHz.

| Version    | Speed on 64-bit  | Speed on 32-bit  |
|------------|------------------|------------------|
| XXH64      | 13.8 GB/s        |  1.9 GB/s        |
| XXH32      |  6.8 GB/s        |  6.0 GB/s        |

This project also includes a command line utility, named `xxhsum`, offering features similar to `md5sum`,
thanks to contributions from [Takayuki Matsuoka](https://github.com/t-mat).


### License

The library files `xxhash.c` and `xxhash.h` are BSD licensed.
The utility `xxhsum` is GPL licensed.


### Build modifiers

The following macros can be set at compilation time
to modify xxHash behavior. They are all disabled by default.

- `XXH_INLINE_ALL` : Make all functions `inline`, with bodies directly included within `xxhash.h`.
  There is no need for an `xxhash.o` module in this case.
  Inlining functions is generally beneficial for speed on small keys.
  It's especially effective when key length is a compile-time constant,
  with observed performance improvements in the +200% range.
  See [this article](https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html) for details.
- `XXH_ACCEPT_NULL_INPUT_POINTER` : if set to `1`, when input is a null pointer,
  the xxHash result is the same as for a zero-length key
  (instead of a null-pointer dereference, which typically causes a segfault).
- `XXH_FORCE_MEMORY_ACCESS` : default method `0` uses a portable `memcpy()` notation.
  Method `1` uses a gcc-specific `packed` attribute, which can provide better performance for some targets.
  Method `2` forces unaligned reads, which is not standard-compliant, but might sometimes be the only way to extract better performance.
- `XXH_CPU_LITTLE_ENDIAN` : by default, endianness is determined at compile time.
  It's possible to skip auto-detection and force the format to little-endian by setting this macro to 1.
  Setting it to 0 forces big-endian.
- `XXH_FORCE_NATIVE_FORMAT` : on big-endian systems, use the native number representation.
  Breaks consistency with little-endian results.
- `XXH_PRIVATE_API` : same impact as `XXH_INLINE_ALL`.
  The name underlines that the symbols will not be exposed in the library's public interface.
- `XXH_NAMESPACE` : prefix all symbols with the value of `XXH_NAMESPACE`.
  Useful to avoid symbol naming collisions
  in case of multiple inclusions of the xxHash source code.
  Client applications can still use the regular function names;
  symbols are automatically translated through `xxhash.h`.
- `XXH_STATIC_LINKING_ONLY` : gives access to the state declaration for static allocation.
  Incompatible with dynamic linking, due to risks of ABI changes.
- `XXH_NO_LONG_LONG` : removes support for XXH64,
  for targets without 64-bit support.


### Example

Calling the xxHash 64-bit variant from a C program:

```c
#include "xxhash.h"

unsigned long long calcul_hash(const void* buffer, size_t length)
{
    unsigned long long const seed = 0;   /* or any other value */
    unsigned long long const hash = XXH64(buffer, length, seed);
    return hash;
}
```

Using the streaming variant is more involved, but makes it possible to provide data in multiple rounds:
```c
#include <stdlib.h>   /* abort(), malloc(), free() */
#include "xxhash.h"


unsigned long long calcul_hash_streaming(someCustomType handler)
{
    /* Allocate the hashing state; it must be released with XXH64_freeState(). */
    XXH64_state_t* const state = XXH64_createState();
    if (state==NULL) abort();

    size_t const bufferSize = SOME_VALUE;
    void* const buffer = malloc(bufferSize);
    if (buffer==NULL) abort();

    /* Initialize the state with the seed before feeding any data. */
    unsigned long long const seed = 0;   /* or any other value */
    XXH_errorcode const resetResult = XXH64_reset(state, seed);
    if (resetResult == XXH_ERROR) abort();

    /* (...) */
    while ( /* any condition */ ) {
        /* get_more_data() is a user-supplied function, not described here */
        size_t const length = get_more_data(buffer, bufferSize, handler);
        XXH_errorcode const addResult = XXH64_update(state, buffer, length);
        if (addResult == XXH_ERROR) abort();
        /* (...) */
    }

    /* (...) */
    /* Produce the final hash from all the data fed so far. */
    unsigned long long const hash = XXH64_digest(state);

    free(buffer);
    XXH64_freeState(state);

    return hash;
}
```


### Other programming languages

Beyond the C reference version,
xxHash is also available in many programming languages,
thanks to great contributors.
They are [listed here](http://www.xxhash.com/#other-languages).


### Branch Policy

> - The "master" branch is considered stable, at all times.
> - The "dev" branch is the one where all contributions must be merged
    before being promoted to master.
>   + If you plan to propose a patch, please commit into the "dev" branch,
      or its own feature branch.
      Direct commits to "master" are not permitted.