Fowler-Noll-Vo (FNV) Hash Algorithm
Do not miss this exclusive book on Binary Tree Problems. Get it now for free.
The FNV (Fowler-Noll-Vo) hash algorithm is a non-cryptographic hash function designed for fast hashing of small to medium-sized data. It was created by Glenn Fowler, Landon Curt Noll, and Kiem-Phong Vo. The FNV algorithm is known for its simplicity and efficiency, making it suitable for applications where speed is crucial, such as hash tables for indexing & searching and checksums.
Table of Contents
- Introduction
- Basic Principles
- Idea behind using Prime Constants
- FNV-1 Algorithm
- FNV-1a Variant
- Use Cases
- Comparison with Other Standard Approaches
- Security Considerations
- Complexity Analysis
- Hashing Support in Programming Languages
Introduction
The FNV (Fowler-Noll-Vo) hash algorithm stands as a testament to simplicity and efficiency in the world of non-cryptographic hashing. Developed by Glenn Fowler, Landon Curt Noll, and Kiem-Phong Vo, the FNV algorithm has found its niche in applications where rapid hashing of small to medium-sized data is paramount, particularly in scenarios like hash tables and checksums.
Basic Principles
The basic principles behind the FNV (Fowler-Noll-Vo) algorithm involve simple arithmetic operations, primarily multiplication and XOR (exclusive OR), applied to each byte of the input data. The goal is to create a hash function that is fast, easy to implement, and provides a relatively uniform distribution of hash values.The two primary types are FNV-1 and FNV-1a.
The primary difference between FNV-1 and FNV-1a lies in the order of operations within the hashing loop. While both variants share the same basic structure and initialization process, they differ in how they combine the current hash value with the next byte of the input data.
Here is a high level overview of algorithm :
- Initialization : The process begins by initializing a hash value to a specific starting value. The starting hash value will depend on weather we are using 32-bit hash or 64-bit hash. Set this value as FNV_prime.
32-bit hash: 2166136261
64-bit hash: 14695981039346656037
- Hashing Loop : The algorithm iterates through each byte of the input data sequentially. For each byte, the hash value is updated using a combination of multiplication and XOR operations. The specific formula depends on whether it's FNV-1 or FNV-1a. For FNV-1 Algorithm updation formula is :
hash = (hash * FNV_prime) ^ byte
- Finalization : Once all bytes of the input data have been processed, the resulting hash value is considered the final hash.
In summary , the FNV algorithm relies on a simple and repetitive process of updating a hash value through a combination of multiplication and XOR operations for each byte in the input data. The use of prime constants in the multiplication step helps achieve a better distribution of hash values. The straightforward design of the FNV algorithm makes it suitable for applications where speed is crucial, such as in hash tables and checksums. However, it's important to note that FNV is not intended for cryptographic use due to its susceptibility to collision attacks.
Idea behind using Prime Constants
Now , the question is what is the reason behind choosing these number 2166136261
or 14695981039346656037
as initial values.
These constants are chosen carefully to provide a good starting point for the iterative hashing process, and they play a crucial role in achieving a uniform distribution of hash values.
-
Prime Constants : The constants
2166136261
and14695981039346656037
are both prime numbers. Prime numbers are chosen for their unique properties. Multiplying by a prime number helps in avoiding patterns in the input data that might result in hash collisions. -
Multiplication and XOR Operations : During the hashing loop, the hash value is updated for each byte in the input data using a combination of multiplication and XOR operations. The multiplication by a prime constant helps disperse the influence of each byte across the hash value, preventing patterns from emerging.
-
Avoiding Hash Bias : The goal is to avoid biasing the hash value toward specific bit patterns or regions. Biased hash values might lead to increased collisions, where different inputs produce the same hash code.
-
Uniform Distribution : The use of prime constants in the initialization and update steps contributes to a more uniform distribution of hash values. A uniform distribution is desirable in hash functions to minimize the likelihood of collisions and ensure that different inputs produce distinct hash codes.
In summary, the constants in the FNV hashing algorithm are carefully chosen prime numbers that play a crucial role in providing a solid starting point for the hashing process and contribute to the algorithm's ability to produce uniformly distributed hash values. The use of primes, combined with multiplication and XOR operations, helps achieve the desired characteristics of a good non-cryptographic hash function.
FNV-1 Algorithm
The core concept of the FNV-1 algorithm involves initializing a hash value and then updating it for each byte in the input data. Here's a step-by-step explanation of the FNV-1 algorithm:
-
Initialization: The process begins by initializing a hash value to a specific starting point. For FNV-1, the starting values are:
- 32-bit hash: 2166136261
- 64-bit hash: 14695981039346656037
-
Hashing Loop: For each byte in the input data, the hash value is updated using the formula:
- 32-bit hash:
hash = (hash * FNV_prime_32) ^ byte
- 64-bit hash:
hash = (hash * FNV_prime_64) ^ byte
where
FNV_prime_32
is 16777619 andFNV_prime_64
is 1099511628211. - 32-bit hash:
The multiplication by the prime constant helps achieve a good dispersion of hash values. The XOR operation combines the current hash value with the next byte of the input data.
- Finalization: Once all bytes in the input data have been processed, the resulting hash value is considered the final hash.
For 32-bit hash the c++ implementation is :
#include <iostream>
#include <cstdint>
#include <cstring>
const uint32_t FNV_prime_32 = 16777619;
const uint32_t offset_basis_32 = 2166136261;
uint32_t fnv1_32(const uint8_t* data, size_t length) {
uint32_t hash_value = offset_basis_32;
for (size_t i = 0; i < length; ++i) {
hash_value = (hash_value * FNV_prime_32) ^ data[i];
}
return hash_value;
}
int main() {
// Example usage
const char* input = "Hello, World!";
size_t length = strlen(input);
uint32_t hash_result = fnv1_32(reinterpret_cast<const uint8_t*>(input), length);
std::cout << "FNV-1 Hash (32-bit): 0x" << std::hex << hash_result << std::dec << std::endl;
return 0;
}
Execution :
g++ filename.cpp -o filename.exe
./filename.exe
Output :
FNV-1 Hash (32-bit): 0x4291a886
In this implementation:
FNV_prime_32
is the prime constant used in the multiplication step.offset_basis_32
is the initial hash value.fnv1_32
is the function that takes a pointer to the input data and its length, and it calculates the FNV-1 hash.
The main function provides an example of how to use the fnv1_32 function with a simple string.
For 64-bit hash :
#include <iostream>
#include <cstdint>
#include <cstring>
const uint64_t FNV_prime_64 = 1099511628211;
const uint64_t offset_basis_64 = 14695981039346656037ULL;
uint64_t fnv1_64(const uint8_t* data, size_t length) {
uint64_t hash_value = offset_basis_64;
for (size_t i = 0; i < length; ++i) {
hash_value = (hash_value * FNV_prime_64) ^ data[i];
}
return hash_value;
}
int main() {
// Example usage
const char* input = "Hello, World!";
size_t length = strlen(input);
uint64_t hash_result = fnv1_64(reinterpret_cast<const uint8_t*>(input), length);
std::cout << "FNV-1 Hash (64-bit): 0x" << std::hex << hash_result << std::dec << std::endl;
return 0;
}
Output:
FNV-1 Hash (64-bit): 0x7b5ea4c513c14886
In this implementation:
FNV_prime_64
is the prime constant used in the multiplication step.offset_basis_64
is the initial hash value.fnv1_64
is the function that takes a pointer to the input data and its length, and it calculates the FNV-1 hash for a 64-bit result.
The main function provides an example of how to use the fnv1_64 function with a simple string.
FNV-1a Variant
The FNV-1a (Fowler-Noll-Vo version 1a) variant of the algorithm is quite similar to FNV-1, but it changes the order of operations within the hashing loop. Here's the FNV-1a algorithm explained, along with the C++ code for a 32-bit hash:
-
Initialization: Initialize a hash value to a specific starting point.
- 32-bit hash: 2166136261
- 64-bit hash: 14695981039346656037
-
Hashing Loop: For each byte in the input data, update the hash value using the formula:
- 32-bit hash:
hash = (hash ^ byte) * FNV_prime_32
- 64-bit hash:
hash = (hash ^ byte) * FNV_prime_64
where
FNV_prime_32
is 16777619 andFNV_prime_64
is 1099511628211. - 32-bit hash:
-
Finalization: The resulting hash value after processing all bytes is considered the final hash.
FNV-1a C++ Code (32-bit hash):
#include <iostream>
#include <cstdint>
#include <cstring>
const uint32_t FNV_prime_32 = 16777619;
const uint32_t offset_basis_32 = 2166136261;
uint32_t fnv1a_32(const uint8_t* data, size_t length) {
uint32_t hash_value = offset_basis_32;
for (size_t i = 0; i < length; ++i) {
hash_value = (hash_value ^ data[i]) * FNV_prime_32;
}
return hash_value;
}
int main() {
// Example usage
const char* input = "Hello, World!";
size_t length = strlen(input);
uint32_t hash_result = fnv1a_32(reinterpret_cast<const uint8_t*>(input), length);
std::cout << "FNV-1a Hash (32-bit): 0x" << std::hex << hash_result << std::dec << std::endl;
return 0;
}
Output :
FNV-1a Hash (32-bit): 0x5aecf734
In this implementation, fnv1a_32
is the function that calculates the FNV-1a hash for a 32-bit result. The key change from FNV-1 is the order of XOR and multiplication operations within the hashing loop.
You can try to implement the same for 64-bit also by changing the offset and prime number.
Use Cases
The choice between FNV-1 and FNV-1a often depends on specific use cases, and both variants have their advantages and considerations. Here are some factors to consider when deciding between FNV-1 and FNV-1a:
-
Distribution of Hash Values: FNV-1a is often preferred for its improved dispersion properties. The XOR operation before multiplication may lead to a more uniform distribution of hash values, potentially reducing collisions in certain scenarios.
-
Simplicity and Speed: FNV-1 is slightly simpler than FNV-1a, as it performs multiplication before XOR in the hashing loop. In some cases, this simplicity may result in slightly faster performance.
-
Collision Sensitivity: FNV-1a might be chosen when collision resistance is a priority. However, the difference in collision resistance between FNV-1 and FNV-1a is often subtle and might not be a critical factor in many applications.
-
Compatibility: The choice between FNV-1 and FNV-1a can be influenced by existing systems and libraries. If compatibility with a specific implementation or system is important, you might follow the convention used in that context.
-
Personal or Organizational Preference: In some cases, the choice between FNV-1 and FNV-1a is a matter of convention or personal preference. Some developers or organizations may have a standard preference for one variant over the other.
In practice, both FNV-1 and FNV-1a are widely used, and the differences in their performance and collision characteristics might be minimal for many applications. The choice between them often depends on the specific requirements of the application and any existing conventions within a development environment. It's always a good idea to test both variants in the context of your specific use case to determine which one performs better for your particular requirements.
Comparison with Other Standard Approaches (e.g., SHA):
The comparison with other standard approaches :
- Security:
- FNV: FNV is designed for non-cryptographic purposes and prioritizes simplicity and speed. It lacks the security features required for cryptographic applications.
- SHA: SHA algorithms are designed to be secure against various cryptographic attacks. They provide a high level of security and are suitable for applications where data integrity is critical.
- Use Cases :
- FNV: FNV is ideal for scenarios where speed is crucial, such as in hash tables and checksums. It is not intended for cryptographic use.
- SHA: SHA algorithms are widely used in cryptographic applications, including digital signatures, certificate generation, and secure communication protocols.
- Collision Resistance:
- FNV: FNV may have a higher likelihood of collisions compared to cryptographic hash functions. It is optimized for speed rather than collision resistance.
- SHA: SHA algorithms are designed with a strong focus on collision resistance, making them suitable for applications where avoiding collisions is crucial.
- Complexity:
- FNV: FNV is relatively simple, using basic arithmetic operations. Its simplicity contributes to its efficiency for non-cryptographic purposes.
- SHA: SHA algorithms are more complex and involve intricate mathematical operations to provide strong security guarantees.
- Performance :
- FNV: FNV is known for its speed and simplicity, making it suitable for applications where low overhead is crucial.
- SHA: SHA algorithms prioritize security over speed, and they might be computationally more expensive.
Security Considerations
It's important to note that FNV is not suitable for cryptographic purposes. It lacks the necessary security features and is vulnerable to collision attacks.
In conclusion, the FNV hash algorithm can be used for non-cryptographic applications. Understanding its basic principles, variants, and proper implementation can help optimize performance in scenarios where speed is a critical factor. However, for cryptographic use cases, alternative hash functions with stronger security properties should be considered.
Complexity Analysis
The time and space complexity of the FNV (Fowler-Noll-Vo) hash algorithm depends on the size of the input data and the specific variant (FNV-1 or FNV-1a) being used.
Time Complexity:
The time complexity is generally linear with respect to the size of the input data, as the algorithm processes each byte in the input sequentially. The complexity is O(n), where n is the length of the input data.
Space Complexity:
The space complexity is constant, as the algorithm does not require additional memory proportional to the size of the input. It only needs a constant amount of memory to store the hash value and a few constants. The space complexity is O(1).
It's important to note that the simplicity of the FNV algorithm contributes to its efficiency, making it suitable for scenarios where low overhead and fast hashing are crucial. However, the trade-off is that FNV may not provide as strong collision resistance as more complex cryptographic hash functions.
When considering the FNV-1 and FNV-1a variants, their time and space complexity profiles are similar, as the differences mainly lie in the order of operations within the hashing loop. Both variants maintain the linear time complexity and constant space complexity characteristics.
Hashing Support in Programming Languages
C++ Standard Library (std::hash):
C++ provides a standard hashing mechanism through the <functional>
header, specifically the std::hash
template. While std::hash
itself is not FNV, it serves as a generic hash function that can be specialized for user-defined types. The C++ standard library uses different hash functions for fundamental types, including FNV-like approaches.
Here's an example of using std::hash
in C++:
#include <iostream>
#include <functional>
int main() {
// Example using std::hash for integers
std::hash<int> intHash;
size_t hashValue = intHash(42);
std::cout << "Hash value for 42: " << hashValue << std::endl;
return 0;
}
This example demonstrates the use of std::hash
for hashing an integer. The standard library provides specializations for other fundamental types. However, for user-defined types, developers can provide their own hash function specializations.
Python:
In Python, the built-in hash()
function is commonly used for generating hash values. Python does not directly expose the details of the hash function in the same way that C++ allows specialization. The hash function in Python is a part of the language's core and is used for various purposes, including dictionary keys and set elements.
# Example using hash() in Python
hash_value = hash("example_string")
print("Hash value:", hash_value)
Java:
In Java, the hashCode()
method is commonly used to obtain hash codes for objects. The hashCode()
method is part of the Object
class, and classes can override it to provide their own hash code implementations.
// Example using hashCode() in Java
public class Example {
public static void main(String[] args) {
String exampleString = "example_string";
int hashValue = exampleString.hashCode();
System.out.println("Hash value: " + hashValue);
}
}
While programming languages often provide built-in hash functions, the choice of hashing algorithm may not always be transparent. The specifics of the underlying hash function implementation depend on the language and version.
In C++, the std::hash template from the header provides a generic way to obtain hash values for various types. The standard library uses different hash functions for fundamental types, and for user-defined types, developers can specialize std::hash to provide a customized hash function.
To use the FNV algorithm within std::hash, you can specialize the template for specific types. Below is an example of how you might create a specialization for a user-defined type using the FNV-1a variant:
#include <iostream>
#include <functional>
#include <string>
// FNV-1a algorithm constants for 32-bit hash
const uint32_t FNV_prime_32 = 16777619;
const uint32_t offset_basis_32 = 2166136261;
// Specialization of std::hash for a custom type (e.g., std::string)
struct FNVHash {
template <typename T>
std::size_t operator()(const T& value) const {
std::size_t hash = offset_basis_32;
for (const auto& byte : value) {
hash = (hash ^ byte) * FNV_prime_32;
}
return hash;
}
};
// Usage example
int main() {
std::hash<std::string> stdStringHash; // Standard hash for comparison
FNVHash fnvHash; // FNV-1a hash
std::string exampleString = "Hello, World!";
std::size_t stdHashValue = stdStringHash(exampleString);
std::size_t fnvHashValue = fnvHash(exampleString);
std::cout << "Standard Hash: " << stdHashValue << std::endl;
std::cout << "FNV-1a Hash: " << fnvHashValue << std::endl;
return 0;
}
Key Takeaways (Fowler-Noll-Vo (FNV) Hash Algorithm)
- Efficient Non-Cryptographic Hashing: FNV algorithm excels in rapid hashing of small to medium-sized data, ideal for applications like hash tables and checksums.
- Simple Principles, Powerful Results: FNV's strength lies in simplicity, using basic arithmetic operations for fast and efficient hashing.
- FNV-1 vs. FNV-1a Variants: The choice between FNV-1 and FNV-1a depends on the order of operations, impacting dispersion and collision sensitivity.
- Considerations in Application: FNV-1a is favored for improved dispersion, while FNV-1 may offer slightly faster performance. Choice depends on factors like collision sensitivity and compatibility.
- Algorithm Characteristics: Takes O(n) time and O(1) space.
With this article at OpenGenus.org, you must have the complete idea of Fowler-Noll-Vo (FNV) Hash Algorithm.
Sign up for FREE 3 months of Amazon Music. YOU MUST NOT MISS.