flurs.utils.feature_hash

Utility functions for feature hashing, which encode a feature value as a vector.

Functions

feature_hash(feature, dim[, seed])

One-hot-encoded feature hashing.

multiple_feature_hash(feature, dim[, seed])

One-hot-encoded feature hashing using multiple hash functions.

n_feature_hash(feature, dims, seeds)

N-hot-encoded feature hashing.

flurs.utils.feature_hash.feature_hash(feature, dim, seed=123)

One-hot-encoded feature hashing.

Parameters
  • feature (str) – Target feature value represented as string.

  • dim (int) – Number of dimensions for a hash value.

  • seed (int) – Seed for the MurmurHash3 hash function.

Returns

One-hot-encoded vector for feature.

Return type

array
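The idea behind one-hot feature hashing can be sketched in a few lines: hash the string, reduce the hash modulo dim, and set that single position to 1. The sketch below is an illustration only, not the flurs implementation. It swaps in SHA-256 from the standard library for MurmurHash3 and returns a plain list instead of an array, and the names hash_index and feature_hash_sketch are hypothetical.

```python
import hashlib


def hash_index(feature, dim, seed=123):
    """Map a string to a bucket in [0, dim).

    Assumption: SHA-256 stands in for MurmurHash3 here, and the seed is
    mixed in by prefixing it to the input. The bucket assignments will
    therefore differ from what flurs produces.
    """
    digest = hashlib.sha256(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % dim


def feature_hash_sketch(feature, dim, seed=123):
    """Return a length-dim list with a single 1.0 at the hashed index."""
    vec = [0.0] * dim
    vec[hash_index(feature, dim, seed)] = 1.0
    return vec


vec = feature_hash_sketch("user=alice", dim=8)
print(vec)  # eight entries, exactly one of which is 1.0
```

The same feature, dim, and seed always select the same position, so the encoding is stable across calls. Two different features can still land on the same position; a larger dim makes that less likely.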

flurs.utils.feature_hash.multiple_feature_hash(feature, dim, seed=123)

One-hot-encoded feature hashing using multiple hash functions. This technique is effective at reducing the impact of hash collisions.

Parameters
  • feature (str) – Target feature value represented as string.

  • dim (int) – Number of dimensions for a hash value.

  • seed (int) – Seed for the MurmurHash3 hash function.

Returns

One-hot-encoded vector for feature.

Return type

array
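This page does not say how the multiple hash functions are combined. One common approach in the feature hashing literature is to let a second hash function choose the sign of the single nonzero entry, so that colliding features tend to cancel out rather than accumulate. The sketch below illustrates that approach under the assumption that it is the intended one; it is not confirmed to match flurs. SHA-256 again stands in for MurmurHash3, and the names seeded_hash and signed_feature_hash_sketch are hypothetical.

```python
import hashlib


def seeded_hash(feature, seed):
    """Return a non-negative integer hash (SHA-256 stand-in for MurmurHash3)."""
    digest = hashlib.sha256(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def signed_feature_hash_sketch(feature, dim, seed=123):
    """Return a length-dim list with a single +1.0 or -1.0 entry.

    Assumption: the first hash picks the index and a second, independently
    seeded hash picks the sign. Using seed + 1 for the second hash is an
    arbitrary choice made for this illustration.
    """
    vec = [0.0] * dim
    index = seeded_hash(feature, seed) % dim
    sign = 1.0 if seeded_hash(feature, seed + 1) % 2 == 0 else -1.0
    vec[index] = sign
    return vec


signed = signed_feature_hash_sketch("user=alice", dim=8)
print(signed)  # eight entries, exactly one of which is nonzero
```

With signs in play, two features that share an index contribute opposite values about half the time, which keeps collisions from systematically inflating one dimension.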

flurs.utils.feature_hash.n_feature_hash(feature, dims, seeds)

N-hot-encoded feature hashing.

Parameters
  • feature (str) – Target feature value represented as string.

  • dims (list of int) – Number of dimensions for each hash value.

  • seeds (list of int) – Seed for each hash function (MurmurHash3).

Returns

N-hot-encoded vector for feature.

Return type

array
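An n-hot encoding can be built from several one-hot blocks, one per (dim, seed) pair. The sketch below assumes the blocks are concatenated, giving a vector of length sum(dims) with len(dims) entries set to 1. That layout is an assumption, SHA-256 is used in place of MurmurHash3, and the names hash_index and n_feature_hash_sketch are hypothetical.

```python
import hashlib


def hash_index(feature, dim, seed):
    """Map a string to a bucket in [0, dim) (SHA-256 stand-in for MurmurHash3)."""
    digest = hashlib.sha256(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % dim


def n_feature_hash_sketch(feature, dims, seeds):
    """Concatenate one one-hot block per (dim, seed) pair."""
    if len(dims) != len(seeds):
        raise ValueError("dims and seeds must have the same length")
    vec = []
    for dim, seed in zip(dims, seeds):
        block = [0.0] * dim
        block[hash_index(feature, dim, seed)] = 1.0
        vec.extend(block)
    return vec


n_hot = n_feature_hash_sketch("item=42", dims=[4, 8], seeds=[1, 2])
print(len(n_hot))  # 12, i.e. 4 + 8
```

Because each block uses its own seed, two features that collide in one block are unlikely to collide in every block, so their full vectors usually remain distinguishable.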