LoginSignup
0
1

More than 5 years have passed since last update.

Bloom Filter

Last updated at Posted at 2017-10-22
  • データの有無を効率的に判断できるアルゴリズム
require "murmurhash3"

class BloomFilter

  SEED_BASE = 0xFBA4C795

  attr_reader :bits
  attr_reader :hash_funcs
  attr_reader :tweak

  def initialize(n, k, tweak)
    @bits = Array.new(n, 0) # ビット列
    @tweak = tweak          # 乱数
    @hash_funcs = k         # ハッシュ関数の数
  end

  # フィルタに要素を追加
  def add(data)
    calc_index(data).each do |i|
      bits[i] = 1
    end
  end

  # フィルタに要素が登録されているかを確認
  def contains?(data)
    calc_index(data).map{|i| bits[i]}.find{|i| i == 0}.nil?
  end

  private

  def calc_index(data)
    list = []
    hash_funcs.times do |i|
      seed = (i * SEED_BASE + tweak) & 0xffffffff
      hash = MurmurHash3::V32.str_hash(data, seed)
      list << hash % bits.length
    end
    list
  end

end

filter = BloomFilter.new(10, 2, 123456)

filter.add("bitcoin")
filter.add("monacoin")

filter.contains?("bitcoin")
# =>true

filter.contains?("litecoin")
# =>false

filter.contains?("bitcoin4")
# =>true
0
1
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
1