Get Fuzzy with LEVENSHTEIN
Discovered a super handy Postgres extension tonight: fuzzystrmatch. This lil cutie is a real godsend when dealing with potentially crummy user input, such as, oh say, for a Rails project where you’re requiring your less-than-tech-savvy relatives and future in-laws to input their email address in order to access your wedding website. Note: said website is badass, built-from-scratch, and open source.
Imagining that scenario, one might be concerned about poor conversion due to typos, misspellings, or any other of the myriad problems that plague “uncontrolled” user input. And extra sadly, poor conversion here translates to a sparsely attended wedding and a future full of angry in-laws… no bueno.
Enter fuzzystrmatch. Adding this extension to your Postgres database gives you some super handy fuzzy string matching functions, my favorite of which so far is levenshtein. levenshtein calculates the difference between two strings, aka the “edit distance”: the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other. It’s based on the Levenshtein distance metric, named after Vladimir Levenshtein.
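To get a feel for what that number means, here is a rough plain-Ruby sketch of the edit distance calculation. The `levenshtein_distance` helper name is made up for illustration, and this is not the code Postgres runs; it is just the textbook row-by-row dynamic programming approach, shown so the `<= 3` threshold in the queries has something concrete behind it.

```ruby
# Illustrative only: a plain-Ruby edit distance, NOT the Postgres implementation.
# `levenshtein_distance` is a hypothetical helper name used for this sketch.
def levenshtein_distance(source, target)
  # previous_row[j] = distance between the source prefix processed so far
  # and the first j characters of target. It starts as the distance from "".
  previous_row = (0..target.length).to_a

  source.each_char.with_index(1) do |source_char, i|
    current_row = [i] # turning i source characters into "" takes i deletions
    target.each_char.with_index(1) do |target_char, j|
      substitution_cost = source_char == target_char ? 0 : 1
      current_row << [
        previous_row[j] + 1,                     # delete a source character
        current_row[j - 1] + 1,                  # insert a target character
        previous_row[j - 1] + substitution_cost  # substitute (or keep a match)
      ].min
    end
    previous_row = current_row
  end

  previous_row.last
end

puts levenshtein_distance('test@email.com', 'test@email.c')   # 2 (two missing characters)
puts levenshtein_distance('test@email.com', 'teste@mail.com') # 2 (swapped "e" and "@")
```

Note that a swap of two adjacent characters counts as two edits here, not one, so a couple of transposed letters can eat up most of a 3-edit budget.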
You can use it to make some extra forgiving ActiveRecord queries, as laid out below.
> User.find_by('levenshtein(lower(email), lower(?)) <= 3', 'test@email.c')
=> #<User id: 1, email: "test@email.com">
> User.find_by('levenshtein(lower(email), lower(?)) <= 3', 'TEST@EMAILCOM')
=> #<User id: 1, email: "test@email.com">
> User.find_by('levenshtein(lower(email), lower(?)) <= 3', ' test@email.com ')
=> #<User id: 1, email: "test@email.com">
> User.find_by('levenshtein(lower(email), lower(?)) <= 3', 'test@@email.com')
=> #<User id: 1, email: "test@email.com">
> User.find_by('levenshtein(lower(email), lower(?)) <= 3', 'teste@mail.com')
=> #<User id: 1, email: "test@email.com">
I’m using it to make it extra easy for folks to log in, provided they can at least remember their own email within a 3-character margin of error. Here’s hoping.
Super Basic Implementation:
Add migrations
class CreateUsers < ActiveRecord::Migration
  def change
    create_table :users do |t|
      t.string :email, null: false, default: ""
    end
  end
end
class EnableFuzzystrmatchExtension < ActiveRecord::Migration
  def change
    enable_extension 'fuzzystrmatch'
  end
end
Add class method on model
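class User < ActiveRecord::Base
  # Returns a user whose email is within 3 edits of the given one, ignoring
  # case and surrounding whitespace. If several emails qualify, find_by
  # returns whichever row the database yields first.
  def self.fuzzy_match(email)
    # to_s guards against a nil email param before stripping whitespace
    find_by('levenshtein(lower(email), lower(?)) <= 3', email.to_s.strip)
  end
end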
Query from controller action
class SessionsController < ApplicationController
  def create
    user = User.fuzzy_match(user_params[:email])
    if user
      # login user
    else
      # redirect with flash error
    end
  end

  private

  def user_params
    params.require(:user).permit(:email)
  end
end