
fuzzystring provides fuzzy inner, left, right, full, semi, and anti joins for data.frame and data.table objects using approximate string matching. It combines stringdist metrics with a data.table backend and compiled C++ result assembly to reduce overhead in large joins while preserving familiar join semantics.
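To make the join semantics concrete, here is a minimal conceptual sketch of what a fuzzy inner join and a fuzzy anti join compute. It deliberately uses only base R (`utils::adist()` for Levenshtein distance) and does not call fuzzystring or stringdist, so it should not be read as the package's implementation or API. The toy data frames, the `max_dist` threshold, and the output column names are all assumptions made for illustration.

```r
# Conceptual sketch only: base R, not fuzzystring's actual implementation.
# The data, the threshold name `max_dist`, and the output column names are
# hypothetical and chosen for illustration.
x <- data.frame(name = c("apple", "banana", "cherry"), x_id = 1:3,
                stringsAsFactors = FALSE)
y <- data.frame(name = c("aple", "bananna", "grape"), y_id = 1:3,
                stringsAsFactors = FALSE)

max_dist <- 1  # assumed maximum edit distance for a match

# Pairwise Levenshtein distances: rows index x, columns index y.
d <- utils::adist(x$name, y$name)

# Index pairs (row of x, row of y) whose distance is within the threshold.
hits <- which(d <= max_dist, arr.ind = TRUE)

# Fuzzy inner join: one output row per matching pair.
inner <- data.frame(
  name.x = x$name[hits[, "row"]],
  name.y = y$name[hits[, "col"]],
  x_id   = x$x_id[hits[, "row"]],
  y_id   = y$y_id[hits[, "col"]],
  dist   = d[hits],
  stringsAsFactors = FALSE
)

# Fuzzy anti join: rows of x that matched nothing in y.
anti <- x[!(seq_len(nrow(x)) %in% hits[, "row"]), , drop = FALSE]
```

The sketch materializes the full distance matrix, which grows with `nrow(x) * nrow(y)`. That cost is the kind of overhead the description above says the package aims to reduce for large joins.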

Details

Main entry points are fuzzystring_join() and the convenience wrappers fuzzystring_inner_join(), fuzzystring_left_join(), fuzzystring_right_join(), fuzzystring_full_join(), fuzzystring_semi_join(), and fuzzystring_anti_join().

The package also includes the example dataset misspellings.

Author

Maintainer: Paul E. Santos Andrade paulefrens@gmail.com (ORCID)

Other contributors: