
fuzzy matching


    fuzzy matching

    Hi,
    I've got a list of people's names that I need to match against another list of peeps.

    Anyone know of any good routines for this kind of fuzzy matching?
    Or maybe you know of a table of common variants, e.g. Mac = Mc, Mike = Michael?

    thanks very much

    #2
    I've used Oracle's Soundex implementation for this before.

    Soundex - Wikipedia, the free encyclopedia
    While you're waiting, read the free novel we sent you. It's a Spanish story about a guy named 'Manual.'
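For anyone who wants to see what Soundex actually does, here is a minimal sketch of the classic American Soundex rules in Python. It is an illustration only: the function name `soundex` is my own, and the built-in database versions may differ in edge cases (for example in how H and W are treated).

```python
def soundex(name: str) -> str:
    """Return a 4-character American Soundex code (illustrative sketch)."""
    # Consonant groups and their digits; vowels, H, W and Y carry no digit.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]                # the first letter is kept as-is
    prev = codes.get(letters[0], "")   # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                    # a vowel resets prev to ""
    return (result + "000")[:4]        # pad or truncate to 4 characters


print(soundex("Smith"), soundex("Smyth"))   # both give S530
print(soundex("Robert"), soundex("Rupert")) # both give R163
```

Because the first letter is kept verbatim, a typo in the first character (SMITH vs MSITH) produces different codes, so Soundex on its own will not catch that kind of mistake.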



      #3
      There's also a Soundex implementation in MS SQL Server.

      See the following page



        #4
        Oops, forgot to add... this is running under MS Access.



          #5
          This looks like the puppy: Fuzzy Matching Demo in Access - CodeGuru

          It's a DB of names that uses a few fuzzy algorithms... looks like I've got my work cut out digging through it all.



            #6
            I've looked into this very heavily in the past.

            Arguably, SMITH and MSITH are the same name, differing only by a typo (two letters transposed). The same goes for SMITH and WMITH (one wrong key hit).

            In my case, I used several algorithms to judge similarity between names and addresses, and weighted them to give a final score.

            Here's a link you may find useful:
            Approximate string matching - Wikipedia, the free encyclopedia

            Also Google for N-Gram, Levenshtein and Hamming.
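To make the "several algorithms, weighted into a final score" idea concrete, here is a rough Python sketch that combines Levenshtein edit distance with a bigram (2-gram) Dice coefficient. The function names and the 0.6/0.4 weights are assumptions picked for illustration, not the scheme described above; real weights would need tuning against known matches.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes or substitutions."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def bigram_dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams, from 0.0 to 1.0."""
    ba = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bb = Counter(b[i:i + 2] for i in range(len(b) - 1))
    if not ba or not bb:
        return 1.0 if a == b else 0.0
    overlap = sum((ba & bb).values())
    return 2 * overlap / (sum(ba.values()) + sum(bb.values()))


def similarity(a: str, b: str, w_edit: float = 0.6, w_gram: float = 0.4) -> float:
    """Weighted blend of two measures; the weights are illustrative guesses."""
    a, b = a.strip().upper(), b.strip().upper()
    if not a and not b:
        return 1.0
    edit_sim = 1 - levenshtein(a, b) / max(len(a), len(b))
    return w_edit * edit_sim + w_gram * bigram_dice(a, b)


for other in ("MSITH", "WMITH", "JONES"):
    print(other, round(similarity("SMITH", other), 2))
```

Plain Levenshtein counts the SMITH/MSITH transposition as two edits, so it scores lower than SMITH/WMITH here. Hamming distance only applies to strings of equal length, which is why it is left out of the blend.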



              #7
              Originally posted by Olly:
              oops forgot to add....this is running under MS Access
              Maybe this will be useful...



                #8
                Right then... I'm getting there with this.
                I've built a small table of names and abbreviations from existing matched data,
                e.g. William = Bill

                I'd really like a full multicultural data set... any pointers please?
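For illustration, here is roughly how a lookup table like that can be applied before comparing two records, sketched in Python rather than Access. The `NICKNAMES` entries are just the examples from this thread (a real table would be loaded from the database), and the Mc-to-Mac rule and all function names are assumptions of this sketch.

```python
# Tiny stand-in for the nickname table; real data would come from the DB.
NICKNAMES = {
    "bill": "william",
    "mike": "michael",
}


def canonical_forename(name: str) -> str:
    """Map a known nickname to its full form; otherwise just normalise case."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def canonical_surname(name: str) -> str:
    """Treat a leading 'Mc' as 'Mac' (a deliberately crude rule)."""
    key = name.strip().lower()
    if key.startswith("mc"):
        key = "mac" + key[2:]
    return key


def same_person(first_a: str, last_a: str, first_b: str, last_b: str) -> bool:
    """Exact comparison after both names have been canonicalised."""
    return (canonical_forename(first_a) == canonical_forename(first_b)
            and canonical_surname(last_a) == canonical_surname(last_b))


print(same_person("Bill", "McDonald", "William", "Macdonald"))  # True
```

Canonicalising first and fuzzy-scoring second keeps the two concerns separate: the table handles known equivalences, and a similarity score handles typos.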

