Year:  2018
Issue:  73
Pages:  67-94
URL:  https://doi.org/10.25728/ubs.2018.73.4
Keywords:  phonetic encoding algorithms, phonetic distance, record linkage, indexing words by sound
Abstract:  This paper surveys phonetic encoding algorithms, which are designed to determine how similar words are in sound (pronunciation). The algorithms are divided into two groups: word comparison algorithms and algorithms that determine the distance between words. The word comparison algorithms SoundEx, NYSIIS, Daitch-Mokotoff, Metaphone, and Polyphone are described, as are the distance algorithms Levenshtein, Jaro, and N-grams. For each algorithm, its advantages and disadvantages are noted and an analogue for the Russian language is given. To eliminate the shortcomings common to phonetic encoding algorithms, it is proposed to use the sequence of a word's elementary sounds instead of its sequence of letters. This is expected to improve word recognition, record linkage, and the indexing of words by sound.
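
To make the two groups named in the abstract concrete, here is a minimal Python sketch with one algorithm from each: the commonly described American SoundEx coding (a word comparison algorithm, where two words match if their codes are equal) and the Levenshtein edit distance (a distance algorithm). It is an illustration only and is not taken from the paper. The function names `soundex` and `levenshtein` are chosen here for the example, and the SoundEx variant shown (H and W do not separate letters with the same code, vowels do) is an assumption about which rule set to use.

```python
def soundex(word: str) -> str:
    """Four-character American SoundEx code (illustrative variant)."""
    # Letter groups that share a digit in the usual SoundEx table.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    word = "".join(ch for ch in word.upper() if ch.isalpha())
    if not word:
        return ""

    result = word[0]                 # the first letter is kept as is
    prev = codes.get(word[0], "")    # its code still suppresses a repeat
    for ch in word[1:]:
        if ch in "HW":
            continue                 # H and W do not break a run of equal codes
        code = codes.get(ch, "")     # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                  # a vowel resets the run
    return (result + "000")[:4]      # pad or truncate to four characters


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))   # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(soundex("Robert"), soundex("Rupert"))
    print(levenshtein("kitten", "sitting"))
```

Both functions operate on the letters of a word, which is the limitation the abstract points to: the paper's proposal is to run such comparisons on the sequence of elementary sounds instead.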
