For both English and Chinese, a search engine's index is word based. Chinese, however, is a rich and complex language that differs from English in many ways. In Chinese, the same string of characters can mean something completely different depending on where the pauses fall or which tone is used. English does not have this problem, because English text is already split into words. Below I share some of my understanding of Chinese word segmentation algorithms.
The first is the dictionary-based matching method. The search engine compares the user's search terms with the entries in its dictionary, and whenever a match succeeds it cuts that word off. Depending on the scanning direction, matching is divided into forward and reverse matching. Within forward matching, it is further subdivided by word length into maximum matching and minimum matching. The quality of this method depends largely on how complete the dictionary is and how often it is updated.
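A minimal sketch of forward and reverse maximum matching may make the idea concrete. The function names, the `max_len` parameter, and the four-entry toy dictionary are assumptions made for illustration; a real search engine's dictionary and matching rules are not public and will be far more elaborate.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, always cutting off the longest dictionary entry."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len=4):
    """Scan right to left, always cutting off the longest dictionary entry."""
    words = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


# Hypothetical toy dictionary, for illustration only.
dictionary = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", dictionary))   # ['研究生', '命', '起源']
print(backward_max_match("研究生命起源", dictionary))  # ['研究', '生命', '起源']
```

The two directions disagree on this phrase, which is one reason the direction of matching matters and why the dictionary has to be kept complete and up to date.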
The second is the statistics-based segmentation method. The search engine performs a large number of calculations, including the probability that characters appear next to each other, where a phrase tends to occur, and what kind of content users are looking for when they search for a given word or phrase. All of these serve as the basis for the search engine's judgment. This method has an obvious advantage: it responds faster to new words. For example, when a news story breaks and we search for a new word from it, if Baidu cannot recognize the word it cannot return the correct results, and users will turn to another search engine.
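As a rough sketch of the statistical idea, the snippet below counts how often characters appear next to each other in a tiny corpus and keeps two characters together when their pointwise mutual information (PMI) is above a threshold. PMI, the four-sentence corpus, the function names, and the threshold value are all assumptions chosen for illustration; the text above does not say which statistics a real search engine uses.

```python
import math
from collections import Counter


def train_counts(corpus):
    """Count single characters and adjacent character pairs in the corpus."""
    chars = Counter()
    pairs = Counter()
    for sentence in corpus:
        chars.update(sentence)
        pairs.update(zip(sentence, sentence[1:]))
    return chars, pairs


def pmi(a, b, chars, pairs):
    """Pointwise mutual information of character b directly following a."""
    pair_count = pairs[(a, b)]
    if pair_count == 0:
        return float("-inf")  # never seen together: always cut here
    p_ab = pair_count / sum(pairs.values())
    p_a = chars[a] / sum(chars.values())
    p_b = chars[b] / sum(chars.values())
    return math.log(p_ab / (p_a * p_b))


def segment(text, chars, pairs, threshold):
    """Join neighbouring characters whose PMI exceeds the threshold."""
    if not text:
        return []
    words = [text[0]]
    for prev, cur in zip(text, text[1:]):
        if pmi(prev, cur, chars, pairs) > threshold:
            words[-1] += cur   # strong association: extend the current word
        else:
            words.append(cur)  # weak association: start a new word
    return words


# Hypothetical toy corpus; real systems learn from very large text collections.
corpus = ["我爱北京", "北京很大", "我爱上海", "上海很大"]
chars, pairs = train_counts(corpus)
print(segment("北京上海", chars, pairs, threshold=2.0))  # ['北京', '上海']
```

Because the counts come from text rather than from a fixed word list, a new word starts to be recognized as soon as it appears often enough in the corpus, which is the faster response to new words described above.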
Word segmentation algorithms are mostly a matter for Chinese search engines; for Google they do not play the same role. Search for the same keyword or phrase on Baidu and on Google and the results will differ. This is not only because the algorithms or technologies are different, but more because of word segmentation. Baidu splits the user's search keywords into words, while Google returns results more directly.
I used to have only an occasional, partial understanding of word segmentation algorithms. Recently I read a number of related books and studied some material online, and now I have a general understanding. In fact, understanding word segmentation is very helpful both for individual webmasters and for small and medium-sized enterprises. By knowing how search terms are split, we can target keywords more accurately. Now, on to today's topic. If I have anything wrong, I hope everyone will point it out.
As a webmaster, whether we are choosing long-tail keywords, the target keywords of a page, or keywords for content pages, we should follow this principle: do not invent words artificially. If your word is neither one that people often search for nor one the search engine recognizes by default, it will not be returned in search results. So when choosing keywords, do not rely on guesswork.
Chinese word segmentation generally falls into two kinds: dictionary-based and statistics-based. In practice neither method is used alone; the two are usually combined.