NOTE: Redirecting to updated article at https://www.lexalytics.com/technology/machine-learning
Does Semantria use machine learning algorithms? Does it perform any sort of classification based on the sentiment of the text?
Yes. Semantria uses supervised machine learning to build the models used in the product. All of our model-based techniques (entity extraction, concept topics, even part-of-speech tagging) can be considered machine learning approaches.
Entity extraction is a good example. We do not have a massive dictionary of every possible person's name in the world, but we have annotated thousands of examples of how people's names appear in content. We use machine learning algorithms to train the entity extraction model on those examples, so that when it is shown content it has never seen before, it can detect people's names with a certain degree of accuracy, based on how closely a candidate resembles the names and contexts it was trained on.
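To make the idea concrete, here is a minimal sketch of learning to spot names from annotated examples rather than from a dictionary. It is a toy naive Bayes tagger and is not Semantria's actual model: the training sentences, the two features (capitalization and the preceding word), and the names train_tagger and find_names are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-annotated sentences: (token, is_person_name) pairs.
ANNOTATED = [
    [("Dr.", False), ("Alice", True), ("spoke", False), ("today", False)],
    [("we", False), ("met", False), ("Mr.", False), ("Bob", True), ("yesterday", False)],
    [("Paris", False), ("is", False), ("lovely", False), ("said", False), ("Carol", True)],
    [("the", False), ("report", False), ("said", False), ("Dr.", False), ("Dave", True), ("agreed", False)],
]

def features(tokens, i):
    """Describe a token by its shape and its left context, not its identity."""
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return [f"cap={tokens[i][0].isupper()}", f"prev={prev}"]

def train_tagger(sentences):
    """Count how often each feature occurs with each label."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [tok for tok, _ in sentence]
        for i, (_, is_name) in enumerate(sentence):
            label_counts[is_name] += 1
            feature_counts[is_name].update(features(tokens, i))
    return label_counts, feature_counts

def find_names(text, label_counts, feature_counts):
    """Return tokens scored as more likely to be a name than not."""
    tokens = text.split()
    vocab = set().union(*feature_counts.values())
    total = sum(label_counts.values())
    names = []
    for i, token in enumerate(tokens):
        scores = {}
        for label in label_counts:
            counts = feature_counts[label]
            denom = sum(counts.values()) + len(vocab)
            score = math.log(label_counts[label] / total)
            for feat in features(tokens, i):
                if feat in vocab:  # features never seen in training are ignored
                    score += math.log((counts[feat] + 1) / denom)  # add-one smoothing
            scores[label] = score
        if scores[True] > scores[False]:
            names.append(token)
    return names

tagger = train_tagger(ANNOTATED)
found = find_names("then Dr. Priya spoke", *tagger)
print(found)  # ['Priya']
```

"Priya" never appears in the training data. The sketch flags it because it is capitalized and follows "Dr.", the same pattern the annotated names showed. A production model would use far more examples and richer features.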
With regard to sentiment, we do support model-based “classification” of text by sentiment. However, we prefer the phrase-based approach, because the results of a model-based approach are only as good as the examples on which the model was trained and how closely those examples resemble the content submitted for classification. For example, a model trained on general news stories may learn that things like war and crime are bad. Suppose we then test that model on a very positive review of a war film. In describing the film's plot, the review uses words the model has seen in news about war, so the model could incorrectly conclude that the document is negative.
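The war-film problem can be reproduced with a very small sketch. The code below trains a toy naive Bayes sentiment classifier on a few invented "news" sentences and then applies it to an invented positive movie review. The data, labels, and the names train_sentiment and classify are all hypothetical, and this is not how Semantria's models are implemented.

```python
import math
from collections import Counter

# Hypothetical training set drawn from one domain only: general news.
NEWS = [
    ("war and crime left the city in ruins", "negative"),
    ("the battle killed many soldiers in the war", "negative"),
    ("the festival brought peace and celebration to the city", "positive"),
    ("volunteers rebuilt the school and families celebrated", "positive"),
]

# A positive review whose plot description reuses war vocabulary.
REVIEW = "a brilliant and moving film about soldiers in the war and the battle they survived"

def train_sentiment(examples):
    """Collect per-label word counts and per-label document counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Pick the label with the highest smoothed naive Bayes log-score."""
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        denom = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words unseen in training carry no signal
                score += math.log((counts[word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

model = train_sentiment(NEWS)
print(classify(REVIEW, *model))  # negative
```

The words that make the review positive ("brilliant", "moving") were never seen in the news data, so the classifier ignores them. The only evidence left is "war", "battle", and "soldiers", which the news examples taught it to treat as negative.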