We see some sentiment scores that fall outside the expected [-1, 1] range. Is there a defined range for the sentiment score?
Yes. The documented sentiment score ranges from -1 to 1. However, because of the logarithmic nature of the score, the engine may occasionally return values below -1 or above 1. Even then, the absolute value of the score will never exceed 2.
So you can treat -2 as the very negative extreme and 2 as the very positive extreme.
When calculating a simple average of the total sentiment scores, the larger values skew the average. Is there a different approach to aggregating sentiment scores?
Yes. A simple average of sentiment scores tends toward a neutral result because negative values cancel out positive values.
Instead, we recommend calculating the P/N ratio, which will show you how negative or positive the sentiment score is across the collection of documents.
You can find the P/N calculation formula in the Sentiment Analysis support article.
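As a rough illustration of why a ratio can be more informative than a mean, here is a minimal Python sketch. It assumes the P/N ratio is simply the count of positive documents divided by the count of negative documents; that definition, the function name pn_ratio, and the neutral_band parameter are assumptions made for this example, so please check the Sentiment Analysis support article for the actual formula.

```python
def pn_ratio(scores, neutral_band=0.0):
    """Ratio of positive to negative documents (assumed definition, not the official formula).

    Scores inside [-neutral_band, neutral_band] are treated as neutral and ignored.
    """
    positives = sum(1 for s in scores if s > neutral_band)
    negatives = sum(1 for s in scores if s < -neutral_band)
    if negatives == 0:
        # No negative documents: the ratio is undefined for an empty set,
        # and unbounded when only positive documents exist.
        return float("inf") if positives else None
    return positives / negatives


# Hypothetical document scores, including two values outside [-1, 1].
scores = [0.8, 1.6, -0.7, -1.5, 0.4, 0.3]

mean_score = sum(scores) / len(scores)  # close to neutral: positives and negatives cancel
ratio = pn_ratio(scores)                # 4 positive documents vs. 2 negative ones

print(f"mean = {mean_score:.2f}, P/N ratio = {ratio:.1f}")
```

In this made-up sample the mean sits near zero even though positive documents outnumber negative ones two to one, which is the effect the ratio is meant to expose.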
Are the sentiment scores for documents, themes, etc. all measured on the same scale?
No. For all components, including themes, topics, and entities, the sentiment score ranges from -10 to 10.
Document sentiment scores range from -1 to 1.
For more details, please consult our Sentiment Analysis support article.
How can I aggregate evidence and/or strength values across documents to identify the most popular themes?
For themes, names, and user entities, the best way to get the most "popular" records is to use their evidence score. The evidence score demonstrates how confident the engine is with its sentiment extraction of these records.
For query topics, look at the query relevancy or "hitcount" property; for concept topics, look at "strength_score". That said, Semantria already does this work for you: it trims all output based on the above values, according to the limits you provide.
In short, if you set "named_entities_limit" to 3 in your configuration, Semantria will return the three highest-ranked extracted entities based on this evidence score.
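If you want to rank themes across a whole collection rather than per document, one possible approach is to sum the evidence score per theme and sort the totals. The Python sketch below assumes each document is a dictionary with a "themes" list whose items carry "title" and "evidence" fields; those field names, the helper top_themes, and the choice to sum (rather than average or count) are assumptions for illustration, not a documented Semantria recipe.

```python
from collections import defaultdict


def top_themes(documents, limit=3):
    """Sum evidence per theme title across documents and return the top `limit` themes."""
    totals = defaultdict(float)
    for doc in documents:
        # A document may have no themes at all (missing key or None).
        for theme in doc.get("themes") or []:
            totals[theme["title"]] += theme["evidence"]
    # Highest total evidence first; ties are broken alphabetically by title.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical analysis results for three documents.
docs = [
    {"themes": [{"title": "battery life", "evidence": 7},
                {"title": "screen", "evidence": 4}]},
    {"themes": [{"title": "battery life", "evidence": 5},
                {"title": "price", "evidence": 6}]},
    {"themes": None},
]

for title, evidence in top_themes(docs, limit=2):
    print(title, evidence)
```

The same pattern should work for entities, or for topics if you swap "evidence" for "hitcount" or "strength_score", provided your output uses those field names.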