Facets and Attributes
In the Semantria engine, facets are the nouns used most frequently across a collection of documents. Each facet is described by one or more adjectives, which Semantria calls attributes. Facets and attributes are only available in collection processing mode because they represent a bird’s-eye view of the content. The engine extracts a facet when it appears at least twice in combination with a specific attribute.
For example, if “dirty” + “bed” appears at least twice across the collection of documents, this combination of attribute and facet will be extracted.
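The "at least twice" rule can be illustrated with a minimal sketch. It is not Semantria's implementation: it assumes the attribute and facet pairs have already been pulled out of each document, and the names `extract_facets` and `min_count` are hypothetical.

```python
from collections import Counter

def extract_facets(documents, min_count=2):
    """Keep (attribute, facet) pairs seen at least min_count times.

    documents: one list of (attribute, facet) pairs per document.
    This input format is an assumption made for illustration.
    """
    # Count every pair across the whole collection, not per document.
    counts = Counter(pair for doc in documents for pair in doc)
    facets = {}
    for (attribute, facet), n in counts.items():
        if n >= min_count:
            # Group the surviving attributes under their facet.
            facets.setdefault(facet, {})[attribute] = n
    return facets

collection = [
    [("dirty", "bed"), ("friendly", "staff")],
    [("dirty", "bed"), ("small", "room")],
    [("friendly", "staff"), ("clean", "bed")],
]
result = extract_facets(collection)
print(result)
```

Here “dirty” + “bed” and “friendly” + “staff” each occur twice and are kept, while “small” + “room” and “clean” + “bed” occur once and are dropped.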
An entity is any proper noun within the text. By default, Semantria identifies named entities such as a person, company, product, place, or job title, as well as regex (regular expression) entities such as a URL, phone number, or hashtag. Custom entities can also be created to identify specific proper nouns in the output.
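To show what a regular-expression entity looks like in practice, here is a small sketch using Python's standard `re` module. The patterns are deliberately simplified and are assumptions for illustration, not the expressions Semantria uses; `PATTERNS` and `find_regex_entities` are hypothetical names.

```python
import re

# Simplified, illustrative patterns (assumptions, not Semantria's own).
PATTERNS = {
    "url": re.compile(r"https?://[^\s,]+"),
    "hashtag": re.compile(r"(?<!\w)#\w+"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),  # US-style numbers only
}

def find_regex_entities(text):
    """Return (label, matched_text) tuples for every pattern match."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group()))
    return found

sample = "Call 555-123-4567 or visit https://example.com/help and tag #support"
entities = find_regex_entities(sample)
print(entities)
```

Unlike named entities, which depend on recognizing a proper noun, these entities are found purely by their surface form.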
Themes are the main phrases in a text that describe its meaning. Normalization is not available for themes.
Document Themes vs. Collection Themes
Document themes are the themes of a single document, whereas collection themes are the themes common to a collection of documents. Collection themes can be found using the advanced analysis mode in the Excel Add-in.
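The distinction can be sketched as follows. The sketch assumes each document's themes are already available as a list of phrases and treats a theme as "common" when it appears in at least two documents. That threshold, and the names `collection_themes` and `min_documents`, are assumptions for illustration rather than documented Semantria behavior.

```python
from collections import Counter

def collection_themes(document_themes, min_documents=2):
    """Return themes that occur in at least min_documents documents."""
    doc_counts = Counter()
    for themes in document_themes:
        # Use a set so a theme counts once per document.
        doc_counts.update(set(themes))
    return sorted(t for t, n in doc_counts.items() if n >= min_documents)

docs = [
    ["free breakfast", "dirty bed"],      # document themes of document 1
    ["dirty bed", "noisy street"],        # document themes of document 2
    ["free breakfast", "friendly staff"], # document themes of document 3
]
common = collection_themes(docs)
print(common)
```

Each inner list holds the document themes of one document; the returned list holds only the themes shared across the collection.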
Document Themes vs. Entity Themes
Document themes are ideas drawn from the whole text that represent its meaning. Entity themes represent the meaning associated with a specific entity only.
Topics are the results output for either categories or queries.