What should we advise customers to do to optimize overall service performance?
Switch off any features that are not needed. For instance, if a customer only needs document sentiment, switching off the other features may help, since the engine then has less work to do per document.
Are there any content or configuration best practices that improve Salience 5 performance?
Are there any content or configuration specifics that will slow down or crash the Salience engine (e.g., a specific content length)? In other words, is there anything specific we should tell customers to avoid?
A lot of effort has gone into hardening the engine against poorly formed content and content of significant size. However, the content that can be fed to the Salience Engine is an unbounded universe. Longer text takes longer to process; it is up to the customer to determine what processing time they can accept. Poorly extracted HTML will produce poor results: the simple garbage-in, garbage-out rule. The benchmark content is 1 KB to 4 KB of well-formed news article text. The farther you stray from that representative content, the more performance and quality of results will drop off. Semantria has a default limit of 65,535 characters in UTF-8 encoding.
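As a rough illustration of the guidance above, a customer could check documents on their own side before submitting them. The sketch below is not part of the Salience or Semantria API; the function name `check_document` and the constants are hypothetical. It assumes the 65,535 limit counts characters (the answer above could also be read as a byte limit, so the UTF-8 byte count is reported as well) and that 1 KB means 1,024 bytes.

```python
# Hypothetical client-side pre-check; not part of any Salience/Semantria SDK.
MAX_CHARS = 65_535               # default Semantria character limit cited above
BENCHMARK_MIN_BYTES = 1 * 1024   # assumption: 1 KB = 1,024 bytes
BENCHMARK_MAX_BYTES = 4 * 1024   # assumption: 4 KB = 4,096 bytes


def check_document(text: str) -> dict:
    """Report how a document compares with the stated limit and benchmark size."""
    char_count = len(text)                  # Unicode code points
    byte_count = len(text.encode("utf-8"))  # size once encoded as UTF-8
    return {
        "chars": char_count,
        "utf8_bytes": byte_count,
        # Assumption: the limit is applied to the character count.
        "within_char_limit": char_count <= MAX_CHARS,
        # True when the document is close to the 1 KB to 4 KB benchmark content.
        "within_benchmark_size": BENCHMARK_MIN_BYTES <= byte_count <= BENCHMARK_MAX_BYTES,
    }


if __name__ == "__main__":
    report = check_document("Example news article text. " * 100)
    print(report)
```

A document that fails `within_char_limit` would need to be split or truncated before submission. One that falls outside the benchmark size is still acceptable, but processing time and result quality may differ from the benchmark figures.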