Why test?
The tests feature lets you benchmark a population model’s accuracy on specific types of questions. For example, you might benchmark a model’s ability to predict user reactions to marketing messages, generating accuracy estimates for that specific use case. While the population evaluation feature evaluates a population model against all of its seed data, the tests feature lets you evaluate a model against separate data that is not part of the seed data.
Key concepts
Ground truth data
Testing requires the target population’s true answer distribution for each test question in order to calculate evaluation scores. This data is used only to calculate evaluation scores; it is not seen by the population model or simulation engine. Accurate ground truth data is crucial to making tests informative and useful.
Question types
The API supports testing two types of questions:
Single-Choice
Simulates respondents selecting exactly one option from a list of choices.
Multiple-Choice
Simulates respondents selecting multiple options from a list of choices.
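The two question types imply different shapes for the ground truth answer distribution. The sketch below is illustrative only: the dictionary layout, option labels, shares, and helper function names are assumptions for this example and are not the API's request format. It shows that single-choice shares must sum to 1, because each respondent picks exactly one option, while multiple-choice shares are independent per option and can sum to more than 1.

```python
from math import isclose

# Hypothetical ground truth data: each option maps to the share of the
# target population that selected it. This layout is an assumption made
# for illustration, not the API's payload format.
single_choice_truth = {
    "Very likely": 0.25,
    "Somewhat likely": 0.5,
    "Not likely": 0.25,
}
multiple_choice_truth = {
    "Email": 0.6,
    "SMS": 0.3,
    "Push notification": 0.45,
}


def is_valid_single_choice(dist, tol=1e-9):
    """Each respondent picks exactly one option, so shares must sum to 1."""
    in_range = all(0.0 <= share <= 1.0 for share in dist.values())
    return in_range and isclose(sum(dist.values()), 1.0, abs_tol=tol)


def is_valid_multiple_choice(dist):
    """Respondents may pick several options, so each share only needs to lie in [0, 1]."""
    return all(0.0 <= share <= 1.0 for share in dist.values())


print(is_valid_single_choice(single_choice_truth))
print(is_valid_multiple_choice(multiple_choice_truth))
print(is_valid_single_choice(multiple_choice_truth))
```

A sanity check like this before submitting ground truth data can catch a distribution entered under the wrong question type, which would otherwise make the resulting evaluation scores misleading.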