Create a population
Create custom populations from arbitrary user data
Population model creation is a paid feature on the Play and Launch pricing plans. Visit the account page in the dashboard to upgrade.
Custom populations allow you to make predictions for your specific audience segments, customers, or user groups. This guide shows how to create populations programmatically using the Semilattice API.
Prerequisites
Before creating a population, ensure you have:
- Seed data from a target group of users or customers in the required CSV format
- A valid Semilattice API key
- A paid subscription on the Play or Launch pricing plans
- The Semilattice SDK installed
Creating a population
Parameters
Parameter | Type | Description |
---|---|---|
name | string | Display name for your population |
description | string | Detailed description or documentation link |
seed_data | file/path | CSV file containing QA pair data |
reality_target | string | Description of the target user profile |
simulation_engine | string | Engine version (typically “answers-1”) |
simulation_options | object | Question type configuration for proper simulation (see below) |
run_test | boolean | Whether to automatically test accuracy after creation |
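As a rough illustration of how the parameters above fit together, the sketch below collects them into one object before they are passed to a create call. The PopulationRequest class, its to_kwargs method, the example values, and the default values are all hypothetical and are not part of the Semilattice SDK. Only the field names come from the table.

```python
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Any


@dataclass
class PopulationRequest:
    """Hypothetical container mirroring the parameter table (not an SDK class)."""

    name: str
    description: str
    seed_data: Path
    reality_target: str
    simulation_options: dict[str, Any]
    simulation_engine: str = "answers-1"  # the table calls this the typical value
    run_test: bool = False  # assumed default, not confirmed by the source

    def to_kwargs(self) -> dict[str, Any]:
        """Run basic local checks and return the parameters as a plain dict."""
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if self.seed_data.suffix.lower() != ".csv":
            raise ValueError("seed_data must point to a CSV file")
        return asdict(self)


# Example values are made up for illustration.
request = PopulationRequest(
    name="Developer survey respondents",
    description="Respondents to an internal developer survey",
    seed_data=Path("seed_data.csv"),
    reality_target="Professional software developers",
    simulation_options={
        "question_options": [
            {"question_number": 1, "question_type": "single-choice"},
        ]
    },
)
kwargs = request.to_kwargs()
print(sorted(kwargs))
```

Checking the values locally first means a missing name or a wrong file type shows up before any request is sent.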
Simulation options
The simulation_options parameter configures how questions are interpreted during simulation.
Field Descriptions:
- question_number: A 1-based index reference to the column in the seed data file. For example, if the first question column in your CSV is a single-choice question, you would set "question_number": 1.
- question_type: An enum describing the data type for that column in the seed data file:
  - single-choice: Column cells contain strings representing individual answers (e.g., “Python”, “JavaScript”)
  - multiple-choice: Column cells contain valid lists of strings representing sets of answers (e.g., [“Python”, “JavaScript”, “Go”])
  - open-ended: Column cells contain any string representing free text answers
- limit (optional): For multiple-choice questions only, sets the maximum number of choices a respondent can select during simulation.
Requirements:
- There must be one object for each question column in the seed data file in the “question_options” list field
- The question_number values should correspond to the actual column positions in your CSV file
- Question types must match the actual data format in your seed data
This configuration ensures test simulations run correctly when evaluating population accuracy. For more details on data format requirements, see the Seed Data Requirements guide.
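The field descriptions and requirements above can be checked locally before a population is created. The following is a minimal sketch. It assumes simulation_options is an object with a single “question_options” list, and that limit must be a positive integer. The validate_simulation_options helper is hypothetical and not part of the Semilattice SDK.

```python
VALID_TYPES = {"single-choice", "multiple-choice", "open-ended"}


def validate_simulation_options(options: dict, question_column_count: int) -> list:
    """Return a list of problems; an empty list means the options look consistent."""
    problems = []
    entries = options.get("question_options", [])

    # Requirement: exactly one entry per question column, numbered from 1.
    seen = [entry.get("question_number") for entry in entries]
    expected = set(range(1, question_column_count + 1))
    if len(seen) != len(expected) or set(seen) != expected:
        problems.append(
            f"expected one entry for each of columns 1..{question_column_count}, got {seen}"
        )

    for entry in entries:
        number = entry.get("question_number")
        question_type = entry.get("question_type")
        if question_type not in VALID_TYPES:
            problems.append(f"question {number}: unknown question_type {question_type!r}")
        if "limit" in entry:
            # limit is documented for multiple-choice questions only.
            if question_type != "multiple-choice":
                problems.append(f"question {number}: limit only applies to multiple-choice")
            elif not isinstance(entry["limit"], int) or entry["limit"] < 1:
                problems.append(f"question {number}: limit must be a positive integer")
    return problems


# Example configuration for a seed file with three question columns.
simulation_options = {
    "question_options": [
        {"question_number": 1, "question_type": "single-choice"},
        {"question_number": 2, "question_type": "multiple-choice", "limit": 3},
        {"question_number": 3, "question_type": "open-ended"},
    ]
}
print(validate_simulation_options(simulation_options, question_column_count=3))
```

For the example configuration the helper reports no problems. Removing an entry or setting limit on a single-choice question produces a message for each violation.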
Population status flow
When you create a population, it goes through several status stages:
1. Processing
Initially, your population will have a status of Processing while Semilattice processes and stores your data.
2. Untested
Once processing completes, the status changes to Untested. At this point:
- The population is ready to make predictions
- We don’t yet know its estimated accuracy
3. Testing
When you trigger accuracy testing, either automatically or manually (see below), the status changes to Testing while Semilattice evaluates the population’s prediction accuracy.
4. Tested
Once testing completes, the status becomes Tested and the population will have accuracy metrics available for review.
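The four stages can be modelled as a simple linear state machine, which is handy when client code needs to reason about where a population is in its lifecycle. This is an illustrative sketch only. The PopulationStatus enum and the helper functions are hypothetical, and the exact status strings returned by the API (including their capitalisation) are assumed to match the names used in this guide.

```python
from enum import Enum
from typing import Optional


class PopulationStatus(Enum):
    """Status stages described in this guide (string values are assumed)."""

    PROCESSING = "Processing"
    UNTESTED = "Untested"
    TESTING = "Testing"
    TESTED = "Tested"


# Each status and the stage that follows it; Tested is the final stage.
_NEXT = {
    PopulationStatus.PROCESSING: PopulationStatus.UNTESTED,
    PopulationStatus.UNTESTED: PopulationStatus.TESTING,
    PopulationStatus.TESTING: PopulationStatus.TESTED,
}


def next_status(status: PopulationStatus) -> Optional[PopulationStatus]:
    """Return the following stage, or None once the population is Tested."""
    return _NEXT.get(status)


def has_accuracy_metrics(status: PopulationStatus) -> bool:
    """Accuracy metrics are only available once testing has completed."""
    return status is PopulationStatus.TESTED


# Walk the full flow from the initial status.
flow = []
current: Optional[PopulationStatus] = PopulationStatus.PROCESSING
while current is not None:
    flow.append(current.value)
    current = next_status(current)
print(" -> ".join(flow))
```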
Testing population accuracy
To understand how accurate your population will be, you need to run a population test.
Option 1: Automatic testing
Set run_test=True when creating the population.
With automatic testing, your population will progress through these statuses:
Processing → Untested → Testing → Tested
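Because the population moves through these statuses asynchronously, client code typically has to poll until testing has finished. The sketch below shows one way to do that without assuming any particular SDK call. The fetch_status argument is a hypothetical callable that you would implement to return the population's current status string, and the interval and timeout values are arbitrary.

```python
import time
from typing import Callable, List


def wait_until_tested(
    fetch_status: Callable[[], str],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the status is Tested and return the distinct statuses seen, in order."""
    observed: List[str] = []
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        # Record a status only when it changes, to keep the history short.
        if not observed or observed[-1] != status:
            observed.append(status)
        if status == "Tested":
            return observed
        if time.monotonic() >= deadline:
            raise TimeoutError(f"population still {status!r} after {timeout_seconds}s")
        sleep(interval_seconds)


# Usage with a stand-in status source instead of a real API call.
fake_statuses = iter(["Processing", "Processing", "Untested", "Testing", "Tested"])
history = wait_until_tested(lambda: next(fake_statuses), sleep=lambda _seconds: None)
print(history)
```

Passing the sleep function as an argument keeps the helper easy to exercise in tests, as the stand-in usage shows.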
Option 2: Manual testing
Alternatively, you can trigger testing manually on an Untested population.
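Since manual testing applies to Untested populations, it can help to guard the trigger with a status check. A minimal sketch, assuming you supply the actual test request yourself: trigger_test is a hypothetical callable standing in for whatever SDK or HTTP call starts the test, and the status string is assumed to match the name used in this guide.

```python
from typing import Callable


def trigger_test_if_untested(status: str, trigger_test: Callable[[], None]) -> bool:
    """Start a test only for an Untested population; return True if it was started."""
    if status != "Untested":
        # Processing, Testing, and Tested populations are left alone.
        return False
    trigger_test()
    return True


# Usage with a stand-in trigger that just records that it was called.
calls = []
started = trigger_test_if_untested("Untested", lambda: calls.append("test requested"))
skipped = trigger_test_if_untested("Processing", lambda: calls.append("test requested"))
print(started, skipped, calls)
```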
Understanding test results
Once testing is complete, your population will have:
- Status: Tested
- Accuracy metrics: Estimated accuracy scores
Learn more about interpreting these metrics and performance thresholds in the Accuracy Evaluation section.