Create a population - Semilattice

Population model creation is a paid feature on the Play and Launch pricing plans. Visit the account page in the dashboard to upgrade.

Custom populations allow you to make predictions for your specific audience segments, customers, or user groups. This guide shows how to create populations programmatically using the Semilattice API.

Prerequisites

Before creating a population, ensure you have:

Seed data from a target group of users or customers in the required CSV format
A valid Semilattice API key
A paid subscription on the Play or Launch pricing plans
The Semilattice SDK installed

Creating a population

from pathlib import Path
from semilattice import Semilattice

semilattice = Semilattice()

# See simulation options section below
simulation_options = {
    "question_options": [
        {
            "question_number": 1,
            "question_type": "single-choice"
        },
        {
            "question_number": 2,
            "question_type": "multiple-choice",
            "limit": 3
        }
    ]
}

population = semilattice.populations.create(
    name="Developer Survey Q4 2024",
    description="Survey of software developers conducted in Q4 2024",
    seed_data=Path("developer_survey_data.csv"),
    reality_target="Software Developers",
    simulation_engine="answers-1",
    simulation_options=simulation_options,  # see below
    run_test=False
)

Parameters

Parameter	Type	Description
`name`	string	Display name for your population
`description`	string	Detailed description or documentation link
`seed_data`	file/path	CSV file containing QA pair data
`reality_target`	string	Description of the target user profile
`simulation_engine`	string	Engine version (typically `"answers-1"`)
`simulation_options`	object	Question type configuration for proper simulation (see below)
`run_test`	boolean	Whether to automatically test accuracy after creation

Simulation options

The simulation_options parameter configures how questions are interpreted during simulation:

simulation_options = {
    "question_options": [
        {
            "question_number": 1,
            "question_type": "single-choice"
        },
        {
            "question_number": 2,
            "question_type": "multiple-choice",
            "limit": 2
        }
    ]
}

Field Descriptions:

question_number: A 1-based index reference to the column in the seed data file. For example, if the first question column in your CSV is a single-choice question, you would set "question_number": 1.
question_type: An enum describing the data type for that column in the seed data file:
- single-choice: Column cells contain strings representing individual answers (e.g., "Python", "JavaScript")
- multiple-choice: Column cells contain valid lists of strings representing sets of answers (e.g., ["Python", "JavaScript", "Go"])
- open-ended: Column cells contain any string representing free text answers
limit (optional): For multiple-choice questions only, sets the maximum number of choices a respondent can select during simulation.

Requirements:

The "question_options" list must contain exactly one object for each question column in the seed data file
The question_number values should correspond to the actual column positions in your CSV file
Question types must match the actual data format in your seed data

This configuration ensures test simulations run correctly when evaluating population accuracy. For more details on data format requirements, see the Seed Data Requirements guide.
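The requirements above can be checked before any data is uploaded. The following is a minimal sketch, not part of the Semilattice SDK: `build_simulation_options` and `ALLOWED_TYPES` are hypothetical names, and the helper assumes that question columns are numbered consecutively starting at 1, in the order they appear in the CSV.

```python
from typing import Any, Optional

# Question types listed in the field descriptions of this guide.
ALLOWED_TYPES = {"single-choice", "multiple-choice", "open-ended"}


def build_simulation_options(
    question_types: list,
    limits: Optional[dict] = None,
) -> dict:
    """Build a simulation_options dict with one entry per question column.

    question_types[i] describes question column i + 1 (assumed 1-based,
    consecutive numbering). limits maps a question_number to the optional
    "limit" value, which is only accepted for multiple-choice questions.
    """
    limits = limits or {}

    # Reject limits that point at a question number that does not exist.
    valid_numbers = set(range(1, len(question_types) + 1))
    unknown = set(limits) - valid_numbers
    if unknown:
        raise ValueError(f"Limits given for unknown questions: {sorted(unknown)}")

    question_options = []
    for number, question_type in enumerate(question_types, start=1):
        if question_type not in ALLOWED_TYPES:
            raise ValueError(f"Question {number}: unknown type {question_type!r}")
        entry: dict = {"question_number": number, "question_type": question_type}
        if number in limits:
            if question_type != "multiple-choice":
                raise ValueError(
                    f"Question {number}: limit only applies to multiple-choice"
                )
            entry["limit"] = limits[number]
        question_options.append(entry)

    return {"question_options": question_options}


simulation_options = build_simulation_options(
    ["single-choice", "multiple-choice"],
    limits={2: 3},
)
```

The resulting dictionary has the same shape as the hand-written example in this section and can be passed as the `simulation_options` argument when creating the population.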

Population status flow

When you create a population, it goes through several status stages:

1. `Processing`

Initially, your population will have a status of Processing while Semilattice processes and stores your data.

print(f"Population status: {population.data.status}")
# Output: Population status: Processing

2. `Untested`

Once processing completes, the status changes to Untested. At this point:

The population is ready to make predictions
We don’t yet know its estimated accuracy

# Fetch the latest status; repeat this call periodically until processing completes
updated_population = semilattice.populations.get(population.data.id)
print(f"Status: {updated_population.data.status}")
# Output: Status: Untested

3. `Testing`

When you trigger accuracy testing, either automatically or manually (see below), the status changes to Testing while Semilattice evaluates the population’s prediction accuracy.

4. `Tested`

Once testing completes, the status becomes Tested and the population will have accuracy metrics available for review.
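Because a population moves through these statuses asynchronously, it can be convenient to wrap the status check in a small polling loop. The helper below is a sketch under stated assumptions: `wait_for_status` is a hypothetical name, not an SDK function, and the default interval and timeout are arbitrary illustrative values. It takes any zero-argument callable that returns the current status string, so it does not depend on a particular client.

```python
import time
from typing import Callable


def wait_for_status(
    fetch_status: Callable[[], str],
    targets: set,
    interval_seconds: float = 5.0,    # illustrative default, not an SDK value
    timeout_seconds: float = 600.0,   # illustrative default, not an SDK value
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns one of the target statuses.

    Returns the matching status, or raises TimeoutError once
    timeout_seconds have elapsed without reaching a target.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in targets:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Status is still {status!r} after {timeout_seconds} seconds"
            )
        sleep(interval_seconds)
```

With the client from this guide, the callable could be `lambda: semilattice.populations.get(population.data.id).data.status`, with `{"Untested", "Tested"}` as the target set when waiting for processing to finish.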

Testing population accuracy

To understand how accurate your population will be, you need to run a population test.

Option 1: Automatic testing

Set run_test=True when creating the population:

population = semilattice.populations.create(
    name="Developer Survey Q4 2024",
    description="Survey of software developers conducted in Q4 2024",
    seed_data=Path("developer_survey_data.csv"),
    reality_target="Software Developers",
    simulation_engine="answers-1",
    simulation_options=simulation_options,  # as defined in the Simulation options section
    run_test=True  # Automatically test after processing
)

With automatic testing, your population will progress through these statuses: Processing → Untested → Testing → Tested

Option 2: Manual testing

Alternatively, you can trigger testing manually on an Untested population:

# Trigger testing manually
population = semilattice.populations.test(
    population_id=population.data.id
)
# population.data.status will be "Testing"
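Since manual testing applies to an Untested population, a small guard can avoid triggering a test while the population is still processing or already being tested. This is a sketch only: `trigger_test_if_untested` is a hypothetical helper, it uses just the `populations.get` and `populations.test` calls shown in this guide, and it assumes the status strings match the ones printed above exactly.

```python
def trigger_test_if_untested(client, population_id):
    """Start an accuracy test only when the population is Untested.

    client is expected to expose populations.get() and populations.test()
    as used elsewhere in this guide. Returns the response from
    populations.test(), or None when no test was started.
    """
    current = client.populations.get(population_id)
    # Assumption: the status is reported as the exact string "Untested".
    if current.data.status != "Untested":
        return None
    return client.populations.test(population_id=population_id)
```

A call such as `trigger_test_if_untested(semilattice, population.data.id)` returns `None` when the population is in any other status, which makes it safe to call repeatedly.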

Understanding test results

Once testing is complete, your population will have:

Status: Tested
Accuracy metrics: Estimated accuracy scores

Learn more about interpreting these metrics and performance thresholds in the Accuracy Evaluation section.
