Population model creation is a paid feature on the Play and Launch pricing plans. Visit the account page in the dashboard to upgrade.
Custom populations allow you to make predictions for your specific audience segments, customers, or user groups. This guide shows how to create populations programmatically using the Semilattice API.

Prerequisites

Before creating a population, ensure you have:
  • Seed data from a target group of users or customers in the required CSV format
  • A valid Semilattice API key
  • A paid subscription on the Play or Launch pricing plans
  • The Semilattice SDK installed

Creating a population

import json
from pathlib import Path
from semilattice import Semilattice

semilattice = Semilattice()

# See population options section below
population_options = json.dumps({
    "question_options": [
        {
            "question_number": 1,
            "question_type": "single-choice"
        },
        # { ... 18 other question option dicts ... }
        {
            "question_number": 20,
            "question_type": "multiple-choice",
            "limit": 3
        }
    ]
})

population = semilattice.populations.create(
    name="Developer Survey Q4 2024",
    description="Survey of software developers conducted in Q4 2024",
    seed_data=Path("developer_survey_data.csv"),
    reality_target="Software Developers",
    simulation_engine="answers-1",
    population_options=population_options,  # see below
    run_test=False
)

Parameters

Parameter            Type        Description
name                 string      Display name for your population
description          string      Detailed description or documentation link
seed_data            file/path   CSV file containing tabular QA pair data
reality_target       string      Description of the target user profile
simulation_engine    string      Simulation engine version (typically "answers-1")
population_options   object      Tells the API about the data in your seed data file
run_test             boolean     Whether to automatically test the population after creation

Population options

Population options tell the API how to process your seed data. Semilattice supports single-choice, multiple-choice, and open-ended question types in seed data, so the API needs to know which of these types each column in the dataset represents.
population_options = json.dumps({
    "question_options": [
        {
            "question_number": 1,
            "question_type": "single-choice"
        },
        {
            "question_number": 2,
            "question_type": "multiple-choice",
            "limit": 2
        },
        # { ... question option dicts for all columns ... }
    ]
})

Population options object

The population options object must be a valid JSON object containing a single question_options key. The value of question_options should be a valid JSON list containing one object for each question column in the dataset. The sim_id column does not need to be referenced.

Question options field descriptions:
  • question_number: A 1-based index of the question column in the seed data file that these question options describe. For example, when setting question options for the 3rd question column in the dataset, which is the 4th column in the CSV after the sim_id column, the question_number should be 3.
  • question_type: An enum describing the question type for that question column. Must be one of: single-choice, multiple-choice, open-ended.
  • limit (optional): For multiple-choice questions only, this sets the maximum number of choices respondents were allowed to select.
Requirements:
  • The question_options list must contain one object for each question column in the seed data file.
  • The question_number values should be integers giving the 1-based position of each question column, counted without the sim_id column. The order of the question option objects doesn’t matter, but we suggest sequential ordering to avoid confusion.
  • The question_type specified for a column must match the actual data for that column in the CSV.
This configuration ensures test simulations run correctly when evaluating population accuracy. For more details on data format requirements, see Seed Data Requirements.
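
The requirements above can be checked locally before uploading. The following is a minimal sketch, not part of the Semilattice SDK: build_population_options and its (question_type, limit) input format are hypothetical helpers introduced here for illustration. It assigns sequential 1-based question numbers, rejects unknown question types, and only accepts limit on multiple-choice questions.

```python
import json

VALID_QUESTION_TYPES = {"single-choice", "multiple-choice", "open-ended"}


def build_population_options(questions):
    """Build the population_options JSON string.

    `questions` holds one (question_type, limit) tuple per question column,
    in CSV order and excluding the sim_id column. Use limit=None when no
    selection limit applies. Illustrative helper, not an SDK function.
    """
    question_options = []
    for number, (question_type, limit) in enumerate(questions, start=1):
        if question_type not in VALID_QUESTION_TYPES:
            raise ValueError(f"Question {number}: unknown type {question_type!r}")
        option = {"question_number": number, "question_type": question_type}
        if limit is not None:
            if question_type != "multiple-choice":
                raise ValueError(
                    f"Question {number}: limit only applies to multiple-choice"
                )
            option["limit"] = limit
        question_options.append(option)
    return json.dumps({"question_options": question_options})


# Example: three question columns following the sim_id column.
population_options = build_population_options([
    ("single-choice", None),
    ("multiple-choice", 2),
    ("open-ended", None),
])
```

The resulting string can be passed as the population_options argument of semilattice.populations.create. This helper cannot confirm that each declared type matches the actual data in the CSV, so that requirement still has to be checked against the seed data itself.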

Population status flow

When you create a population, it goes through several status stages:

1. Processing

Initially, your population will have a status of Processing while Semilattice processes and stores your data.
print(f"Population status: {population.data.status}")
# Output: Population status: Processing

2. Untested

Once processing completes, the status changes to Untested. At this point:
  • The population is ready to make predictions
  • We don’t yet know its estimated accuracy
# Check status periodically
updated_population = semilattice.populations.get(population.data.id)
print(f"Status: {updated_population.data.status}")
# Output: Status: Untested

3. Testing

When you trigger accuracy testing, either automatically or manually (see below), the status changes to Testing while Semilattice evaluates the population’s prediction accuracy.

4. Tested

Once testing completes, the status becomes Tested and the population will have accuracy metrics available for review.
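
Because a population moves through these statuses asynchronously, a script usually has to wait for a given status before continuing. Below is a minimal polling sketch, assuming the status strings are exactly those listed above. wait_for_status is a hypothetical helper, not an SDK function, and the interval and timeout defaults are arbitrary choices. It accepts any zero-argument callable that returns the current status, so the example runs with a stand-in status source rather than a real API call.

```python
import time


def wait_for_status(get_status, target, interval=5.0, timeout=600.0):
    """Poll get_status() until it returns `target` or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Stand-in status source so the sketch runs without calling the API.
_fake_statuses = iter(["Processing", "Processing", "Untested"])
final_status = wait_for_status(
    lambda: next(_fake_statuses), "Untested", interval=0.01
)
print(f"Status: {final_status}")
```

With the SDK, the callable could wrap the lookup shown earlier, for example lambda: semilattice.populations.get(population.data.id).data.status, with "Untested" or "Tested" as the target.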

Testing population accuracy

To understand how accurate your population will be, you need to run a population test.

Option 1: Automatic testing

Set run_test=True when creating the population:
population = semilattice.populations.create(
    name="Developer Survey Q4 2024",
    description="Survey of software developers conducted in Q4 2024",
    seed_data=Path("developer_survey_data.csv"),
    reality_target="Software Developers",
    simulation_engine="answers-1",
    population_options=population_options,
    run_test=True  # Automatically test after processing
)
With automatic testing, your population will progress through these statuses: Processing → Untested → Testing → Tested

Option 2: Manual testing

Alternatively, you can trigger testing manually on an Untested population:
# Trigger testing manually
population = semilattice.populations.test(
    population_id=population.data.id
)
# population.data.status will be "Testing"

Understanding test results

Once testing is complete, your population will have:
  • Status: Tested
  • Accuracy metrics: Estimated accuracy scores
Learn more about interpreting these metrics and performance thresholds in the Accuracy Evaluation section.