Answer simulation lets you predict how a specific population would answer new questions.

Basic usage

from semilattice import Semilattice

semilattice = Semilattice()

result = semilattice.answers.simulate(
    population_id="population-id",
    answers={
        "question": "Which is worse?",
        "question_options": {"question_type": "single-choice"},
        "answer_options": ["Tech debt", "Confusing error messages"]
    }
)

Question types

Single-choice questions

Simulates respondents selecting exactly one option:
result = semilattice.answers.simulate(
    population_id="population-id",
    answers={
        "question": "What's your preferred IDE?",
        "question_options": {"question_type": "single-choice"},
        "answer_options": ["VS Code", "IntelliJ", "Vim", "Sublime Text"]
    }
)

Multiple-choice questions

Simulates respondents selecting one or more options:
result = semilattice.answers.simulate(
    population_id="population-id",
    answers={
        "question": "Which programming languages do you use regularly?",
        "question_options": {"question_type": "multiple-choice"},
        "answer_options": ["Python", "JavaScript", "Java", "Go", "Rust"]
    }
)

Handling async results

Simulations run asynchronously. The initial response comes back with a status of "Queued", and you poll the answer until the status becomes "Predicted":

Initial response

{
    "data": [
        {
            "id": "84a92e29-54e6-4a60-862d-26cd2a78421e",
            "status": "Queued",
            "question": "Which is worse?",
            "answer_options": ["Tech debt", "Confusing error messages"],
            "simulated_answer_percentages": null,
            "ground_answer_percentages": null,
            "accuracy": null,
            // ... other fields
        }
    ]
}

Polling for results

import time

answer_id = result.data[0].id

# Re-fetch the answer until the simulation finishes.
# In production, add a timeout or retry cap so a failed run can't loop forever.
while result.data[0].status != "Predicted":
    time.sleep(1)
    result = semilattice.answers.get(answer_id)

print("Simulation complete!")
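The polling loop above can be wrapped in a reusable helper with a deadline, so a simulation that never reaches "Predicted" fails loudly instead of blocking forever. This is an illustrative sketch, not part of the SDK: `wait_for_prediction` and the `get_answer` callable are names introduced here, and in real use you would pass something like `lambda aid: semilattice.answers.get(aid).data[0]`.

```python
import time

def wait_for_prediction(get_answer, answer_id, timeout=120.0, interval=1.0):
    """Poll get_answer(answer_id) until its status is "Predicted".

    get_answer is any callable that returns an object with a `status`
    attribute. Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        answer = get_answer(answer_id)
        if answer.status == "Predicted":
            return answer
        if time.monotonic() > deadline:
            raise TimeoutError(
                f"answer {answer_id} not predicted within {timeout}s"
            )
        time.sleep(interval)
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted mid-poll.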

Understanding results

The simulated_answer_percentages field shows the predicted distribution across your answer options:
{
    "data": {
        "id": "84a92e29-54e6-4a60-862d-26cd2a78421e",
        "status": "Predicted",
        "question": "Which is worse?",
        "answer_options": ["Tech debt", "Confusing error messages"],
        "simulated_answer_percentages": {
            "Tech debt": 0.5488,
            "Confusing error messages": 0.4512
        },
        "ground_answer_percentages": null,
        "accuracy": null
    }
}
This means:
  • 54.88% of the population would choose “Tech debt”
  • 45.12% would choose “Confusing error messages”
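Since `simulated_answer_percentages` is a plain mapping of option to share, it can be post-processed like any dict. For example, a small helper (illustrative, not part of the SDK) that picks the most likely options:

```python
def top_predictions(percentages, n=1):
    """Return the n answer options with the highest simulated share,
    as (option, share) pairs sorted from most to least likely."""
    return sorted(percentages.items(), key=lambda kv: kv[1], reverse=True)[:n]

shares = {"Tech debt": 0.5488, "Confusing error messages": 0.4512}
top_predictions(shares)  # [("Tech debt", 0.5488)]
```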

Important notes for simulated answers

No ground truth data

For simulated answers, these fields are always null:
  • ground_answer_percentages: We don’t know the real-world distribution
  • ground_answer_counts: We don’t have actual response counts
  • accuracy: We can’t measure accuracy without ground truth
  • mean_absolute_error: No ground truth to compare against
  • mean_squared_error: No ground truth to compare against
  • normalised_kullback_leibler_divergence: No ground truth to compare against
  • kullback_leibler_divergence: No ground truth to compare against
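These accuracy fields are only populated when a question is tested against real response data. To make concrete what a metric like `mean_absolute_error` measures, here is a minimal sketch of the idea (not the SDK's implementation): the average absolute gap between the simulated and ground distributions, taken over every answer option.

```python
def mean_absolute_error(simulated, ground):
    """Average absolute difference between simulated and ground
    answer shares, over the union of answer options."""
    options = simulated.keys() | ground.keys()
    return sum(
        abs(simulated.get(o, 0.0) - ground.get(o, 0.0)) for o in options
    ) / len(options)

mean_absolute_error(
    {"Tech debt": 0.55, "Confusing error messages": 0.45},
    {"Tech debt": 0.60, "Confusing error messages": 0.40},
)  # ≈ 0.05
```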

Best practices

  • Choose the right population: Select populations built on data similar to your target audience and question topics.
  • Review population metrics: Check the population’s accuracy scores before relying on predictions for critical decisions.
  • Use clear questions: Write unambiguous questions with distinct answer options to get reliable predictions.
  • Validate with benchmarks: Test your population with known questions before using it for important predictions.
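The benchmark validation above can be as simple as comparing a simulated distribution against a known real-world one, option by option. A hedged sketch (the helper name and the 10% tolerance are illustrative choices, not SDK features):

```python
def passes_benchmark(simulated, known, tolerance=0.10):
    """True if every answer option's simulated share is within
    `tolerance` of the known real-world share."""
    options = simulated.keys() | known.keys()
    return all(
        abs(simulated.get(o, 0.0) - known.get(o, 0.0)) <= tolerance
        for o in options
    )

passes_benchmark({"Yes": 0.62, "No": 0.38}, {"Yes": 0.58, "No": 0.42})  # True
```

If a population fails this check on questions with known answers, treat its predictions on new questions with corresponding caution.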