Recoding and post-field data wrangling

However much you prepare upfront, there are always times after fieldwork when you need to make changes. Maybe a new grouping is needed, a typo slipped through, or you discover that a question needs to be bucketed differently for analysis.

To handle this, a standard recode script is generated automatically based on your survey. This script includes placeholders for getting values, recoding responses, marking poor quality respondents, and creating new calculated variables. You can then edit and extend this script as you need.

To open it, click the data prep icon on the reporting page.

This will display the standard recode script for your survey, which will:

Read the value of each variable
Implement the recodes defined in your survey
Provide example code for creating calculated variables and marking poor quality respondents

Each of the functions is explained fully below.

`recode(reporting_id, recodes)`

Why you'd use it:

Clean up messy text input (e.g., standardizing "Nissan" vs "nissan")
Collapse multiple options into grouped buckets (e.g., "18-24" and "25-34" into "Young")

Example:

r.recode("Q1", {"Male": "M", "Female": "F"})
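To illustrate the second use case, the sketch below collapses age bands into broader groups. The dictionary has the same shape as the `recodes` argument above. The band labels and the `apply_recodes` helper are hypothetical and only mimic the effect locally, assuming that unmapped answers pass through unchanged; the real `recode` may behave differently.

```python
# Hypothetical age bands mapped to broader groups, in the same
# {original: replacement} shape as the recodes argument.
age_recodes = {
    "18-24": "Young",
    "25-34": "Young",
    "35-44": "Older",
    "45+": "Older",
}


def apply_recodes(answer, recodes):
    """Local stand-in: return the recoded answer, or the answer itself if unmapped (an assumption)."""
    return recodes.get(answer, answer)


answers = ["18-24", "25-34", "45+", "Prefer not to say"]
recoded = [apply_recodes(a, age_recodes) for a in answers]
print(recoded)
```

In the recode script itself, a mapping like this would be passed as the second argument, for example `r.recode("Q_age", age_recodes)`, where `"Q_age"` is a hypothetical reporting ID.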

`mark_poor_quality(respondent_ids)`

Why you'd use it:

Remove respondents you've decided don't qualify based on manual analysis of their responses.
You can find respondent IDs in the raw data downloads.

Example:

r.mark_poor_quality(["respondent_1", "respondent_2"])
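If the list of IDs is long, you may prefer to build it from your review notes rather than type it by hand. This is a minimal sketch run outside the recode script, assuming a raw data download saved as CSV with a hypothetical `respondent_id` column and a `flag` column you added during manual review. Your export's column names will likely differ.

```python
import csv
import io

# Stand-in for a reviewed raw data download; both column names are hypothetical.
reviewed_csv = io.StringIO(
    "respondent_id,flag\n"
    "respondent_1,bad\n"
    "respondent_2,\n"
    "respondent_3,Bad\n"
)

# Keep only the rows that were flagged as bad during manual review.
poor_quality_ids = [
    row["respondent_id"]
    for row in csv.DictReader(reviewed_csv)
    if row["flag"].strip().lower() == "bad"
]
print(poor_quality_ids)
```

The resulting list can then be pasted into the `mark_poor_quality` call in your recode script.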

`store_value(name, value)`

Why you'd use it:

Create a new derived field that wasn't captured directly (e.g., "is_young", "heavy_user")
Prepare new reporting variables without changing the original questions

Example:

age = r.get_value("Q_age")  # "Q_age" is a hypothetical reporting ID; assumes a numeric answer
r.store_value("is_young", 1 if age < 35 else 0)

`get_values(reporting_id)`

Why you'd use it:

Retrieve all selected responses for a multi-select question to use the results in a new calculated variable.

Example:

selected_brands = r.get_values("Q_brands")

`get_value(reporting_id)`

Why you'd use it:

Retrieve the single answer from a question where only one choice is allowed, to use the results in a new calculated variable.

Example:

gender = r.get_value("Q_gender")
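The getters are most useful when combined with `store_value`. The sketch below shows one way that might look. `FakeRespondent` is a local stand-in so the example runs on its own; it is not the real `r` object. The reporting IDs and answers are hypothetical, and the example assumes `get_value` returns the answer label and `get_values` returns a list of labels.

```python
class FakeRespondent:
    """Local stand-in for the script's `r` object; NOT the real implementation."""

    def __init__(self, answers):
        self._answers = answers
        self.stored = {}

    def get_value(self, reporting_id):
        # Assumed to return the single selected answer.
        return self._answers[reporting_id]

    def get_values(self, reporting_id):
        # Assumed to return every selected answer as a list.
        return list(self._answers[reporting_id])

    def store_value(self, name, value):
        self.stored[name] = value


# Hypothetical reporting IDs and answers for one respondent.
r = FakeRespondent({"Q_gender": "Female", "Q_brands": ["Nissan", "Toyota", "Ford"]})

gender = r.get_value("Q_gender")
selected_brands = r.get_values("Q_brands")

# Derived variables: how many brands were chosen, and a segment flag.
r.store_value("brand_count", len(selected_brands))
r.store_value(
    "female_nissan",
    1 if gender == "Female" and "Nissan" in selected_brands else 0,
)
print(r.stored)
```

Only the last few lines would appear in a real recode script; the class exists purely to make the sketch self-contained.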

Notes

recode() replaces or maps answers without modifying the original capture
mark_poor_quality() updates the respondent's status in the final data output
store_value() lets you add new derived variables without editing the original survey
Always check that your recodes use the correct reporting IDs. Typos will raise errors with suggestions.
You can customize the generated recoding script as much as you need to fit your analysis plan
