1. Overview
The SPSS (.sav) format presents data in the same wide structure as the Wide format export (one row per respondent, one column per question) but also includes additional metadata for each variable. This makes it the preferred choice for researchers using SPSS, Stata, or other statistical tools that can read .sav files.
2. File Structure & Layout
Each row corresponds to one respondent.
Each column corresponds to one question or metadata field.
Variable names follow the same convention as Wide format (e.g., V001_respondent_id, V002_Recent_Restaurant_Visit).
Example (first 5 columns):
V001_respondent_id                   | V002_Recent_Restaurant_Visit | V003_Age | V004_Gender | V005_Ethnicity
0097ac15-868c-6608-25fc-c0fe2cd884a8 | Yes                          | 35       | Female      | White
012a5d8b-9dc4-2132-782d-73742be4088f | Yes                          | 21       | Female      | White
022b6b43-6304-2dc5-0830-10d98bd7dee3 | Yes                          | 18       | Male        | White
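The naming convention can also be handled programmatically. The following is a minimal Python sketch, assuming every variable name consists of "V", a run of digits, an underscore, and a descriptive name, as in the example columns. The helper name split_variable_name is hypothetical and not part of the export.

```python
import re

# Assumed pattern, inferred from the example names:
# "V" + digits + "_" + descriptive name.
_VARIABLE_NAME = re.compile(r"^(V\d+)_(.+)$")


def split_variable_name(name):
    """Split 'V003_Age' into ('V003', 'Age'); return None if the name does not match."""
    match = _VARIABLE_NAME.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


columns = ["V001_respondent_id", "V002_Recent_Restaurant_Visit", "V003_Age"]
for column in columns:
    print(split_variable_name(column))
```

Names that do not follow the assumed pattern return None rather than raising an error, so unexpected columns can be reviewed separately.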
3. Key Columns
V001_respondent_id – Unique identifier for each respondent.
Question columns (V###_…) – Named as in the Wide format, with variable labels and value labels attached.
Weight column (e.g., V171_weight) – Statistical weight for the respondent.
Timing columns (e.g., V172_start_time, V173_end_time) – Survey start and completion timestamps.
4. Data Representation
Single-choice questions
Stored as one column per question with the selected answer recorded.
Multi-choice questions
Each option is represented as a separate column. Values indicate whether the option was selected, and may also include rank order (e.g., -1 = not selected, 1 = selected first, 2 = selected second, etc.).
The export file does not define multiple response sets, so consider defining them in SPSS (the MRSETS command) to simplify reporting.
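The rank encoding for multi-choice questions can also be decoded outside SPSS. The following Python sketch is one possible approach, assuming -1 means "not selected" and positive values give the selection order, as in the example above. The column names and the helper selected_options are hypothetical.

```python
import math

NOT_SELECTED = -1  # code the example above uses for an unselected option


def _is_blank(value):
    """True for empty cells, which may arrive as None or NaN depending on the reader."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def selected_options(row, option_columns):
    """Return the selected option columns for one respondent, ordered by rank value."""
    ranked = []
    for column in option_columns:
        value = row.get(column)
        if _is_blank(value) or value == NOT_SELECTED:
            continue
        ranked.append((value, column))
    ranked.sort()
    return [column for _, column in ranked]


# Hypothetical option columns belonging to one multi-choice question.
option_columns = ["V010_Cuisine_Italian", "V011_Cuisine_Thai", "V012_Cuisine_Mexican"]
row = {"V010_Cuisine_Italian": 2, "V011_Cuisine_Thai": -1, "V012_Cuisine_Mexican": 1}

print(selected_options(row, option_columns))
```

For this row, the option ranked 1 is listed before the option ranked 2, and the unselected option is dropped.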
Numeric questions
Stored directly as numeric values.
Open-end questions
Stored as free-text responses.
⚠️ Note: Recoded variables and coded open ends are not included in the SPSS format (they are available in the Long format).
5. Metadata Provided
The SPSS format includes additional metadata that makes analysis easier:
Variable labels – Full question text (e.g., “What is your gender?”).
Value labels – Mappings of codes to human-readable labels (e.g., 1=Male, 2=Female, 3=Non-Binary).
Measurement levels – Nominal, ordinal, scale, etc., depending on the question type.
Missing value definitions – Explicitly marked missing values (e.g., -1 = Not selected).
Variable types – Numeric, string, date/time.
This metadata ensures the dataset is analysis-ready in SPSS and other statistical software.
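One way to inspect this metadata outside SPSS is the third-party pyreadstat package for Python. The sketch below assumes pyreadstat is installed. The helper names load_sav, describe_variable, and print_summary are illustrative, and the sample dictionaries reuse the gender example from the list above rather than real export contents.

```python
def load_sav(path):
    """Read a .sav file and return (DataFrame, metadata). Requires pyreadstat."""
    import pyreadstat  # imported lazily so describe_variable works without it

    # user_missing=True keeps user-defined missing codes as values
    # instead of converting them to NaN.
    return pyreadstat.read_sav(path, user_missing=True)


def describe_variable(name, labels, value_labels, measures):
    """Build a one-line summary of a variable from plain metadata dictionaries."""
    label = labels.get(name) or ""
    measure = measures.get(name, "unknown")
    codes = value_labels.get(name, {})
    decoded = ", ".join(f"{code}={text}" for code, text in codes.items())
    return f"{name} [{measure}] {label} ({decoded})"


def print_summary(path, limit=5):
    """Print a summary of the first few variables in a .sav file."""
    _, meta = load_sav(path)
    for name in meta.column_names[:limit]:
        print(describe_variable(
            name,
            meta.column_names_to_labels,  # variable labels
            meta.variable_value_labels,   # value labels per variable
            meta.variable_measure,        # measurement levels
        ))


# Stand-in metadata so the helper can be tried without a file.
labels = {"V004_Gender": "What is your gender?"}
value_labels = {"V004_Gender": {1: "Male", 2: "Female", 3: "Non-Binary"}}
measures = {"V004_Gender": "nominal"}
print(describe_variable("V004_Gender", labels, value_labels, measures))
```

Calling print_summary with the path to an exported file prints the same kind of summary for its first five variables.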
6. Missing & Special Values
-1 typically denotes unselected or non-applicable options.
Empty cells may represent skipped questions.
“Prefer not to say” is stored as a regular answer category, not as a missing value.
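These conventions can be made explicit before analysis. The following Python sketch is illustrative only: it assumes -1 always means "not selected" and that an empty cell always means "skipped", although the wording above ("typically", "may") leaves room for exceptions. The helper classify_cell is hypothetical.

```python
import math

NOT_SELECTED = -1  # assumed meaning, per the list above


def classify_cell(value):
    """Label a raw cell as 'skipped', 'not_selected', or 'answered'."""
    if value is None or value == "":
        return "skipped"
    if isinstance(value, float) and math.isnan(value):
        return "skipped"  # empty numeric cells are often read as NaN
    if value == NOT_SELECTED:
        return "not_selected"
    return "answered"  # includes "Prefer not to say"


for cell in [None, "", -1, 3, "Prefer not to say"]:
    print(repr(cell), "->", classify_cell(cell))
```

Keeping "Prefer not to say" in the answered group mirrors the export. Exclude it explicitly if an analysis should treat it as missing.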
7. Weighting
Apply the weight column (e.g., V171_weight) in your analysis so that results reflect the target population.
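As a rough illustration of what applying the weight means, the Python sketch below computes a weighted share for one answer. The rows and weight values are made up for the example, and weighted_share is a hypothetical helper. In SPSS itself, weighting is normally switched on with the WEIGHT BY command.

```python
def weighted_share(rows, answer_column, answer, weight_column="V171_weight"):
    """Return the weighted proportion of respondents who gave the specified answer."""
    total = 0.0
    matched = 0.0
    for row in rows:
        weight = row[weight_column]
        total += weight
        if row[answer_column] == answer:
            matched += weight
    return matched / total if total else 0.0


# Made-up respondents; real weights come from the export's weight column.
rows = [
    {"V004_Gender": "Female", "V171_weight": 1.5},
    {"V004_Gender": "Female", "V171_weight": 0.5},
    {"V004_Gender": "Male", "V171_weight": 2.0},
]

print(weighted_share(rows, "V004_Gender", "Female"))
```

Two of the three sample respondents answered "Female", but their weights sum to 2.0 out of 4.0, so the weighted share is one half, not two thirds. The base here includes every row, so filter out skipped respondents first if they should not count.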
8. Best Practices
Use the built-in metadata in SPSS/Stata to reduce manual labeling.
Leverage variable labels to quickly identify questions.
Use value labels to decode numeric response values.
Match the V### codes to reporting_id in the Long format if you need to cross-reference the two exports.
9. When to Use SPSS Format
For researchers working in SPSS, Stata, or other statistical software.
When you want metadata-rich survey data without needing to import a separate codebook.
For advanced statistical analysis requiring labeled variables and categories.