Long Excel Format

1. Overview

The Long format is designed for detailed, question-by-question analysis. It includes all the data stored in the platform and is identical to the data used internally for reporting. Each row represents one respondent's answer to one survey question or, for multi-choice questions, to one option of that question. This format is particularly useful when you want to:

  • Explore how different questions were answered across respondents.

  • Pivot, filter, and group responses in tools like Excel, R, or Python.

  • Work with multi-choice and grid questions without needing to manage many columns.


2. File Structure & Layout

Each row corresponds to one respondent's answer to one option of one question. A respondent therefore has multiple rows: at least one for each question they encountered, and one per option for multi-choice questions.

Example (first 3 rows):

| respondent_id | status | question | reporting_id | type | response | raw_response | responded | timestamp | weight |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 00044f3b-ec63-2e17-9b5c-970e0efd5a8b | Terminated | How old are you? | Age | NumericQuestion | 35-44 | 36 | 1 | 2025-07-30 16:27:06.025 | 0.638 |
| 00044f3b-ec63-2e17-9b5c-970e0efd5a8b | Terminated | What is your gender? | Gender | MultiChoiceQuestion | Male | Male | 1 | 2025-07-30 16:27:06.068 | 0.638 |
| 00044f3b-ec63-2e17-9b5c-970e0efd5a8b | Terminated | What is your gender? | Gender | MultiChoiceQuestion | Female | Female | 0 | 2025-07-30 16:27:06.068 | 0.638 |
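
The row-per-option layout can be checked with a short pandas sketch. The three example rows are typed in by hand here so the snippet is self-contained; with a real export you would load the file instead, and the file name in the comment is only a placeholder.

```python
import pandas as pd

# The three example rows from this section, entered by hand for illustration.
# With a real export you would load the file instead, for example:
#   long_df = pd.read_excel("long_export.xlsx")  # hypothetical file name
rid = "00044f3b-ec63-2e17-9b5c-970e0efd5a8b"
long_df = pd.DataFrame(
    {
        "respondent_id": [rid, rid, rid],
        "status": ["Terminated"] * 3,
        "question": [
            "How old are you?",
            "What is your gender?",
            "What is your gender?",
        ],
        "reporting_id": ["Age", "Gender", "Gender"],
        "type": ["NumericQuestion", "MultiChoiceQuestion", "MultiChoiceQuestion"],
        "response": ["35-44", "Male", "Female"],
        "raw_response": ["36", "Male", "Female"],
        "responded": [1, 1, 0],
        "timestamp": pd.to_datetime(
            [
                "2025-07-30 16:27:06.025",
                "2025-07-30 16:27:06.068",
                "2025-07-30 16:27:06.068",
            ]
        ),
        "weight": [0.638] * 3,
    }
)

# Count rows per respondent and question. A multi-choice question
# contributes one row per option, not one row per question.
rows_per_question = long_df.groupby(["respondent_id", "reporting_id"]).size()
print(rows_per_question)
```

In this sample the Age question yields one row and the Gender question yields two, one per option.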


3. Key Columns

  • respondent_id – Unique identifier for each participant.

  • status – Final survey status (e.g., Completed, Terminated).

  • question – Full wording of the question asked.

  • reporting_id – The labelled identifier for the question as set in the dashboard (e.g., Age, Gender).

  • line_number – The line number of the question in the survey script.

  • type – Type of question (NumericQuestion, MultiChoiceQuestion, OpenEnd, etc.).

  • response – The recoded, human-readable response category (e.g., 35-44).

  • raw_response – The raw value stored (e.g., 36).

  • responded – Indicates whether and in what order the respondent selected the option. 0 = not selected, 1 = selected first, 2 = selected second, and so on.

  • timestamp – Time when the answer was submitted.

  • weight – Weighting factor applied to this respondent’s answers for statistical adjustment.


4. Data Representation

Single-choice questions

Stored as one row per respondent, with responded=1.

Multi-choice questions

Stored as one row per respondent per option. Chosen options have responded>0, with the number indicating the order in which they were chosen. Unchosen options have responded=0.

Example: Multi-choice question

Question: Which of the following fruits do you like? (Select all that apply)

| respondent_id | question | response | responded |
| --- | --- | --- | --- |
| r1 | Fruits | Apple | 1 |
| r1 | Fruits | Banana | 0 |
| r1 | Fruits | Orange | 2 |

Here, the respondent chose Apple first, Orange second, and did not select Banana.
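
As a minimal sketch, assuming pandas, the selection order for this example can be recovered by keeping rows with responded>0 and sorting on the responded value. The DataFrame below simply restates the fruit example.

```python
import pandas as pd

# The multi-choice example above, entered by hand.
fruits = pd.DataFrame(
    {
        "respondent_id": ["r1", "r1", "r1"],
        "question": ["Fruits"] * 3,
        "response": ["Apple", "Banana", "Orange"],
        "responded": [1, 0, 2],
    }
)

# Keep only the options that were selected, then order them by
# the position in which the respondent chose them.
selected = fruits[fruits["responded"] > 0].sort_values(
    ["respondent_id", "responded"]
)

# One list of chosen options per respondent, in selection order.
selection_order = selected.groupby("respondent_id")["response"].agg(list)
print(selection_order)
```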

Numeric questions

Both response (bucketed/cleaned category, e.g. 35-44) and raw_response (e.g. 36) are provided.
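
The two columns support different kinds of analysis: response for counts per category, raw_response for numeric statistics. The sketch below uses made-up rows and assumes raw_response is read as text (likely when the column also holds open-end and choice answers), so it is converted with pandas.to_numeric first.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from a real export.
ages = pd.DataFrame(
    {
        "respondent_id": ["r1", "r2", "r3"],
        "reporting_id": ["Age"] * 3,
        "type": ["NumericQuestion"] * 3,
        "response": ["35-44", "35-44", "45-54"],
        "raw_response": ["36", "41", "50"],
    }
)

numeric = ages[ages["type"] == "NumericQuestion"].copy()

# Assumption: raw_response arrives as text, so convert it to numbers.
# Values that cannot be parsed become NaN instead of raising an error.
numeric["raw_value"] = pd.to_numeric(numeric["raw_response"], errors="coerce")

# Counts per bucket use the recoded category.
bucket_counts = numeric["response"].value_counts()

# Numeric statistics use the raw value.
mean_by_bucket = numeric.groupby("response")["raw_value"].mean()
print(bucket_counts)
print(mean_by_bucket)
```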

Open-end questions

The respondent's full text appears in both response and raw_response.


5. Missing & Special Values

  • Non-responses may appear with responded=0 and empty raw_response.

  • "Prefer not to say" or similar options appear as normal response categories.

  • Terminated respondents may have only a partial set of rows, depending on where they dropped out.
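
A rough way to inspect these cases in pandas is sketched below, using made-up rows. Because non-responses only "may" appear with responded=0 and an empty raw_response, treat the filter as a heuristic to verify against your own export rather than a guaranteed rule.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame(
    {
        "respondent_id": ["r1", "r1", "r2"],
        "status": ["Completed", "Completed", "Terminated"],
        "reporting_id": ["Age", "Income", "Age"],
        "raw_response": ["36", None, "52"],
        "responded": [1, 0, 1],
    }
)

# Treat missing values and blank strings as "empty".
is_empty = df["raw_response"].fillna("").astype(str).str.strip() == ""

# Heuristic: a non-response has responded=0 and an empty raw_response.
non_response = df[(df["responded"] == 0) & is_empty]

# Number of distinct questions each respondent has rows for.
# Terminated respondents will typically show fewer questions.
questions_seen = df.groupby("respondent_id")["reporting_id"].nunique()
print(non_response)
print(questions_seen)
```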


6. Weighting

  • Apply the weight column in analysis to ensure results reflect population targets.
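
One common way to apply the weight column is to sum weights instead of counting rows. The sketch below uses made-up data and assumes the base for each percentage is the total weight of all respondents who have rows for the question; the platform's own reporting may define the base differently.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame(
    {
        "respondent_id": ["r1", "r2", "r3"],
        "reporting_id": ["Gender"] * 3,
        "response": ["Male", "Female", "Female"],
        "responded": [1, 1, 1],
        "weight": [0.5, 1.5, 1.0],
    }
)

# Weighted count per category: sum the weights of selected rows.
chosen = df[df["responded"] > 0]
weighted_counts = chosen.groupby("response")["weight"].sum()

# Assumed base: each respondent's weight counted once, regardless of
# how many option rows they have for this question.
base = df.drop_duplicates("respondent_id")["weight"].sum()

weighted_share = weighted_counts / base
print(weighted_share)
```

Counting each respondent's weight once in the base keeps the shares meaningful for multi-choice questions, where one respondent contributes several rows.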


7. Best Practices

  • Use pivot tables (Excel) or groupby (Python/Pandas) to aggregate responses.

  • For multi-choice questions, include all rows where responded>0 to capture all selected options. Use the order number if you need to analyze sequence of selection.

  • When comparing across formats, match reporting_id (long) to the corresponding variable codes (wide/SPSS).
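
These practices can be sketched in pandas as follows, using made-up rows. The wide column names built here (for example Fruits_Apple) are purely illustrative; the actual variable codes in the wide or SPSS export may follow a different naming scheme.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame(
    {
        "respondent_id": ["r1", "r1", "r1", "r2", "r2", "r2"],
        "reporting_id": ["Fruits"] * 6,
        "response": ["Apple", "Banana", "Orange"] * 2,
        "responded": [1, 0, 2, 0, 1, 0],
    }
)

# Aggregate: how many respondents selected each option.
option_counts = (
    df[df["responded"] > 0].groupby(["reporting_id", "response"]).size()
)

# Reshape to one row per respondent with a 0/1 flag per option,
# which is easier to compare against a wide export.
df["selected"] = (df["responded"] > 0).astype(int)
wide = df.pivot_table(
    index="respondent_id",
    columns=["reporting_id", "response"],
    values="selected",
    aggfunc="max",
    fill_value=0,
)

# Flatten the two-level column index into illustrative names.
wide.columns = [f"{question}_{option}" for question, option in wide.columns]
print(option_counts)
print(wide)
```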


8. When to Use Long Format

  • For deep exploratory analysis.

  • When handling multi-select or grid questions where wide format becomes cumbersome.

  • When exporting data into R/Python for custom cleaning, text analysis, or advanced visualization.
