Server-Side File Delivery & Format Requirements

This article covers everything you need to know about preparing and uploading exposure data files for an S3 snapshot source. If you haven't created your S3 snapshot source yet, start with Creating an S3 Snapshot Exposure Source.

Delivery method

You upload .csv.gz files directly to MX8's S3 bucket. During onboarding, MX8 grants your AWS IAM role PutObject permission on a dedicated prefix (folder) within the bucket. Once that's in place, you can push files on your own schedule.
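
The upload step might look like the following minimal sketch, assuming Python and a boto3 S3 client created with credentials for the IAM role that MX8 authorized. The helper name, bucket, and prefix are placeholders, not real MX8 values; use the ones you receive during onboarding. The sketch calls put_object, which corresponds to the PutObject permission described above.

```python
from pathlib import Path


def upload_exposure_file(s3_client, local_path, bucket, prefix):
    """Upload one .csv.gz file under the given prefix and return its object key.

    s3_client is any object exposing boto3's S3 client put_object() method.
    bucket and prefix are placeholders for the values MX8 provides.
    """
    path = Path(local_path)
    if not path.name.endswith(".csv.gz"):
        raise ValueError(f"expected a .csv.gz file, got {path.name!r}")
    key = f"{prefix.strip('/')}/{path.name}"
    # Pass an open file handle so the file is streamed from disk.
    with path.open("rb") as body:
        s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

With boto3 installed, a call could be `upload_exposure_file(boto3.client("s3"), "exposures_2026-03-10T14-00.csv.gz", "example-mx8-bucket", "your-prefix")`, where the bucket and prefix names are hypothetical.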

Delivery frequency

Daily delivery works best. Multiple drops per day are fine too — there's no limit on how often you upload. Choose a cadence that aligns with your data pipeline.

File format

Every file you upload must follow these specifications:

  • File type: .csv.gz (gzip-compressed CSV)

  • Encoding: UTF-8

  • Line endings: Unix (\n)

  • Delimiter: comma (,)

  • Header row: required

  • Row structure: one row per respondent exposure

  • Max file size: 2 GB compressed — split larger datasets across multiple files
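
One way to produce a file that meets the specifications above is sketched below using only the Python standard library. The function name and the reading of "2 GB" as 2 * 10**9 bytes are assumptions made for illustration, not MX8 definitions.

```python
import csv
import gzip
import os

# Assumption: "2 GB" is read conservatively as 2 * 10**9 bytes.
MAX_COMPRESSED_BYTES = 2 * 10**9


def write_exposure_file(path, header, rows):
    """Write a gzip-compressed, UTF-8, comma-delimited CSV with Unix line endings.

    Returns the compressed size in bytes.
    """
    # newline="" stops Python from translating the line endings csv emits.
    with gzip.open(path, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter=",", lineterminator="\n")
        writer.writerow(header)   # the header row is required
        writer.writerows(rows)    # one row per respondent exposure
    size = os.path.getsize(path)
    if size > MAX_COMPRESSED_BYTES:
        raise ValueError(f"{path} is {size} bytes compressed; split it into smaller files")
    return size
```

If the size check fails, write the rows in smaller batches and give each batch its own file.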

Required columns

Your CSV must include columns that map to the following required fields. Your column names do not need to match MX8's field names — you specify the mapping when creating the source.

| Required Field | Description |
| --- | --- |
| exposed_ip_address or hashed_ip_address | Plaintext IPv4/IPv6 address, or a hashed IP. Provide one or the other. |
| uid | Your user identifier, up to 128 characters. Appended to respondent records during matching. |
| brand | The brand name the user was exposed to, up to 128 characters. |

You can include additional columns beyond these; MX8 ignores any column that isn't mapped to a required field.
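
Before uploading, it can help to check each row against the field rules above. The following is a rough sketch, not MX8's own validation: the function name, the mapping layout (MX8 field name to your column name), and the rule that required values must be non-empty are assumptions. Hashed IPs are only checked for presence, because this article does not specify a hash format.

```python
import ipaddress

MAX_LEN = 128  # stated limit for uid and brand


def validate_row(row, mapping):
    """Return a list of problems found in one CSV row.

    row is a dict keyed by your column names; mapping maps MX8 field
    names to those column names (a hypothetical layout for this sketch).
    """
    problems = []
    ip_fields = [f for f in ("exposed_ip_address", "hashed_ip_address") if f in mapping]
    if len(ip_fields) != 1:
        problems.append("map exactly one of exposed_ip_address or hashed_ip_address")
    for field in ip_fields:
        value = row.get(mapping[field], "")
        if not value:
            problems.append(f"{field} is empty")
        elif field == "exposed_ip_address":
            try:
                ipaddress.ip_address(value)  # accepts IPv4 and IPv6
            except ValueError:
                problems.append(f"{field} is not a valid IP address: {value!r}")
    for field in ("uid", "brand"):
        value = row.get(mapping.get(field, ""), "")
        if not value:
            problems.append(f"{field} is missing or empty")
        elif len(value) > MAX_LEN:
            problems.append(f"{field} exceeds {MAX_LEN} characters")
    return problems
```

An empty list means the row passed every check in this sketch; unmapped columns in the row are not inspected.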

File naming

There's no required naming convention. That said, we recommend including a timestamp in each filename (e.g., exposures_2026-03-10T14-00.csv.gz) to avoid accidentally overwriting a previous upload and to make troubleshooting easier.
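
A small helper can generate names in the recommended style. This is only a sketch: the function name, the default stem, and the choice of UTC are assumptions, since the article does not mandate any convention.

```python
from datetime import datetime, timezone


def timestamped_filename(stem="exposures", now=None):
    """Build a name like exposures_2026-03-10T14-00.csv.gz from a timestamp."""
    moment = now or datetime.now(timezone.utc)
    # Hyphens replace colons in the time part, matching the example filename.
    return f"{stem}_{moment.strftime('%Y-%m-%dT%H-%M')}.csv.gz"
```

Because the name has minute resolution, two uploads within the same minute would share a name; add seconds or a sequence number if you deliver that often.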

Duplicates and late data

If MX8 receives duplicate rows (same UID and IP on the same day), it keeps the row from the file with the latest timestamp. If you discover errors in a previously uploaded file, upload a corrected version; the newer file's data will take precedence.
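
To make the rule concrete, here is an illustrative sketch of "keep the latest row per UID, IP, and day". It is not MX8's implementation. The row layout (dicts with "uid", "ip", "day", and "file_timestamp" keys) is hypothetical, and the sketch takes the day as given because this article does not say how it is derived.

```python
def deduplicate(rows):
    """Keep one row per (uid, ip, day), preferring the row from the newest file.

    Each row is a dict with "uid", "ip", "day", and "file_timestamp" keys
    (a hypothetical layout used only for this illustration).
    """
    latest = {}
    for row in rows:
        key = (row["uid"], row["ip"], row["day"])
        kept = latest.get(key)
        # On equal timestamps the later input row wins (an arbitrary tie-break).
        if kept is None or row["file_timestamp"] >= kept["file_timestamp"]:
            latest[key] = row
    return list(latest.values())
```

Under this reading, a corrected file only needs to contain rows with the same UID, IP, and day as the rows it replaces.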

What not to include

Do not include PII beyond the specified fields: no email addresses, device IDs, or personal names. The only identifying information in your files should be the UID and the IP address (plaintext or hashed).
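
One conservative way to avoid shipping stray PII is to drop every column you have not mapped before writing the file. This is a minimal sketch under that assumption; the function name and the example column names are hypothetical.

```python
def keep_mapped_columns(row, mapped_columns):
    """Return a copy of row that contains only the mapped columns."""
    missing = [name for name in mapped_columns if name not in row]
    if missing:
        raise KeyError(f"row is missing mapped columns: {missing}")
    # Anything not listed in mapped_columns (for example an email column) is dropped.
    return {name: row[name] for name in mapped_columns}
```

For example, `keep_mapped_columns(row, ["client_ip", "user_id", "brand_name"])` would discard an "email" column present in the source data.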
