Server-Side File Delivery & Format Requirements
This article covers everything you need to know about preparing and uploading exposure data files for an S3 snapshot source. If you haven't created your S3 snapshot source yet, start with Creating an S3 Snapshot Exposure Source.
Delivery method
You upload .csv.gz files directly into MX8's S3 bucket. During onboarding, MX8 grants your AWS IAM role PutObject permission into a dedicated prefix (folder) within the bucket. Once that's in place, you can push files on your own schedule.
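As a rough sketch of what a push could look like in Python, assuming your environment already holds credentials for the IAM role MX8 authorized and that the third-party boto3 SDK is installed. The helper names `build_key` and `upload_exposure_file` are hypothetical, and the bucket and prefix are values you receive during onboarding, not anything defined here.

```python
import os


def build_key(prefix: str, local_path: str) -> str:
    """Join the assigned prefix and the local file name into an S3 object key."""
    return prefix.strip("/") + "/" + os.path.basename(local_path)


def upload_exposure_file(local_path: str, bucket: str, prefix: str, client=None) -> str:
    """Upload one .csv.gz file with a single PutObject call and return its key."""
    if client is None:
        import boto3  # third-party AWS SDK, imported lazily (assumed to be installed)

        client = boto3.client("s3")
    key = build_key(prefix, local_path)
    with open(local_path, "rb") as body:
        # put_object maps directly onto the PutObject permission described above
        client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

Calling `upload_exposure_file("exposures_2026-03-10T14-00.csv.gz", bucket, prefix)` with your assigned bucket and prefix would place the file under that prefix. Passing your own `client` is optional and mainly useful for testing.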
Delivery frequency
Daily delivery works best. Multiple drops per day are fine too — there's no limit on how often you upload. Choose a cadence that aligns with your data pipeline.
File format
Every file you upload must follow these specifications:
File type: .csv.gz (gzip-compressed CSV)
Encoding: UTF-8
Line endings: Unix (\n)
Delimiter: comma (,)
Header row: required
Row structure: one row per respondent exposure
Max file size: 2 GB compressed — split larger datasets across multiple files
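The specifications above can be met with Python's standard library alone. The following is a minimal sketch, not an official MX8 tool: the function name `write_exposure_file` is hypothetical, and the 2 GB limit is read here as 2 × 10^9 bytes, the stricter interpretation.

```python
import csv
import gzip
import os

MAX_COMPRESSED_BYTES = 2 * 10**9  # assumption: "2 GB" read as decimal gigabytes


def write_exposure_file(path, header, rows):
    """Write a gzip-compressed, UTF-8, comma-delimited CSV with Unix line endings."""
    # newline="" stops Python from translating line endings; the writer emits "\n".
    with gzip.open(path, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter=",", lineterminator="\n")
        writer.writerow(header)  # the header row is required
        writer.writerows(rows)   # one row per respondent exposure
    size = os.path.getsize(path)
    if size > MAX_COMPRESSED_BYTES:
        raise ValueError(f"{path} is {size} bytes compressed; split it into smaller files")
    return size
```

The size check runs after writing because the compressed size is only known once the file is closed. For very large exports you would likely split by row count before writing instead.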
Required columns
Your CSV must include columns that map to the following required fields. Your column names do not need to match MX8's field names — you specify the mapping when creating the source.
| Required Field | Description |
|---|---|
| IP address | Plaintext IPv4/IPv6 address, or a hashed IP. Provide one or the other. |
| UID | Your user identifier, up to 128 characters. Appended to respondent records during matching. |
| Brand | The brand name the user was exposed to, up to 128 characters. |
You can include additional columns beyond these; MX8 ignores any column that isn't mapped. When you set up the source, tell MX8 which of your columns corresponds to each required field.
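Before uploading, it can help to check each row against the limits in the table. The sketch below assumes rows are dictionaries keyed by your own column names (for example, from `csv.DictReader`). The function name and its parameters are hypothetical. It does not check the format of hashed IPs, which is covered in a separate article.

```python
import ipaddress

MAX_FIELD_CHARS = 128  # limit stated for the UID and brand fields


def validate_row(row, ip_col, uid_col, brand_col, ip_is_hashed=False):
    """Return a list of problems found in one row (an empty list means it passed)."""
    problems = []
    ip_value = row.get(ip_col, "")
    if not ip_value:
        problems.append("missing IP value")
    elif not ip_is_hashed:
        try:
            ipaddress.ip_address(ip_value)  # accepts both IPv4 and IPv6 text
        except ValueError:
            problems.append(f"not a valid IPv4/IPv6 address: {ip_value!r}")
    for label, col in (("UID", uid_col), ("brand", brand_col)):
        value = row.get(col, "")
        if not value:
            problems.append(f"missing {label}")
        elif len(value) > MAX_FIELD_CHARS:
            problems.append(f"{label} longer than {MAX_FIELD_CHARS} characters")
    return problems
```

For example, `validate_row(row, "client_ip", "user_id", "brand_name")` checks a row whose columns use those (made-up) names.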
File naming
There's no required naming convention. That said, we recommend including a timestamp in each filename (e.g., exposures_2026-03-10T14-00.csv.gz) to avoid accidentally overwriting a previous upload and to make troubleshooting easier.
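One way to generate names in that style, as a small sketch. The helper `timestamped_filename` is hypothetical, and defaulting to UTC is an assumption made here, not an MX8 requirement.

```python
from datetime import datetime, timezone


def timestamped_filename(when=None, stem="exposures"):
    """Build a name like exposures_2026-03-10T14-00.csv.gz from a datetime."""
    when = when or datetime.now(timezone.utc)  # assumption: UTC when no time is given
    return f"{stem}_{when.strftime('%Y-%m-%dT%H-%M')}.csv.gz"
```

Minute-level resolution matches the example above. If you upload more than once per minute, add seconds to the format string so names stay unique.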
Duplicates and late data
If MX8 receives duplicate rows (the same UID and IP on the same day), it keeps the row from the file with the latest timestamp. If you discover errors in a previously uploaded file, upload a corrected version; the newer file's data takes precedence.
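To make the rule concrete, here is an illustration of "latest file wins" in Python. It is a sketch of the behavior as described, not MX8's actual implementation. The field names `uid`, `ip`, `day`, and `file_timestamp` are hypothetical, and timestamps are assumed to be ISO-8601 strings so that they sort chronologically.

```python
def resolve_duplicates(rows):
    """Keep one row per (uid, ip, day): the one from the most recent file."""
    latest = {}
    for row in rows:
        key = (row["uid"], row["ip"], row["day"])
        kept = latest.get(key)
        # ">=" means a later input row wins a timestamp tie (a choice made for this sketch)
        if kept is None or row["file_timestamp"] >= kept["file_timestamp"]:
            latest[key] = row
    return list(latest.values())
```

Under this reading, a corrected file replaces earlier rows only where the UID, IP, and day all match. Rows that appear only in the older file are left as they were.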
What not to include
Do not include PII beyond the specified fields. This means no email addresses, device IDs, or personal names. The only identifying information in your files should be the UID, brand, and IP address (plaintext or hashed).
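A conservative way to enforce this in an export job is to keep only the columns you have mapped to the required fields and drop everything else. The sketch below assumes rows are dictionaries keyed by your column names, and `keep_mapped_columns` is a hypothetical helper. It also discards harmless extra columns, which is stricter than MX8 requires.

```python
def keep_mapped_columns(row, mapped_columns):
    """Return a copy of the row containing only the mapped columns."""
    missing = [col for col in mapped_columns if col not in row]
    if missing:
        raise KeyError(f"row is missing mapped columns: {missing}")
    return {col: row[col] for col in mapped_columns}
```

An allow-list like this fails safe: a new column added upstream (an email field, say) never reaches the upload unless someone maps it deliberately.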
Next steps
IP Hashing Rules for Exposure Data — if you're sending hashed IPs, review the formatting and algorithm requirements.
Setting Up S3 Snapshot Delivery — Partner Checklist — a condensed checklist you can share with your engineering team.
