Help
Parquet Field Dictionary
The HRA and CNS dashboards share one CloudFront log schema of 40 fields. This page documents each field: its type, what it contains, and what the dashboards use it for.
HRA (humanatlas.io)
Parquet: data/2026-04-06_hra-logs.parquet
Date range: 2023-06-06 – 2026-04-06
Sites: Apps, Portal, KG, API, CDN, Events
CNS (cns.iu.edu)
Parquet: data/2026-04-06_cns-logs.parquet
Date range: 2008-04-13 – 2026-04-06
Sites: CNS
Top-Level Parquet Fields
| Field | Type | Description | Used For |
|---|---|---|---|
| anon_id | VARCHAR · nullable | Anonymous identifier for a visitor/session actor. | Session grouping, retention, and repeat-visitor analytics. |
| date | DATE · nullable | Request date. | Daily/monthly/yearly trend views and period filters. |
| time | VARCHAR · nullable | Request time component. | Hour-of-day activity analysis. |
| x_edge_location | VARCHAR · nullable | CloudFront edge location code that served the request. | Infra routing diagnostics and coarse geo context. |
| sc_bytes | BIGINT · nullable | Response bytes sent by server. | Payload-size analysis and traffic profiling. |
| cs_method | VARCHAR · nullable | HTTP request method (GET, POST, etc.). | Request profiling and endpoint behavior checks. |
| cs_uri_stem | VARCHAR · nullable | Path part of the URL (without query params). | Tool/page visit counting and route-level analytics. |
| sc_status | INTEGER · nullable | HTTP status code returned to client. | Error-rate and reliability metrics. |
| cs_referer | VARCHAR · nullable | Incoming referrer URL header. | Source attribution and external ecosystem analysis. |
| cs_user_agent | VARCHAR · nullable | User-agent string from client. | Client profiling and bot/human heuristics. |
| cs_uri_query | VARCHAR · nullable | Raw URL query string. | Parameter extraction and event-context parsing. |
| cs_cookie | VARCHAR · nullable | Cookie header from request. | Session continuity and identity context. |
| x_edge_result_type | VARCHAR · nullable | CloudFront result category. | Infra/cache behavior diagnostics. |
| x_edge_request_id | VARCHAR · nullable | Unique CloudFront request id. | Request traceability and de-dup checks. |
| x_host_header | VARCHAR · nullable | Host header requested by client. | Domain-level segmentation and host routing checks. |
| cs_protocol | VARCHAR · nullable | Protocol scheme used by client request. | Transport-level diagnostics. |
| cs_bytes | BIGINT · nullable | Request bytes sent by client. | Upload/payload profiling. |
| time_taken | DOUBLE · nullable | Total time to serve the request (reported in seconds in CloudFront's log format). | Latency and performance monitoring. |
| ssl_protocol | VARCHAR · nullable | TLS protocol version. | Security/transport compatibility diagnostics. |
| ssl_cipher | VARCHAR · nullable | TLS cipher used for the request. | Security posture and transport telemetry. |
| x_edge_response_result_type | VARCHAR · nullable | CloudFront response result type. | Delivery outcome and cache/error analysis. |
| cs_protocol_version | VARCHAR · nullable | HTTP protocol version. | Network/client compatibility analysis. |
| time_to_first_byte | DOUBLE · nullable | Time until the first byte of the response is returned (reported in seconds in CloudFront's log format). | Backend/network latency monitoring. |
| x_edge_detailed_result_type | VARCHAR · nullable | Detailed CloudFront result reason. | Infra troubleshooting for specific failure classes. |
| sc_content_type | VARCHAR · nullable | Response content MIME type. | Asset/API/document classification. |
| sc_content_len | BIGINT · nullable | Response content length. | Payload distribution and bandwidth analysis. |
| sc_range_start | BIGINT · nullable | Range start for partial content responses. | Media/file transfer diagnostics. |
| sc_range_end | BIGINT · nullable | Range end for partial content responses. | Media/file transfer diagnostics. |
| timestamp | BIGINT · nullable | Event timestamp in source format. | Event ordering and temporal feature engineering. |
| timestamp_ms | BIGINT · nullable | Event timestamp in milliseconds. | Fine-grained sequencing and latency math. |
| c_country | VARCHAR · nullable | Country derived from client/edge location. | Geo usage trends and country comparisons. |
| query | MAP(VARCHAR, VARCHAR) · nullable | Parsed query-parameter map. | Event/tool context and parameter-level analysis. |
| traffic_type | VARCHAR · nullable | Traffic classification label (for example, likely human or bot). | Filtering analytics to desired traffic segments. |
| referrer | VARCHAR · nullable | Normalized referrer category/value. | Traffic-source reporting. |
| airport | VARCHAR · nullable | Airport/location code derived from edge location context. | Regional infrastructure distribution analysis. |
| month | INTEGER · nullable | Month component derived from date. | Monthly aggregation and trend charting. |
| day | INTEGER · nullable | Day component derived from date. | Daily aggregation and anomaly detection. |
| distribution | VARCHAR · nullable | Distribution/environment identifier. | Comparing traffic across deployment distributions. |
| site | VARCHAR · nullable | High-level site/category label for the event. | Separating traffic by site (for example, Apps vs. Events) in dashboards. |
| year | INTEGER · nullable | Year component derived from date. | Year-over-year trend analysis. |
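
Several fields in the table are described as derived from others (`year`, `month`, and `day` from `date`; `query` from `cs_uri_query`). The Python sketch below illustrates how such values could be computed from a single row, and how `sc_status` could feed an error-rate metric. It is not the pipeline that produced the Parquet files. The function names, the `hour` key, the sample row, and its query parameters are hypothetical, and it assumes `time` is an `HH:MM:SS` string and that an empty query is logged as `-`, as in CloudFront's standard log format.

```python
from datetime import date
from urllib.parse import parse_qsl


def derive_fields(row: dict) -> dict:
    """Return a copy of `row` with illustrative derived values added."""
    out = dict(row)

    # year / month / day are documented as components of `date`.
    d = row.get("date")
    if d is not None:
        out["year"], out["month"], out["day"] = d.year, d.month, d.day

    # Assumption: `time` is an "HH:MM:SS" string. `hour` is a hypothetical
    # helper value for hour-of-day analysis, not a field in the schema.
    t = row.get("time")
    out["hour"] = int(t.split(":")[0]) if t else None

    # Assumption: `query` mirrors `cs_uri_query` parsed as key=value pairs,
    # and "-" marks an empty query string.
    q = row.get("cs_uri_query")
    if q and q != "-":
        out["query"] = dict(parse_qsl(q, keep_blank_values=True))
    else:
        out["query"] = None

    return out


def error_rate(rows: list[dict]) -> float:
    """Share of rows whose non-null `sc_status` is 400 or above."""
    statuses = [r["sc_status"] for r in rows if r.get("sc_status") is not None]
    if not statuses:
        return 0.0
    return sum(s >= 400 for s in statuses) / len(statuses)


# Hypothetical sample row; the parameter names are made up for illustration.
sample = {
    "date": date(2026, 4, 6),
    "time": "13:45:10",
    "cs_uri_query": "tool=eui&version=2",
    "sc_status": 200,
}
derived = derive_fields(sample)
print(derived["year"], derived["month"], derived["day"], derived["hour"])
print(derived["query"])
```

Because every field is nullable, both helpers check for missing values before using them. Rows with a null `sc_status` are excluded from the error-rate denominator here; counting them differently is an equally valid choice.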