Azure Data Explorer

Columnar time-series engine that stores the telemetry of the whole fleet and answers KQL queries over terabytes of vibration data in under two seconds.

What it is

Azure Data Explorer (ADX), based on the Kusto engine developed internally by Microsoft, is a time-series data analytics service designed for massive ingestion and interactive analytical queries. Its columnar storage architecture with aggressive compression (LZ4 + Huffman coding) makes it especially efficient for IoT data where the values of the same metric have high temporal correlation.
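As a rough illustration of why temporal correlation matters for compression, the sketch below compresses a slowly drifting series and then the same values in random order. It uses Python's zlib purely as a stand-in codec; it does not reproduce ADX's storage format, and the synthetic signal is an assumption made for the example.

```python
import random
import struct
import zlib


def packed_size(values):
    """Return the zlib-compressed size in bytes of values packed as int32."""
    raw = struct.pack(f"<{len(values)}i", *values)
    return len(zlib.compress(raw))


# Synthetic metric (assumption): a reading that drifts up by one unit
# every ten samples, as a slowly changing sensor value might.
correlated = [1000 + i // 10 for i in range(10_000)]

# Same values, but with the temporal order destroyed.
shuffled = correlated[:]
random.Random(0).shuffle(shuffled)

correlated_size = packed_size(correlated)
shuffled_size = packed_size(shuffled)

print(f"correlated: {correlated_size} bytes, shuffled: {shuffled_size} bytes")
```

The ordered series compresses to far fewer bytes than the shuffled one, which is the effect a columnar layout exploits by keeping consecutive values of one metric next to each other.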

ADX is not a transactional database: it is optimised for the "write once, read many times" access pattern typical of industrial telemetry. Queries are expressed in KQL (Kusto Query Language), a declarative pipe-based language with specialised primitives for time series.

Role in IN-SIGHT

  • Central telemetry store: It receives vibration, temperature and door telemetry from the whole fleet, routed from IoT Hub through Event Hub. Each record includes vehicle_id, pod_id, timestamp (ADX datetime values have 100-nanosecond resolution), metric and value.
  • Health-metric computation: Scheduled queries compute the estimated RUL (Remaining Useful Life), bearing degradation indices and vibration energy percentiles per subsystem.
  • Comparison with Golden Run: The cloud EKF queries ADX to obtain the vehicle baseline and compute the innovation (difference between predicted and measured state).
  • Feeding the portal: KQL queries are the data source for the alert portal and the real-time technical dashboards via ADX's native REST API.
  • Historical retention: Historical data is retained in the "hot" tier (SSD, access < 1 s) for 90 days and in the "cold" tier (Azure Blob Storage) indefinitely for audit and model retraining.
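
The innovation mentioned in the Golden Run comparison can be sketched in a few lines. This is a minimal illustration, assuming the measurement maps directly onto the state (a full EKF would first pass the predicted state through its measurement function); the function name and the sample values are hypothetical.

```python
def innovation(measured, predicted):
    """Element-wise difference between measured and predicted state."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    return [z - x for z, x in zip(measured, predicted)]


# Hypothetical values: bearing RMS and temperature predicted from the
# Golden Run baseline versus the latest measurement.
predicted_state = [0.5, 40.0]
measured_state = [0.75, 42.0]

residual = innovation(measured_state, predicted_state)
print(residual)
```

A residual that stays close to zero suggests the vehicle is tracking its baseline; a persistent offset is what the filter would weigh when updating its estimate.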
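
KQL query example: Detection of a bearing degradation trend over the last 7 days for a specific vehicle, using 1-hour averages.
telemetry
| where vehicle_id == "TMB-5042"
  and metric == "bearing_rms"
  and timestamp > ago(7d)
// Average the RMS value per pod in 1-hour bins.
| summarize avg_rms = avg(value)
    by bin(timestamp, 1h), pod_id
| order by timestamp asc
// Collect each pod's hourly averages into one array, in time order.
| summarize avg_rms = make_list(avg_rms) by pod_id
// Fit a line to each series; the result includes its slope.
| extend trend = series_fit_line_dynamic(avg_rms)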
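
For readers less familiar with KQL, the hourly averaging and line fit that the query above aims for can be sketched in plain Python. This is only an illustration of the arithmetic, not of how ADX evaluates the query; the helper names and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_means(samples):
    """Average (timestamp, value) samples into 1-hour bins, sorted by time."""
    bins = defaultdict(list)
    for ts, value in samples:
        bins[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return [(hour, sum(v) / len(v)) for hour, v in sorted(bins.items())]


def fit_slope(ys):
    """Least-squares slope of ys against their index 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical readings: two samples per hour, rising 0.5 per hour.
start = datetime(2024, 1, 1)
samples = [
    (start + timedelta(minutes=30 * i), 1.0 + 0.5 * (i // 2))
    for i in range(8)
]

means = hourly_means(samples)
slope = fit_slope([mean for _, mean in means])
print(slope)
```

A positive slope over the window corresponds to a rising RMS level, which is the degradation signal the query is looking for.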

Internal architecture

ADX organises data into immutable columnar extents (shards) of ~1 GB compressed. When a new batch of telemetry is ingested, the engine creates new extents and indexes them; it periodically merges them to optimise query performance.

Each extent's index includes the timestamp range and the min/max values of each column, which lets the query planner discard whole extents without reading them when the query filters by time range or by a specific vehicle.
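
The pruning step described above can be sketched as a simple range-overlap test. This is a minimal model under stated assumptions: the Extent class, its fields and the prune function are hypothetical names for illustration, and only the timestamp range is modelled, not the other per-column metadata.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Extent:
    """Hypothetical per-extent metadata: the time range it covers."""
    extent_id: str
    min_ts: datetime
    max_ts: datetime


def prune(extents, query_start, query_end):
    """Keep only extents whose time range overlaps [query_start, query_end]."""
    return [
        e for e in extents
        if e.max_ts >= query_start and e.min_ts <= query_end
    ]


extents = [
    Extent("e1", datetime(2024, 1, 1), datetime(2024, 1, 2)),
    Extent("e2", datetime(2024, 1, 2), datetime(2024, 1, 5)),
    Extent("e3", datetime(2024, 1, 6), datetime(2024, 1, 9)),
]

# A query for 3-4 January only needs to read e2.
selected = prune(extents, datetime(2024, 1, 3), datetime(2024, 1, 4))
print([e.extent_id for e in selected])
```

Because the test touches only metadata, the cost of discarding an extent does not depend on how much data the extent holds.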