
Python · KQL

The language of analysis. Python orchestrates the pipeline from acquisition on the CM4 to the predictive models in the cloud. KQL turns terabytes of telemetry into sub-second answers for the alert portal.

Python in IN-SIGHT

Python is the main language of the IN-SIGHT analytics stack, present at both ends of the system: edge and cloud. Its combination of development productivity, a mature scientific ecosystem (NumPy, SciPy, scikit-learn) and JIT compilation (numba) makes it viable both for the CM4's embedded code and for the analytics pipelines on Azure.

Edge — Raspberry CM4 (Python 3.11)

  • MEMS acquisition: SPI reading through spidev at 6,667 Hz with a double buffer to avoid gaps.
  • DSP: scipy.signal for IIR filters, Hann windows and FFT. Critical modules compiled with numba @jit to cut latency from 40 ms to 4 ms.
  • EKF: Vectorised NumPy implementation of the Extended Kalman Filter. State matrices in contiguous memory (C-order) for maximum cache efficiency.
  • MQTT: paho-mqtt with automatic reconnection and a persistent queue for resilience in tunnels.
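
The DSP stage in the list above (IIR filter, Hann window, FFT) can be sketched as follows. This is a minimal illustration, not the IN-SIGHT implementation: the pass band, the filter order, the function name band_spectrum and the synthetic 120 Hz test tone are all assumptions chosen for the example. Only the 6,667 Hz sample rate comes from the text.

```python
import numpy as np
from scipy import signal

FS = 6667.0  # sample rate in Hz, taken from the acquisition bullet above


def band_spectrum(block, fs=FS, band=(10.0, 1000.0), order=4):
    """Band-pass filter one block, apply a Hann window, return (freqs, amplitude).

    The pass band and filter order are illustrative assumptions.
    """
    # Butterworth IIR band-pass, expressed as second-order sections for stability
    sos = signal.butter(order, band, btype="bandpass", fs=fs, output="sos")
    filtered = signal.sosfilt(sos, block)

    # Hann window reduces spectral leakage before the FFT
    window = signal.windows.hann(len(filtered), sym=False)
    spectrum = np.fft.rfft(filtered * window)

    # Scale by the window sum so a sine of amplitude A peaks near A
    amplitude = 2.0 * np.abs(spectrum) / window.sum()
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fs)
    return freqs, amplitude


# Synthetic check: a 120 Hz tone sampled for one second
t = np.arange(int(FS)) / FS
tone = 0.5 * np.sin(2 * np.pi * 120.0 * t)
freqs, amplitude = band_spectrum(tone)
peak_hz = freqs[np.argmax(amplitude)]
```

With a one-second block the frequency resolution is 1 Hz, so the strongest bin should sit at the tone frequency. A function of this shape is the kind of hot path the text describes compiling with numba, although the SciPy calls themselves would need to be replaced by plain NumPy loops for that.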

Cloud — Azure Functions + AKS (Python 3.12)

  • Ingestion pipeline: Azure Functions process the IoT Hub events, enrich them with fleet metadata and write to ADX.
  • Cloud EKF: Parallel execution of the EKF for the whole fleet with Dask. One worker per vehicle, with automatic scaling.
  • Portal API: FastAPI serves the alert and trend data to the portal frontend.
  • Predictive models: scikit-learn for RUL models (Random Forest, Gradient Boosting). Monthly retraining with historical data from ADX.
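
As a rough picture of the per-vehicle unit of work behind the Cloud EKF item, the sketch below runs a small Extended Kalman Filter over one vehicle's series. The state model (an amplitude that grows exponentially at an unknown rate), the noise settings, the vehicle IDs and the function names are assumptions made for illustration; the source does not describe the actual model. Dask is not used here: a plain dictionary comprehension stands in for the distributed map over vehicles.

```python
import numpy as np


def ekf_step(x, P, z, dt, Q, R):
    """One predict/update cycle for the state x = [amplitude, growth_rate]."""
    a, r = x
    g = np.exp(r * dt)

    # Predict: nonlinear process model a <- a * exp(r * dt), r <- r
    x_pred = np.array([a * g, r])
    F = np.array([[g, a * dt * g],
                  [0.0, 1.0]])          # Jacobian of the process model
    P_pred = F @ P @ F.T + Q

    # Update: the sensor is assumed to observe the amplitude directly
    H = np.array([[1.0, 0.0]])
    innovation = z - x_pred[0]
    S = H @ P_pred @ H.T + R            # 1x1 innovation covariance
    K = P_pred @ H.T / S                # 2x1 Kalman gain
    x_new = x_pred + K[:, 0] * innovation
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new


def run_vehicle(measurements, dt=1.0):
    """Filter one vehicle's series and return the final state estimate."""
    x = np.array([measurements[0], 0.0])
    P = np.diag([1.0, 1e-2])            # assumed initial uncertainty
    Q = np.diag([1e-6, 1e-8])           # assumed process noise
    R = np.array([[1e-4]])              # assumed measurement noise
    for z in measurements[1:]:
        x, P = ekf_step(x, P, z, dt, Q, R)
    return x


# Synthetic, noise-free series per vehicle (hypothetical IDs and growth rates)
steps = np.arange(200)
fleet = {"V1": np.exp(0.010 * steps), "V2": 2.0 * np.exp(0.002 * steps)}
estimates = {vid: run_vehicle(series) for vid, series in fleet.items()}
```

Because each call to run_vehicle depends only on that vehicle's data, the work is embarrassingly parallel, which is what makes a one-worker-per-vehicle layout a natural fit.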

KQL — Kusto Query Language

KQL is the declarative query language of Azure Data Explorer. Its philosophy is pipe-based: the result of each operator flows into the next, similar to Unix pipes but with table semantics and native optimisations for time series.

KQL includes specialised primitives that would be complex to implement in standard SQL: series_fit_line() for linear regression over a series, series_decompose_anomalies() for automatic anomaly detection, the make-series operator to turn rows into dense, regularly spaced temporal arrays, and correlation functions between signals such as series_pearson_correlation().

Key queries in IN-SIGHT

// Degradation trend — last 4 weeks
telemetry
| where metric == "bearing_rms"
  and vehicle_id == "TMB-5042"
  and timestamp > ago(28d)
| make-series rms_avg = avg(value)
    on timestamp
    from ago(28d) to now()
    step 1h
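// series_decompose_anomalies returns three series: flags, scores and baseline
| extend (anomalies, score, baseline) = series_decompose_anomalies(
    rms_avg, 1.5, -1, 'linefit')
| render anomalychart with (anomalycolumns=anomalies)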

// Real-time fleet health summary
telemetry
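| where timestamp > ago(5m)
  and metric == "bearing_rms"   // restrict to bearing RMS so last_rms is meaningful
// arg_max returns the value from each vehicle's most recent row
| summarize
    (last_seen, last_rms) = arg_max(timestamp, value),
    alert_count = countif(alert_level != "OK")
    by vehicle_id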
| join kind=leftouter (
    vehicles | project vehicle_id, line, depot
  ) on vehicle_id
| order by alert_count desc

Scheduled alerts with KQL

KQL queries against Azure Data Explorer can be scheduled to run automatically every N minutes and to trigger notifications when the result exceeds a threshold. IN-SIGHT uses this mechanism to detect slow trends that the local EKF would not catch: the EKF is well suited to sudden anomalies, whereas gradual trends over weeks require historical context.

// Trend alert — runs every 6 hours
telemetry
| where metric == "bearing_rms"
  and timestamp between (ago(7d) .. ago(1d))
| make-series v = avg(value)
    on timestamp step 6h
    by vehicle_id
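// series_fit_line returns (rsquare, slope, variance, rvariance, interception, line_fit)
| extend (rsq, slope, variance, rvariance, intercept, line_fit) =
    series_fit_line(v)
| where slope > 0.005    // rise of more than 0.005 RMS units per 6 h bin
  and rsq > 0.7          // the line explains more than 70% of the variance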
| project vehicle_id, slope, rsq
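
To make the two thresholds in the trend alert concrete, here is a small Python sketch of an ordinary least-squares line fit over a binned series, followed by the same slope and R² filter. It assumes the fit is taken against the bin index (0, 1, 2, ...), so the slope is expressed per 6 h bin rather than per hour. The function names and the two synthetic series are hypothetical, and this is an approximation for illustration, not the Kusto implementation.

```python
import numpy as np


def fit_line(values):
    """Least-squares line over bin indices 0..n-1; returns (slope, intercept, r2)."""
    y = np.asarray(values, dtype=float)
    x = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    total = np.sum((y - y.mean()) ** 2)
    # R^2 is undefined for a constant series; report 0 in that case
    r2 = 1.0 - np.sum(residuals ** 2) / total if total > 0 else 0.0
    return slope, intercept, r2


def flag_degrading(series_by_vehicle, min_slope=0.005, min_r2=0.7):
    """Return vehicles whose fitted slope and R^2 both exceed the thresholds."""
    flagged = {}
    for vehicle_id, values in series_by_vehicle.items():
        slope, _, r2 = fit_line(values)
        if slope > min_slope and r2 > min_r2:
            flagged[vehicle_id] = (slope, r2)
    return flagged


# Hypothetical 6 h bins: one steadily rising series, one flat series
bins = np.arange(24)
fleet = {
    "rising": 1.0 + 0.01 * bins,
    "flat": np.full(24, 1.0),
}
flagged = flag_degrading(fleet)
slope_per_hour = flagged["rising"][0] / 6.0  # convert per-bin slope to per-hour
```

Note that R² measures how well a straight line describes the series, not statistical significance, so a high value on very few bins should still be read with caution.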