Majorbird for RAKwireless
Architecture Proposal

From Polling to Event-Driven

A scalable architecture redesign for RAKwireless — replacing synchronous HTTP calls and scheduled actions with an event-driven message bus, enabling real-time data flow, fault tolerance, and horizontal scalability.


How It Works Today

The current system relies on scheduled actions (cron polling) for inter-service communication and direct synchronous HTTP calls for external APIs. Every data exchange requires either a timer-based query or a blocking request.

☁️ Online / Cloud
🛒 Shopify
⚙️ Odoo ERP (HK ↔ CN multicompany)
📡 WISDM API (device activation)
📦 Shipping Forwarder (label generation)

⚡ Network Boundary: Factory LAN
🏭 MES (Offline Odoo): scanning, WO processing, SN/EUI capture
🔧 Machine Bridge: PCBA testers, local machines
🖨️ Print App → CUPS: polls print_job table

Legend: cron polling (scheduled action); synchronous HTTP call

Why This Doesn't Scale

The current polling-and-HTTP architecture creates compounding performance, reliability, and maintainability issues as traffic grows.

ISSUE 01

Database Pressure From Polling

Scheduled actions query large tables every cycle, even when nothing has changed. During peak scanning, MES tables grow rapidly and these repeated queries compete with Odoo workers for the same database resources.

ISSUE 02

Synchronous HTTP Fragility

When WISDM or the shipping forwarder is slow or unreachable, the Odoo cron blocks or fails silently. There's no retry mechanism, no dead-letter handling, and no visibility into what was lost.

ISSUE 03

Latency From Cron Intervals

MES ↔ ERP sync waits for the next cron tick. Print jobs wait for the next polling cycle. Every data flow has an artificial delay dictated by timer intervals, not by actual data readiness.
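To put a rough number on this delay: if a record becomes ready at a uniformly random moment between two cron ticks, it waits half the interval on average and the full interval in the worst case. The sketch below works that out. The 5-minute interval is a hypothetical figure for illustration, not a measured RAKwireless setting.

```python
def polling_delay(interval_s: float) -> tuple[float, float]:
    """Return (average, worst-case) wait in seconds for a record that becomes
    ready at a uniformly random moment between two cron ticks."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return interval_s / 2, interval_s


# Hypothetical 5-minute cron interval (an assumption, not a real setting).
average_s, worst_s = polling_delay(300)
print(f"average wait: {average_s:.0f}s, worst case: {worst_s:.0f}s")
```

When several cron-driven hops are chained (for example MES to ERP, then ERP to an external API), these waits add up hop by hop.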

ISSUE 04

No Backpressure or Buffering

Machines push directly to the MES API. During production peaks, MES has no way to say "slow down": it either processes everything or drops requests. There is no queue to absorb bursts.

ISSUE 05

Tight Coupling Between Services

Every integration is a point-to-point connection. Adding a new consumer (analytics, a new API, monitoring) means modifying existing scheduled actions and creating new HTTP endpoints.

ISSUE 06

No Observability

When something fails, there's no centralized view. Failed HTTP calls may be logged in Odoo, but there's no cross-system tracing. Understanding "what happened to order X" requires checking multiple systems manually.

Event-Driven with Message Bus

Replace polling and direct HTTP calls with a federated RabbitMQ message bus. Each service publishes events and subscribes to what it needs. Each Node-RED / bridge container is isolated per service — no credential sharing across boundaries.
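How a subscriber selects "what it needs" depends on topic patterns. The following is a minimal pure-Python sketch of AMQP topic-exchange matching, where `*` matches exactly one dot-separated word and `#` matches zero or more. It is not RabbitMQ's implementation, and routing keys such as `mo.confirmed` are hypothetical examples rather than names taken from the proposal.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic matching: '*' is exactly one word, '#' is zero or more."""

    def match(p: list, k: list) -> bool:
        if not p:
            return not k  # pattern exhausted: match only if the key is too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may consume zero or more words of the key
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


# Hypothetical routing keys, for illustration only.
print(topic_matches("mo.*", "mo.confirmed"))      # one word after "mo"
print(topic_matches("mo.*", "mo.state.changed"))  # two words after "mo"
print(topic_matches("mo.#", "mo.state.changed"))  # '#' spans both words
```

One design consequence: a binding such as `mo.*` only sees keys with a single word after the prefix. If routing keys grow deeper than that, bindings need `#` instead.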

☁️ Online / Cloud
🛒 Shopify (existing connector)
⚙️ Odoo ERP (HK ↔ CN multicompany)
🔴 Node-RED / Bridge (Odoo ↔ MQ adapter)
🐇 RabbitMQ Online Broker (topics: mo.*, wo.*, shipping.*, wisdm.*, mes.sync.*)
📊 Monitoring (Grafana + Prometheus)
🔴 Node-RED / Bridge (WISDM consumer)
📡 WISDM API
🔴 Node-RED / Bridge (shipping consumer)
📦 Shipping API

⚡ Federation / Shovel: Store & Forward

🐇 RabbitMQ Offline Broker (topics: mo.*, wo.*, mes.sync.*, print.*, machine.*)
🔴 Node-RED / Bridge (MES sync)
🏭 MES (Offline Odoo): scanning, WO processing
🔴 Node-RED / Bridge (print consumer)
🖨️ CUPS Server
🔴 Node-RED / Bridge (machine data)
🔧 Machine Bridge (PCBA, testers)
🏭 Factory Floor

Legend: event flow (publish/subscribe); offline event flow; federation link (store & forward)
🔐

Credential Isolation by Design

Each Node-RED / bridge instance is a dedicated container scoped to one external service. WISDM credentials live only in the WISDM bridge. Shipping API keys live only in the shipping bridge. No credential sharing across boundaries — each bridge is an external extension of the system it connects to.
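One way a bridge could enforce this at startup is to read only its own credentials and flag any that belong to another service. The sketch below assumes credentials arrive as environment variables with a per-service prefix. The prefixes and variable names (`WISDM_API_KEY`, `SHIPPING_API_KEY`, and so on) are hypothetical, not part of the proposal.

```python
# Hypothetical service prefixes; the real naming scheme is an assumption.
KNOWN_SERVICES = ("WISDM", "SHIPPING", "ODOO")


def split_credentials(service: str, environ: dict, known_services=KNOWN_SERVICES):
    """Return (own, foreign): this bridge's settings with the prefix stripped,
    plus the names of any variables that belong to other known services."""
    own_prefix = service.upper() + "_"
    own, foreign = {}, []
    for name, value in environ.items():
        if name.startswith(own_prefix):
            own[name[len(own_prefix):].lower()] = value
        elif any(name.startswith(s + "_") for s in known_services):
            foreign.append(name)
    return own, sorted(foreign)


# A misconfigured WISDM bridge that was also handed a shipping key.
env = {
    "WISDM_API_KEY": "k1",
    "WISDM_BASE_URL": "https://wisdm.example",
    "SHIPPING_API_KEY": "k2",
    "PATH": "/usr/bin",
}
own, foreign = split_credentials("wisdm", env)
print(own)
print(foreign)  # a non-empty list is a reason to refuse to start
```

In a real deployment the bridge would pass `os.environ` and exit if `foreign` is non-empty, so a leaked secret is caught at container start rather than after a compromise.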

What This Changes

Moving from polling to event-driven improves latency, database load, fault tolerance, burst handling, extensibility, and credential security.

~0s

Real-Time Data Flow

Events fire on state change, not on timer. MO updates, print jobs, shipping triggers — all propagate in milliseconds instead of waiting for the next cron cycle.

−70%

Database Load Reduction

Eliminate polling queries on large MES tables. With no scheduled actions scanning for "things to sync", the database does work only when real data flows through. The −70% figure is a projection and should be validated against measured query load.

Fault Tolerance

If WISDM is down, messages queue in RabbitMQ and deliver when it recovers. Dead-letter queues capture failures for replay. No more silent data loss from failed HTTP calls.
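The retry-then-dead-letter behaviour can be sketched without a broker. The function below is an in-memory illustration only: RabbitMQ provides this natively through dead-letter exchanges, and the names `drain`, `activate`, and the sample EUIs are hypothetical.

```python
from collections import deque


def drain(messages, handler, max_attempts=3):
    """Run handler on each message. A failing message is re-queued until it
    has been tried max_attempts times, then recorded as a dead letter."""
    dead_letters = []
    pending = deque((msg, 1) for msg in messages)
    while pending:
        msg, attempt = pending.popleft()
        try:
            handler(msg)
        except Exception as exc:
            if attempt < max_attempts:
                pending.append((msg, attempt + 1))  # retry later
            else:
                dead_letters.append(
                    {"message": msg, "error": str(exc), "attempts": attempt}
                )
    return dead_letters


processed = []


def activate(msg):
    """Stand-in for a WISDM activation call; rejects one known-bad input."""
    if msg == "bad-eui":
        raise ValueError("rejected")
    processed.append(msg)


dead = drain(["eui-1", "bad-eui", "eui-2"], activate)
print(processed)
print(dead)
```

The good messages are delivered despite the failure in between, and the bad one is kept with its error and attempt count so it can be inspected and replayed instead of being lost.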

Natural Backpressure

During production peaks, the message queue absorbs burst traffic while MES and other consumers process at their sustainable rate. Machines are not rejected unless the broker's own queue limits are reached.
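The buffering idea can be illustrated with a bounded in-memory queue from the Python standard library. This is a sketch of the concept, not of RabbitMQ itself. The capacity of 5 and the integer "readings" are arbitrary values chosen for the example.

```python
import queue


def offer_burst(buffer: queue.Queue, readings):
    """Try to enqueue every reading without blocking. Returns the number
    accepted and the readings that must be retried later."""
    accepted, deferred = 0, []
    for reading in readings:
        try:
            buffer.put_nowait(reading)
            accepted += 1
        except queue.Full:
            deferred.append(reading)  # explicit "slow down" signal
    return accepted, deferred


buffer = queue.Queue(maxsize=5)  # arbitrary capacity for the example

# A burst of 8 readings arrives while the consumer is idle.
accepted, deferred = offer_burst(buffer, range(8))
print(accepted, deferred)

# The consumer processes two items, which frees two slots.
for _ in range(2):
    buffer.get_nowait()

# The producer retries what was deferred.
accepted_retry, still_deferred = offer_burst(buffer, deferred)
print(accepted_retry, still_deferred)
```

The point of the design is that a full buffer produces a visible, retryable signal to the producer instead of a dropped request.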

+1

Effortless Extensibility

Need analytics? Subscribe to the bus. New API integration? Add a consumer. No existing code changes — just attach a new listener to existing topics.

🔒

Security Isolation

Each bridge container holds only the credentials for its target system. Compromising one integration exposes no credentials for the others.

Full-Stack Monitoring

With all data flowing through the message bus, we gain a single point of observability across every integration path.

🐇

RabbitMQ Management

Built-in dashboard for queue depths, message rates, consumer health, and federation link status between online and offline brokers.

📈

Grafana + Prometheus

Metrics from all containers — CPU, memory, message throughput, error rates. Custom dashboards for production KPIs and alert thresholds.

🔴

Node-RED Debug

Visual flow debugging for every integration path. See messages flowing in real time, inspect payloads, and troubleshoot without touching code.

🚨

Alerting

Queue depth thresholds, consumer lag, federation link drops, dead-letter queue entries. Notifications via webhook, email, or messaging.
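The alert conditions listed above can be expressed as a small rule check over per-queue statistics. The sketch below is illustrative: the `QueueStats` fields, the queue names, and the depth threshold of 1000 are assumptions, and a real setup would more likely define these rules in Prometheus or Grafana.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueStats:
    """Snapshot of one queue; field names are hypothetical."""
    name: str
    depth: int
    consumers: int
    dead_lettered: int


def evaluate_alerts(stats, max_depth=1000):
    """Return one human-readable alert per violated rule."""
    alerts = []
    for s in stats:
        if s.depth > max_depth:
            alerts.append(f"{s.name}: depth {s.depth} exceeds {max_depth}")
        if s.consumers == 0:
            alerts.append(f"{s.name}: no active consumers")
        if s.dead_lettered > 0:
            alerts.append(f"{s.name}: {s.dead_lettered} dead-lettered message(s)")
    return alerts


snapshot = [
    QueueStats("print.jobs", depth=12, consumers=1, dead_lettered=0),
    QueueStats("wisdm.activation", depth=4200, consumers=0, dead_lettered=3),
]
for alert in evaluate_alerts(snapshot):
    print(alert)
```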

Side by Side

Before vs. After

Dimension | Current Architecture | Proposed Architecture
Data flow trigger | ⏱ Cron timer (every X min) | ⚡ On event (instant)
External API calls | Synchronous, blocking | Async with retry & DLQ
Failure handling | Silent failure, data loss | Store-and-forward, replay
DB load pattern | Constant polling queries | On-demand, event-triggered
Peak traffic handling | Direct API hit, no buffer | Queue absorbs bursts
Adding new consumers | Modify cron + new endpoints | Subscribe to topic
MES print latency | Next poll cycle (sec–min) | Sub-second delivery
Offline/online sync | Cron pull/push, delay + risk | Federation, auto store & forward
Observability | Per-system logs, manual tracing | Centralized metrics & dashboards
Credential security | Shared in Odoo server actions | Isolated per bridge container

Incremental Migration

Each phase runs alongside existing scheduled actions. Old cron jobs serve as fallback until the new path is validated, then are decommissioned.
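While the old cron path and the new event path run side by side, the same change may arrive twice. One common safeguard is an idempotent handler that applies each event once, keyed by a unique event ID. The sketch below assumes such an ID exists. The class name and the in-memory `set` are illustrative, and a real deployment would need a durable store.

```python
class IdempotentHandler:
    """Applies each event at most once, keyed by event_id."""

    def __init__(self, apply):
        self._apply = apply
        self._seen = set()  # in-memory only; an assumption for the sketch

    def handle(self, event_id: str, payload: dict) -> bool:
        """Return True if the payload was applied, False if it was a duplicate."""
        if event_id in self._seen:
            return False
        self._apply(payload)
        # Record the ID only after a successful apply, so a failure can be retried.
        self._seen.add(event_id)
        return True


applied = []
handler = IdempotentHandler(applied.append)

# The event bus delivers the print job first...
print(handler.handle("print-job-17", {"printer": "line-1"}))
# ...then the fallback cron path picks up the same job.
print(handler.handle("print-job-17", {"printer": "line-1"}))
print(applied)
```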

Phase 1 — Foundation

Deploy Brokers & Print Flow

Containerize RabbitMQ (online + offline) with federation configured. Set up Node-RED bridges. Migrate the print job flow first — it's self-contained, high-impact, and proves the entire pattern end to end.

Lowest risk, immediate impact on MES DB
Phase 2 — External APIs

WISDM & Shipping Consumers

Move WISDM device activation and shipping forwarder label requests to async consumers via dedicated bridge containers. Add dead-letter queues for retry logic. Remove blocking HTTP calls from Odoo cron.

Eliminates most fragile failure points
Phase 3 — Core Sync

MES ↔ ERP Event-Driven Sync

Replace the MO/WO pull-and-push scheduled actions with event-driven federation. Odoo publishes state changes, the offline broker replicates them to MES, and completed work orders flow back the same way in the opposite direction.
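A state-change event needs a stable, serializable shape so that both brokers and every consumer agree on it. The following is one possible envelope, offered as a sketch. The field names, the `mo.confirmed` routing key, and the sample states are assumptions, not a schema defined by the proposal.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class StateChangeEvent:
    """Hypothetical envelope for an MO/WO state change."""
    routing_key: str
    record_id: int
    old_state: str
    new_state: str
    # A unique ID lets consumers detect duplicates after redelivery.
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "StateChangeEvent":
        return cls(**json.loads(raw))


event = StateChangeEvent("mo.confirmed", 42, "draft", "confirmed")
wire = event.to_json()                       # what would be published
restored = StateChangeEvent.from_json(wire)  # what a consumer would rebuild
print(restored == event)
```

Carrying both the old and new state lets a consumer reject an out-of-order event instead of overwriting newer data.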

Largest performance gain, needs careful validation
Phase 4 — Full Bus

Machine Bridge & Monitoring

Route PCBA tester and machine data through the offline message bus. Deploy Grafana + Prometheus stack. All data flows are now observable, buffered, and event-driven. Decommission remaining cron jobs.

Complete architecture migration