In automation and BMS, the question is rarely “does LTE/5G work?” but “does it work predictably?” Coverage can fluctuate, the modem can be handed over between cells unexpectedly, and carriers introduce policies that look like random outages from the outside. That’s why two concepts keep appearing in OT projects: Dual SIM and WAN failover.

This article answers one practical question: when does link redundancy truly save a project, and when is it just a “nice feature on a datasheet.” We’ll focus on the mechanics, real failure scenarios, and common configuration mistakes that make failover exist “on paper,” but fail in a real incident.

Why Dual SIM in an industrial router

Dual SIM in an industrial router is the simplest way to reduce carrier-related risk. In an ideal world, LTE/5G outages don’t exist. In the real world you get base-station congestion, network upgrades, backhaul failures, “coverage holes” at certain times and locations, and sometimes degraded signal quality due to environmental changes (e.g., new buildings).

A second SIM from a different carrier works like insurance: when one network has trouble, the router switches to the other. The key point is that redundancy should not mean “two SIMs in the same network,” but genuinely independent paths: different carriers, different radio infrastructure, and often different backhaul routes.

Worth distinguishing: Dual SIM vs “two SIM slots”
Not every router with two SIM slots provides meaningful switching automation. What matters is: how it detects failure, how fast it switches, whether it returns to the primary SIM, and whether it can keep VPN access stable after the path changes.

What WAN failover is and how it works

WAN failover is the logic that shifts traffic to a backup link when the primary link no longer meets operating conditions. Important: “the link is down” does not always mean no coverage. In automation, you more often see cases where the modem is attached to the network, but IP traffic does not pass correctly (routing issues, congestion, carrier core problems, DNS failures, APN issues).

Proper failover is based on measuring link health rather than “do we have signal bars.” The router runs periodic checks (health-check) and decides whether to stay on the primary link or switch to the backup. Depending on the design, the backup can be: the second SIM, a second modem, a second carrier, or even a wired WAN (Ethernet).
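
As a rough illustration of a health check that measures real IP reachability rather than signal bars, the sketch below probes several endpoints by IP address (so a DNS failure alone cannot fail the check) and treats the link as healthy when enough of them answer. The function names and the `min_reachable` parameter are assumptions made for this example, not any router's actual API. A real router would also bind each probe to the WAN interface under test, which this sketch does not do.

```python
import socket
from typing import Callable, Iterable, Tuple

Endpoint = Tuple[str, int]  # (IP address, TCP port); hypothetical targets


def tcp_probe(endpoint: Endpoint, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint opens within the timeout."""
    try:
        with socket.create_connection(endpoint, timeout=timeout_s):
            return True
    except OSError:  # covers timeouts, refused connections, unreachable networks
        return False


def link_is_healthy(
    endpoints: Iterable[Endpoint],
    min_reachable: int = 1,
    probe: Callable[[Endpoint], bool] = tcp_probe,
) -> bool:
    """Healthy when at least `min_reachable` of the endpoints respond."""
    reachable = sum(1 for ep in endpoints if probe(ep))
    return reachable >= min_reachable
```

A call such as `link_is_healthy([("192.0.2.10", 443), ("192.0.2.20", 443)])` would use the default TCP probe; the 192.0.2.x addresses are from a documentation-only range and stand in for hosts you actually control or trust. Using more than one endpoint keeps a single unreachable host from being mistaken for a failed link.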

Failback and health-check – the details that matter

Two things most commonly break failover in practice: a badly designed health check and the lack of a sensible failback. The health check must detect a real loss of IP connectivity, not just “interface up/down.” Failback is the logic that returns traffic to the primary link once it is stable again – with hysteresis, so the router doesn’t “bounce” between links every minute.

  • Health-check: test more than one endpoint (e.g., multiple hosts) and don’t rely on DNS alone.
  • Thresholds and timers: set sensible delays to avoid switching on short fluctuations.
  • Failback: return to the primary link only after a stability window, not “immediately after the first successful check.”
  • VPN: verify that after a path change the tunnel re-establishes automatically and routing policies remain consistent.

A classic in distributed sites
“The router switched to SIM2, but VPN didn’t come back” – most often caused by routes/policies, NAT behavior, or overly aggressive timers. Failover is not only about switching the modem; it’s about preserving the logical access path.
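
The failover and failback rules described in this section can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the class name, the threshold names, and their default values are invented for illustration, and it assumes the router keeps probing the primary link while traffic runs over the backup. Hysteresis comes from requiring more consecutive good checks to return than failed checks to leave.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverController:
    fail_threshold: int = 3      # consecutive failed primary checks before switching
    recover_threshold: int = 10  # consecutive good primary checks before failback
    active: str = "primary"
    _fails: int = field(default=0, init=False, repr=False)
    _goods: int = field(default=0, init=False, repr=False)

    def on_primary_check(self, healthy: bool) -> str:
        """Feed one primary-link health-check result; return the active link."""
        if healthy:
            self._fails = 0
            self._goods += 1
        else:
            self._goods = 0  # any failure restarts the stability window
            self._fails += 1

        if self.active == "primary" and self._fails >= self.fail_threshold:
            self.active = "backup"
        elif self.active == "backup" and self._goods >= self.recover_threshold:
            self.active = "primary"
        return self.active
```

With `fail_threshold=3`, two isolated failures do not trigger a switch, and a single failed check during recovery resets the counter, so the router returns to the primary link only after an uninterrupted run of good checks.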

When it’s essential vs “nice to have”

Dual SIM and WAN failover are essential where loss of connectivity has real costs: downtime, missing alarms, missing billing/settlement data, risk of process parameters drifting out of range, or the need for an immediate service trip. They’re “nice to have” when connectivity is only auxiliary and an outage doesn’t create time pressure.

| Scenario | Risk during a link outage | Recommendation |
|---|---|---|
| SCADA 24/7 (distributed sites) | No alarms, no data, delayed response | Dual SIM + failover + monitoring |
| Remote PLC/HMI service | Service trip, longer repair time | Failover (Dual SIM for critical locations) |
| BMS (monitoring and alarms) | Missing notifications, delayed action risk | Dual SIM when SLA is high |
| Auxiliary data / low criticality | Low | Simpler setup, but with VPN and meaningful diagnostics |

How to choose carriers and SIMs for real redundancy

Redundancy only makes sense when paths are truly independent. In practice that means two different carriers, ideally with their behavior validated at the specific location (different times of day, different conditions). If radio conditions at the site are difficult, an external antenna and proper cable routing can be a better investment than “another SIM card.”

  • Choose SIMs with stability in mind, not only data volume.
  • Watch out for CGNAT limitations and remote-access requirements (VPN).
  • If possible, test link behavior under load and during peak hours.

Common deployment mistakes that make failover fail

  • “Dry” testing – switching works, but VPN, routing, and OT services weren’t verified after the switch.
  • DNS-only health-check – when DNS hiccups, failover switches unnecessarily, or misses a real IP issue.
  • Overly aggressive timers – the router bounces between links during short fluctuations.
  • No failback or no hysteresis – no return to primary, or “ping-pong” between WAN links.
  • Two SIMs with one carrier – “redundancy” without real independence.
  • Ignoring antennas – router inside a metal cabinet without a proper external antenna.
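
Several of the mistakes above can be caught before deployment by reviewing the configuration itself. The sketch below is one hypothetical way to do that: the `FailoverConfig` fields, the `"dns:"` / `"icmp:"` target prefixes, and both threshold constants are assumptions made for this example and do not correspond to any specific router's configuration format. Antenna placement and post-switch VPN behavior cannot be checked this way and still need on-site testing.

```python
from dataclasses import dataclass
from typing import List

# Illustrative thresholds only; suitable values depend on the site and carrier.
MIN_DETECTION_S = 15
MIN_FAILBACK_STABLE_S = 60


@dataclass
class FailoverConfig:
    sim1_carrier: str
    sim2_carrier: str
    check_targets: List[str]  # hypothetical format: "icmp:192.0.2.10", "dns:example.com"
    check_interval_s: int
    fail_threshold: int       # consecutive failed checks before switching
    failback_enabled: bool
    failback_stable_s: int    # stability window before returning to primary


def lint(cfg: FailoverConfig) -> List[str]:
    """Return human-readable warnings for common failover misconfigurations."""
    warnings: List[str] = []
    if cfg.sim1_carrier.strip().lower() == cfg.sim2_carrier.strip().lower():
        warnings.append("both SIMs use the same carrier: no real path independence")
    if not cfg.check_targets:
        warnings.append("no health-check targets: only interface status is monitored")
    elif all(t.startswith("dns:") for t in cfg.check_targets):
        warnings.append("health-check relies on DNS only")
    elif len(cfg.check_targets) < 2:
        warnings.append("health-check uses a single target")
    if cfg.check_interval_s * cfg.fail_threshold < MIN_DETECTION_S:
        warnings.append("timers are aggressive: short fluctuations may trigger a switch")
    if not cfg.failback_enabled:
        warnings.append("failback is disabled: the router never returns to primary")
    elif cfg.failback_stable_s < MIN_FAILBACK_STABLE_S:
        warnings.append("failback stability window is short: risk of ping-pong")
    return warnings
```

An empty list means none of these particular rules fired, not that the setup is proven to work.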

Testing and monitoring: how to verify failover really works

The simplest test is “pull the SIM and see.” That’s a good start, but it’s not enough, because real outages don’t look like a removed SIM. More often the modem stays attached to LTE, but traffic is congested or the VPN becomes unstable. That’s why testing should include:

  • Health-check validation – does it detect IP problems, not only interface status.
  • VPN tunnel test – does the tunnel come back automatically and keep correct routes after switching.
  • OT service test – after switching, do PLC/HMI/SCADA paths work, not just “internet access.”
  • Monitoring – switch logs, link-quality alerts, event history.
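
For the monitoring item, one simple signal worth extracting from switch logs is how many link switches occur within a short period, since a high count suggests “ping-pong.” The sketch below assumes the switch events have already been parsed into timestamps in seconds; the function name and the 10-minute default window are illustrative choices, not a standard.

```python
from typing import Sequence


def max_switches_in_window(switch_times_s: Sequence[float], window_s: float = 600.0) -> int:
    """Return the largest number of link switches seen in any window of `window_s` seconds."""
    times = sorted(switch_times_s)
    best = 0
    start = 0  # index of the oldest event still inside the current window
    for end, t in enumerate(times):
        while t - times[start] > window_s:
            start += 1
        best = max(best, end - start + 1)
    return best
```

A monitoring job could raise an alert when this value exceeds a limit you choose for the site, for example more than three switches in ten minutes, and prompt a review of the health-check timers.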

FAQ – Dual SIM and WAN failover in practice

Does Dual SIM mean both SIMs work at the same time?
Usually not. Most routers have a single modem, so one SIM is active and the other is on standby; keeping both online at once requires a dual-modem device. What matters is automatic switching and a sensible return-to-primary policy.
How does a router detect “the link is down” if the modem is still attached?
Through health-check: host/route reachability tests that confirm real IP connectivity, not just interface status.
Will failover break the VPN?
It can if configuration and routing aren’t prepared for WAN changes. In practice you need to validate automatic tunnel re-establishment and routing policies.
Do two SIMs from the same carrier make sense?
Usually only marginally. It reduces SIM card failure risk, but it does not protect against carrier network outages or congestion.
What matters more: Dual SIM or a good antenna?
It depends on the location. With weak signal, an antenna can fix the root cause, while Dual SIM protects against carrier outages and congestion.
How fast should failover switch?
It depends on checks and timers. Fast switching is good, but overly aggressive settings can cause “ping-pong” during short fluctuations.
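
To make the last answer concrete, the time needed to detect a failure can be estimated from the check settings. The sketch below assumes checks run at a fixed interval, a failed check takes the full timeout, the timeout is no longer than the interval, and the router switches after a set number of consecutive failures. The function and parameter names are made up for this example, and the result excludes the time needed to bring up the backup link and re-establish the VPN.

```python
from typing import Tuple


def detection_window_s(interval_s: float, fail_threshold: int, timeout_s: float) -> Tuple[float, float]:
    """Return (best_case, worst_case) seconds from link failure to the switch decision."""
    # Best case: the link fails just before a scheduled check.
    best = (fail_threshold - 1) * interval_s + timeout_s
    # Worst case: the link fails just after a successful check.
    worst = fail_threshold * interval_s + timeout_s
    return best, worst
```

For a 10-second interval, a threshold of 3 failures, and a 2-second timeout, this gives a window of 22 to 32 seconds. Shortening the interval or lowering the threshold speeds up detection, but makes it more likely that a brief fluctuation triggers a switch.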

Summary: link redundancy is a process, not a checkbox in a datasheet

Dual SIM and WAN failover can save a project, but only when they’re designed and tested end-to-end: health-check detects real IP issues, failback has hysteresis, and VPN plus routing are ready for a path change. In critical sites, this is one of the most cost-effective ways to increase availability and reduce service trips.

In the next article we’ll go one step further: “Industrial router – how to choose (LTE/5G, VPN, Dual SIM, failover)” and break selection down into practical criteria that matter in automation.