-rw-r--r-- | paper/safety-reset-paper.tex | 104 |
1 files changed, 59 insertions, 45 deletions
diff --git a/paper/safety-reset-paper.tex b/paper/safety-reset-paper.tex index f76f4af..f89aeae 100644 --- a/paper/safety-reset-paper.tex +++ b/paper/safety-reset-paper.tex @@ -30,6 +30,8 @@ \begin{document} +% https://eepublicdownloads.entsoe.eu/clean-documents/pre2015/publications/entsoe/Operation_Handbook/Policy_1_Appendix%20_final.pdf + \title{Ripples in the Pond: Transmitting Information through Grid Frequency Modulation} \titlerunning{Ripples in the Pond: Transmitting Information through Grid Frequency} \author{Jan Sebastian Götte \and Liran Katzir \and Björn Scheuermann} @@ -46,14 +48,16 @@ things'' \and Cyber-physical systems \and Hardware security \and Network Securit perfect security against such cyberphysical attacks is a monumental embedded engineering task---and observations do not indicate that current efforts meet the requirements of this task.%FIXME cite recent RECESSIM work - In this paper, we approach the smart grid safety issue by implementing an emergency override that can be used to - reset all connected devices to a known-good state and preempt subsequent compromise by cutting communication links. - To yield a fully fail-safe design, our system does not rely on the internet or other conventional telecommunication - networks to function. Instead, our system transmits error-corrected and cryptographically secured commands by - modulating grid frequency using a single large consumer such as a large aluminium smelter. This approach differs - from traditional Powerline Communication (PLC) systems in that it reaches every device within the same synchronous - area as the signal is embedded into the fundamental grid frequency instead of a superimposed voltage that is quickly - attenuated across long distances. + In this paper, we approach the smart grid safety issue by introducing a new, resilient broadcast communication + channel based on modulating grid frequency that can be used as a last resort during large-scale cyberattacks. 
To + demonstrate this channel, we have implemented an emergency override that can be used to reset potentially + compromised smart meters to a known-good state and preempt subsequent compromise by cutting communication links. + Our system transmits error-corrected and cryptographically secured commands by modulating grid frequency using a + single large consumer such as a large aluminium smelter. This approach differs from traditional Powerline + Communication (PLC) systems in that it reaches every device within the same synchronous area as the signal is + embedded into the fundamental grid frequency instead of a superimposed voltage that is quickly attenuated across + long distances. The system only requires a single transmitting station anywhere on the grid and as such can operate + fully independently of public telecommunication infrastructure. Using simulations we have determined that control of a $\SI{25}{\mega\watt}$ load would allow for the transmission of a cryptographically secured \emph{reset} signal within $15$ minutes. We have designed and constructed a @@ -112,22 +116,33 @@ Mitigation of these attacks through firmware security measures is unlikely to yi complexity of smart meter firmware makes firmware security extremely labor intensive. The diverse standardization landscape makes a coordinated, comprehensive response unlikely. -In this paper, instead of focusing on the very hard task of improving firmware security we introduce a pragmatic -solution to the---in our opinion likely---scenario of a large-scale compromise of smart meter firmware. In our concept -the components of the smart meter that are threatened by remote compromise are equipped with a physically separate -\emph{safety reset controller} that listens for a ``reset'' command transmitted through the electrical grid's frequency -and on reception forcibly resets the smart meter's entire firmware to a known-good state and disables all network -functionality to prevent re-compromise. 
Our safety reset controller receives commands through Direct Sequence Spread -Spectrum (DSSS) modulation carried out on grid frequency through a large controllable load such as an aluminium smelter. -After forward error correction and cryptographic verification it re-flashes the meter's main microcontroller over the -standard JTAG interface. Note that our modulation technique is \emph{changing the grid frequency itself}. This is +In this paper, we introduce \emph{Grid Frequency Modulation}, a new communication channel that can be used for grid-wide +broadcast without relying on any other telecommunication networks being operational. Grid Frequency Modulation uses +Direct Sequence Spread Spectrum (DSSS) modulation carried out on grid frequency through a large controllable load such +as an aluminium smelter. Note that Grid Frequency Modulation is \emph{changing the grid frequency itself}. This is fundamentally different in both generation and detection from systems such as traditional PLC that superimpose a signal -on grid voltage, but leave the underlying grid frequency itself unaffected. The safety reset controller is an -off-the-shelf microcontroller much smaller than the one used for the meter's main application controller. It measures -grid frequency from a voltage waveform acquired using its internal analog-to-digital-converter (ADC) directly connected -to the mains voltage input through a resistive divider chain. By using of an off-the-shelf microcontroller we keep the -implementation overhead of our solution low in engineering cost compared to an ASIC. By keeping its firmware small, we -can use a simpler and less expensive microcontroller, keeping per-unit implementation cost low. +on grid voltage, but leave the underlying grid frequency itself unaffected. As it requires high-fidelity control over a +large load or producer connected to the grid, Grid Frequency Modulation provides a degree of implicit sender +authentication. 
+ +To illustrate the utility of Grid Frequency Modulation we propose a pragmatic solution to the---in our opinion +likely---scenario of a large-scale compromise of smart meter firmware. Instead of improving firmware security or +resiliency of public telecommunication infrastructure, both of which are hard problems, we introduce the \emph{safety +reset controller} as a fail-safe that allows a utility to flush an attacker out of their deployed smart meters even +during large-scale disruption of telecommunication networks. In our concept the components of the smart meter that are +threatened by remote compromise are equipped with a physically separate microcontroller that listens for a ``reset'' +command transmitted through the electrical grid's frequency and on reception forcibly resets the smart meter's entire +firmware through a low-level programming interface such as JTAG to a known-good state that disables all network +functionality to prevent re-compromise and lock out the attackers until the device can be programmed with a patched +firmware by a service technician. As part of our prototype reset controller we have developed a simple cryptographic +command protocol based on the Lamport and Winternitz One-time Signature (OTS) schemes that our prototype reset +controller uses to receive an authenticated command to re-flash the smart meter's main microcontroller over the +standard JTAG interface. The safety reset controller is an off-the-shelf microcontroller much smaller than the one used +for the meter's main application controller. To receive grid frequency-modulated commands, it measures grid frequency +from a voltage waveform acquired using its internal analog-to-digital converter (ADC) directly connected to the mains +voltage input through a resistive divider chain. By using an off-the-shelf microcontroller we keep the implementation +overhead of our solution low in engineering cost compared to an ASIC. 
By keeping its firmware small, we can use a +simpler and less expensive microcontroller, keeping per-unit implementation cost low. \begin{figure} \centering @@ -338,11 +353,15 @@ Compared to traditional channels such as DSL, LTE or LoraWAN, grid frequency as resiliency advantage: If there is power, a grid frequency modulation system is operational. Both DSL and LTE systems not only require power but also require large amounts of centralized infrastructure to operate. Mesh networks such as LoraWAN can cover short distances up to $\SI{20}{\kilo\meter}$ without requiring infrastructure to be available, but for -longer distances LoraWAN relies on the public internet for its network backbone. Therefore, during an ongoing -cyberattack, grid frequency is promising as a communication channel as only a single transmitter facility must be -operational for it to function. After a power outage, it can function as soon as electrical power is restored, even -while the public internet and mobile networks are still offline and it is unaffected by cyberattacks that target -telecommunication networks. +longer distances LoraWAN relies on the public internet for its network backbone. Additionally, systems such as DSL, LTE +and LoraWAN are built around a point-to-point communication model and usually do not support a generic broadcast +primitive. During times when a large number of devices must be reached simultaneously this can lead to congestion of +local cellular towers or gateways. +Therefore, during an ongoing cyberattack, grid frequency is promising as a communication channel as only a single +transmitter facility must be operational for it to function, and this single transmitter can reach all connected devices +simultaneously. After a power outage, it can function as soon as electrical power is restored, even while the public +internet and mobile networks are still offline and it is unaffected by cyberattacks that target telecommunication +networks. 
\subsection{Characterizing Grid Frequency} \label{grid-freq-characterization} @@ -476,8 +495,10 @@ generated by recursively applying a hash function to this key a number of times. authorized by disclosing subsequent elements of this series. Unwinding the hash chain from the public key at the end of the chain towards the private key at its beginning, at each step a receiver can validate the current command by checking that it corresponds to the previously unknown input of the current step of the hash chain. Replay attacks are prevented -by recording the most recent valid command. This simple scheme does not afford much functionality but it results in very -short messages and removes the need for computationally public key cryptography inside the smart meter. +by recording the most recent valid command. Key revocation is supported by designating the last key in the chain as a +\emph{revocation key} upon whose reception the client devices advance their local hash ratchet without taking further +action. This simple scheme does not afford much functionality but it results in very short messages and removes the +need for computationally expensive public key cryptography inside the smart meter. % FIXME add more precise/formal description of crypto % FIXME add description of targeting/scope function? % FIXME somewhere above describe entire reset system architecture????!!! @@ -560,25 +581,17 @@ decoder, grid frequency estimation) proved to be very useful. In particular debu to run several thousand tests within seconds. In case of our DSSS demodulator, this modular testing and simulation architecture allowed us to simulate thousands of runs of our implementation on test data and directly compare it to our Jupyter/Python prototype. Since we spent more time polishing our embedded C implementation it turned out to perform -better than our Python prototype while still exhibiting the same fundamental response to changes to its parameters. 
One -significant bug we fixed in the embedded C version was the Python version's tendency towards incorrect decodings at even -very large amplitudes. +better than our Python prototype while still exhibiting the same fundamental response to changes to its parameters. In accordance with our initial estimations we did not run into any code space or computation bottlenecks from choosing floating point emulation instead of porting over our algorithms to fixed point calculations. The extremely slow sampling rate of our systems makes even heavyweight processing such as FFT or our brute force dynamic programming approach to DSSS demodulation possible well within our performance constraints. -Since we are only building a prototype we did not optimize firmware code size. Since we do not require any peripherals -except for an ADC and since our code is not speed-constrained, code size is likely to be the main factor affecting -per-unit cost in an in-field deployment of our concept. With this in mind, at around \SI{64}{\kilo\byte}, the compiled -code size of our demonstrator firmware implementation is slightly larger than we would like. The overall most -heavy-weight operations are the SHA512 implementation from libsodium and the FFT from ARM's CMSIS signal processing -library. Especially the SHA512 implementation has large potential for size optimization because it is highly optimized -for speed using extensive manual loop unrolling. Despite being larger than what we initially targeted, this firmware is -still small compared to the firmware space available in commercially deployed smart meters. We estimate that even -without additional optimizations, our PoC firmware is already within the realm of firmware size that could be -implemented in a commercially viable safety reset controller. +The safety reset controller does not require any peripherals except for an ADC. Thus we expect code size to be the main +factor affecting per-unit cost in an in-field deployment of our concept. 
At around \SI{64}{\kilo\byte}, our unoptimized +demonstrator firmware implementation is already on the lower end of the spectrum. Especially with some optimization we +expect safety reset controllers to be commercially viable given adequate political incentives. \section{Conclusion} \label{sec_conclusion} @@ -594,8 +607,9 @@ would be a feasible way to set up a transmitter with low hardware overhead. We protocol ready for embedded implementation in resource-constrained systems that allows triggering a safety reset with a response time of less than 30 minutes. We have experimentally validated our system using simulated grid frequency data in a demonstrator setup based on a commercial microcontroller as our safety reset controller and an off-the-shelf smart -meter. Source code and electronics CAD designs are available at the public repository listed at the end of this -document. +meter. The next step will be to conduct an experimental evaluation of our modulation scheme in +collaboration with a utility and an operator of a multi-megawatt load. Source code and electronics CAD designs are +available at the public repository listed at the end of this document. \printbibliography[heading=bibintoc]