author | jaseg <git@jaseg.de> | 2022-06-28 17:46:36 +0200 |
---|---|---|
committer | jaseg <git@jaseg.de> | 2022-06-28 17:46:36 +0200 |
commit | 2fd97644bd27654151899f95c5faa54933516d44 (patch) | |
tree | aff8d1748dd612a827ae202e4040cc896a8697c8 /paper/safety-reset-paper.tex | |
parent | a9e12eb9f1fb8780cedec41f8e1b89882af702e1 (diff) | |
paper: add crypto description
Diffstat (limited to 'paper/safety-reset-paper.tex')
-rw-r--r-- | paper/safety-reset-paper.tex | 239 |
1 files changed, 145 insertions, 94 deletions
diff --git a/paper/safety-reset-paper.tex b/paper/safety-reset-paper.tex index e3c3a41..732387a 100644 --- a/paper/safety-reset-paper.tex +++ b/paper/safety-reset-paper.tex @@ -69,7 +69,7 @@ approaches. A core issue with post-attack mitigation is that network connections between the utility and devices on consumer premises may not work due to the attack. Thus, mitigation strategies that involve devices on the consumer premises will need an out-of-band communication channel. -In this paper, we propose a novel, resilient, grid-wide communication technique based on \empH{grid frequency +In this paper, we propose a novel, resilient, grid-wide communication technique based on \emph{grid frequency modulation} (GFM) that can be used to broadcast short messages to all devices connected to the electrical grid. The grid frequency modulation channel is robust and can be used even during an ongoing attack. Based on our channel we propose the \emph{safety reset} controller, an attack mitigation technique that is compatible with most smart meter and IoT @@ -112,15 +112,15 @@ traditional PLC, any large industrial load that allows for fast computer control \label{fig_intro_flowchart} \end{figure} -Figure~\ref{fig_intro_flowchart} shows an overview of our concept, where a large aluminium smelter has been temporarily -re-purposed as a GFM transmitter. Two scenarios for its application are before or during a cyberattack, to stop an -attack on the electrical grid in its tracks, and after an attack while power is being restored to prevent a repeated -attack. In both scenarios, our concept is independent of telecommunication networks (such as the internet or cellular -networks) as well as broadcast systems (such as cable television or terrestrial broadcast radio) while requiring only -inexpensive signal processing hardware and no external antennas (such as are needed for satellite communication). 
A grid -frequency-based system can function as long as power is still available, or as soon as power is restored after the -attack. One powerful function this allows is ``flushing out`` an attacker from compromised smart meters after an attack, -before restoring smart meter internet connectivity. +Figure~\ref{fig_intro_flowchart} shows an overview of our concept using a smart meter as the target device and a large +aluminium smelter temporarily re-purposed as a GFM transmitter. Two scenarios for its application are before or during +a cyberattack, to stop an attack on the electrical grid in its tracks, and after an attack while power is being restored +to prevent a repeated attack. In both scenarios, our concept is independent of telecommunication networks (such as the +internet or cellular networks) as well as broadcast systems (such as cable television or terrestrial broadcast radio) +while requiring only inexpensive signal processing hardware and no external antennas (such as are needed for satellite +communication). A grid frequency-based system can function as long as power is still available, or as soon as power is +restored after the attack. One powerful function this allows is ``flushing out'' an attacker from compromised smart +meters after an attack, before restoring smart meter internet connectivity. Using simulations we have determined that control of a $\SI{25}{\mega\watt}$ load such as a large aluminium smelter, load bank or photovoltaic farm would allow for the transmission of a cryptographically secured safety reset signal within @@ -131,16 +131,16 @@ of decoding such signals on a resource-constrained microcontroller. Consumer devices are increasingly becoming \emph{smart}. Large numbers of IoT devices are connected through the public internet, and in several countries internet-connected Smart Meters can disconnect entire households from the grid in -case of unpaid bills. 
The increasing proliferation of smart devices on the consumer side presents an opportunity to grid -operators, who rely on forecasts for the cost-optimized control of generation and power flow. The core of the -\emph{Smart Grid} vision is that utilities can now gather detailed data for more accurate consumption forecasts, and in -some cases can even adjust parameters of large devices like water heaters to smooth out load spikes. +case of unpaid bills~\cite{anderson01}. The increasing proliferation of smart devices on the consumer side presents an +opportunity to grid operators, who rely on forecasts for the cost-optimized control of generation and power flow. The +core of the \emph{Smart Grid} vision is that utilities can now gather detailed data for more accurate consumption +forecasts, and in some cases can even adjust parameters of large devices like water heaters to smooth out load spikes. However, this increased degree of visibility and control comes with an increased IT security risk. In this paper we focus on scenarios where an attacker compromises a large number of grid-connected remote-controllable devices. This may -be simple smart home devices such as IoT light bulbs, but it may also include Smart Meters that are outfitted with a -remote disconnect switch as is common in some countries. By rapidly switching large numbers of such devices in a -coordinated manner, the attacker has the opportunity to de-stabilize the electrical +be simple smart home devices such as IoT-connected air conditioners, but it may also include Smart Meters that are +outfitted with a remote disconnect switch as is common in some countries. By rapidly switching large numbers of such +devices in a coordinated manner, the attacker has the opportunity to de-stabilize the electrical grid~\cite{zlmz+21,kgma21,smp18,hcb19}. 
In this paper, we focus on assisting the recovery procedure after a succesful attack because we estimate that this @@ -169,22 +169,22 @@ This work contains the following contributions: \item We carry out extensive simulations of our systems to determine its performance characteristics. \end{enumerate} -\subsection{Notation} +%\subsection{Notation} % FIXME drop or rework this section ; actually update notation to be consistent throughout -To a computer scientist there is one confusing aspect to the theory of grid frequency modulation. GFM can be seen as a -frequency modulation (FM) with a baseband signal in the band below approximately $f_m = \SI{5}{\hertz}$ that is -modulated on top of a carrier signal at $f_c = \SI{50}{\hertz}$ in case of the European electrical grid. The frequency -deviation $f_\Delta$ that the modulated carrier deviates from its nominal value of $f_m$ is very small at only a few -milli-Hertz. - -When grid frequency is measured by first digitizing the mains voltage waveform, then de-modulating digitally, the FM's -signal-to-noise ratio (SNR) is very high and is dominated by the ADC's quantization noise and nearby mains voltage noise -sources such as resistive droop due to large inrush current of nearby machines. - -Note that both the carrier signal at $f_c$ and the modulation signal at $f_m$ both have unit Hertz. To disambiguate -them, in this paper we will use \textbf{bold} letters to refer to the carrier waveform $\mathbf{U}$ or frequency -$\mathbf{f_c}$ as well as its deviation $\mathbf{f_\Delta}$, and we will use normal weight for the actual modulation -signal and its properties such as $f_m$. +%To a computer scientist there is one confusing aspect to the theory of grid frequency modulation. GFM can be seen as a +%frequency modulation (FM) with a baseband signal in the band below approximately $f_m = \SI{5}{\hertz}$ that is +%modulated on top of a carrier signal at $f_c = \SI{50}{\hertz}$ in case of the European electrical grid. 
The frequency +%deviation $f_\Delta$ that the modulated carrier deviates from its nominal value of $f_m$ is very small at only a few +%milli-Hertz. +% +%When grid frequency is measured by first digitizing the mains voltage waveform, then de-modulating digitally, the FM's +%signal-to-noise ratio (SNR) is very high and is dominated by the ADC's quantization noise and nearby mains voltage noise +%sources such as resistive droop due to large inrush current of nearby machines. +% +%Note that both the carrier signal at $f_c$ and the modulation signal at $f_m$ both have unit Hertz. To disambiguate +%them, in this paper we will use \textbf{bold} letters to refer to the carrier waveform $\mathbf{U}$ or frequency +%$\mathbf{f_c}$ as well as its deviation $\mathbf{f_\Delta}$, and we will use normal weight for the actual modulation +%signal and its properties such as $f_m$. \section{Background on the electrical grid} \subsection{Components and interactions} @@ -216,7 +216,13 @@ resistance to their source of mechanical power, or \emph{prime mover}, which wou and faster. Similarly, if consumption outpaced production, the increased mechanical load would slow down generators, ultimately leading to a collapse. -The frequency of the electrical grid is maintained at a fixed, stable level through several layers of measures. +On top of the grid's inherent mechanical inertia, several tiers of control systems are layered to stabilize mains +frequency during day-to-day operations. Fast-acting automatic primary control stabilizes temporary frequency excursions, +while slower automatic secondary control and manual tertiary control re-adjust devices' operating points back to their +nominal values after they have shifted due to primary control action. + +In day-to-day operation, the frequency of the electrical grid is maintained at a +fixed, stable level through several layers of control systems. 
\subsection{Black-start recovery} @@ -264,19 +270,18 @@ meters, smart meters can provide near-realtime data that the utility can use for \subsection{Powerline Communication (PLC)} -A core issue in smart metering is the communication channel from the meter to the greater world. Smart meters are -cost-constrained devices, which limits the use of landline internet or cellular conenctions. Additionally, electricity -meters are often installed in basements, far away from the customer's router and with soil and concrete blocking radio -signals. For these reasons, in some AMI deployments, powerline communication (PLC) has been chosen for the meters' -uplink. +A core issue in smart metering and demand-side response is the communication channel from the meter to the greater +world. Smart meters are cost-constrained devices, which limits the use of landline internet or cellular conenctions. +Additionally, electricity meters are often installed in basements, far away from the customer's router and with soil and +concrete blocking radio signals. For these reasons, in some AMI deployments, powerline communication (PLC) has been +chosen for the meters' uplink. Since the early days of the electrical grid, powerline communication has been used to control devices spread throughout the grid from a central transmitter~\cite{rs48}. PLC systems super-impose a modulated high-frequency signal on top of the grid voltage. When the carrier frequency of this modulation is in the audible frequency range, low data rates can be transmitted over distances of several tens of kilometers. By using a radio frequency carrier, higher data rates can be achieved across shorter distances. Audio frequency PLC, called ``ripple control'', is still used today by utilities to -enable ``demand-side response'', i.e.\ the remote switching of loads such as water heaters to avoid times of peak -electricity demand. 
+enable demand-side response, by remotely switching on and off water heaters to avoid times of peak electricity demand. Usually, such powerline communication systems are uni-directional but they are instance of bi-directional powerline communication for smart meter reading such as the italian smart meter deployment~\cite{ec03,rs48,gungor01,agf16}. @@ -287,7 +292,7 @@ communication for smart meter reading such as the italian smart meter deployment \subsection{IoT and Smart Grid security} The security of IoT devices as well as the smart grid has received extensive attention in the -literature~\cite{nbck+19,acsc20,smp18,ykll17,anderson01,anderson02,zlmz21,kgma21,hcb19,mpdm10,lzlw+20,chl20,lam21,olkd20,yomu+20,}. +literature~\cite{nbck+19,acsc20,smp18,ykll17,anderson01,anderson02,zlmz+21,kgma21,hcb19,mpdm+10,lzlw+20,chl20,lam21,olkd20,yomu+20}. The challenges of IoT device security and the security of smart meters and other smart grid devices are similar because smart grid devices are essentially IoT devices in a particularly sensitive location~\cite{acsc20}. In both device types, the challenge is that securing embedded firmware is difficult, and adding network interfaces and cost constraints only @@ -333,8 +338,8 @@ cause outsized damage. Electro-mechanical oscillation modes between different geographical areas of an electrical grid are a well-known phenomenon. In their book~\cite{rogers01}, Rogers and Graham provide an in-depth analysis of these oscillations and -their mitigation. In~\cite{grebe01}, Grebe, Kabouris, López Barba et al.\ analyzed modeskj inherent to the -continental european grid. A report on an event where an oscillation on one such mode caused a problem can be found in +their mitigation. In~\cite{grebe01}, Grebe, Kabouris, López Barba et al.\ analyzed modes inherent to the +continental European grid. A report on an event where an oscillation on one such mode caused a problem can be found in \cite{entsoe01}. 
In~\cite{zlmz+21}, Zou, Liu, Ma et al.\ analyzed the possibility of a modal attack in which electric vehicle chargers @@ -401,17 +406,17 @@ receiver hardware complexity. To the best of the authors' knowledge, grid frequency modulation has only ever been proposed as a communication channel at very small scales in microgrids before~\cite{urtasun01} and has not yet been considered for large-scale application. -Compared to traditional channels such as DSL, LTE or LoraWAN, grid frequency as a communication channel has a resiliency -advantage: If there is power, a grid frequency modulation system is operational. Both DSL and LTE systems not only -require power at their base stations, but also require centralized infrastructure to operate. Mesh networks such as -LoraWAN can cover short distances up to $\SI{20}{\kilo\meter}$ without requiring infrastructure to be available, but for -longer distances LoraWAN relies on the public internet for its network backbone. Additionally, systems such as DSL, LTE -and LoraWAN are built around a point-to-point communication model and usually do not support a generic broadcast -primitive. During times when a large number of devices must be reached simultaneously this can lead to congestion of -cellular towers and servers. Therefore, during an ongoing cyberattack, grid frequency is promising as a communication -channel because only a single transmitter facility must be operational for it to function, and this single transmitter -can reach all connected devices simultaneously. After a power outage, it can resume operation as soon as electrical -power is restored, even while the public internet and mobile networks are still offline. It is unaffected by +Compared to traditional channels such as Fiber To The Home (FTTH), 5G or LoraWAN, grid frequency as a communication +channel has a resiliency advantage: If there is power, a grid frequency modulation system is operational. 
Both FTTH and +5G systems not only require power at their base stations, but also require centralized infrastructure to operate. Mesh +networks such as LoraWAN can cover short distances up to $\SI{20}{\kilo\meter}$ without requiring infrastructure to be +available, but for longer distances LoraWAN relies on the public internet for its network backbone. Additionally, +systems such as FTTH, 5G and LoraWAN are built around a point-to-point communication model and usually do not support a +generic broadcast primitive. During times when a large number of devices must be reached simultaneously this can lead to +congestion of cellular towers and servers. Therefore, during an ongoing cyberattack, grid frequency is promising as a +communication channel because only a single transmitter facility must be operational for it to function, and this single +transmitter can reach all connected devices simultaneously. After a power outage, it can resume operation as soon as +electrical power is restored, even while the public internet and mobile networks are still offline. It is unaffected by cyberattacks that target telecommunication networks. \subsection{Characterizing Grid Frequency} @@ -458,7 +463,7 @@ this $1/f$ behavior, the spectrum shows several sharp peaks at time intervals wi $\SI{10}{\second}$, $\SI{60}{\second}$ or multiples of $\SI{300}{\second}$. These peaks are due to loads turning on- or off depending on wall-clock time, and demand forecasting not being able to precisely match the amplitude of these large changes in load. Besides the narrow peaks caused by this effect we can also observe two wider bumps at -$\SI{7.0}{\second}$ and $\SI{4.7}{\second}$. These bumps closely correlate with continental european synchonous area's +$\SI{7.0}{\second}$ and $\SI{4.7}{\second}$. These bumps closely correlate with continental European synchonous area's oscillation modes at $\SI{0.15}{\hertz}$ (east-west) and $\SI{0.25}{\hertz}$ (north-south)~\cite{grebe01}. 
\section{Grid Frequency Modulation} @@ -470,10 +475,10 @@ thyristor rectifier bank. Compared to this baseline solution, hardware and maint by repurposing a large industrial load as a transmitter. Going through a list of energy-intensive industries in Europe~\cite{ec01}, we found that an aluminium smelter would be a good candidate. In aluminium smelting, aluminium is electrolytically extracted from alumina solution. High-voltage mains power is -transformed, rectified and fed into about 100 series-connected electrolytic cells forming a \emph{potline}. Inside these -pots, alumina is dissolved in molten cryolite electrolyte at about \SI{1000}{\degreeCelsius} and electrolysis is -performed using a current of tens or hundreds of Kiloampère. The resulting pure aluminium settles at the bottom of the -cell and is tapped off for further processing. +transformed, rectified and fed into approximately 100 series-connected electrolytic cells forming a \emph{potline}. +Inside these pots, alumina is dissolved in molten cryolite electrolyte at approximately \SI{1000}{\degreeCelsius} and +electrolysis is performed using a current of tens or hundreds of Kiloampère. The resulting pure aluminium settles at the +bottom of the cell and is tapped off for further processing. Aluminium smelters are operated around the clock, and due to the high financial stakes their behavior under power outages has been carefully characterized. Power outages of tens of minutes up to two hours reportedly do not cause @@ -502,10 +507,10 @@ parts of the plant, as this is commonplace during routine maintenance activities Given the grid characteristics we measured using our custom waveform recorder and using a model of our transmitter, we can derive parameters for the modulation of our broadcast system. The overall network power-frequency characteristic of -the continental European synchronous area is about $\SI{25}{\giga\watt\per\hertz}$~\cite{entsoe02}. 
Thus, the main -challenge for a GFM system will be poor signal-to-noise ratio (SNR) due to low transmission power. A second layer of -modulation yielding some modulation gain beyond the basic amplitude modulation of the transmitter will be necessary to -achieve sufficient overall SNR. +the continental European synchronous area is approximately $\SI{25}{\giga\watt\per\hertz}$~\cite{entsoe02}. Thus, the +main challenge for a GFM system will be poor signal-to-noise ratio (SNR) due to low transmission power. A second layer +of modulation yielding some modulation gain beyond the basic amplitude modulation of the transmitter will be necessary +to achieve sufficient overall SNR. The grid's frequency noise has significant localized peaks that might interfere with this modulation. Further complicating things are the oscillation modes. A GFM system must be designed to avoid exciting these modes. However, @@ -595,15 +600,56 @@ correction~\cite{mackay01} and some cryptography. The goal of our PoC cryptograp sender of an emergency reset broadcast to authorize a reset command to all listening smart meters. An additional constraint of our setting is that due to the extremely slow communication channel all messages should be kept as short as possible. The solution we chose for our PoC is a simplistic hash chain using the approach from the Lamport and -Winternitz One-time Signature (OTS) schemes. Informally, the private key is a random bitstring. The public key is -generated by recursively applying a hash function to this key a number of times. Each smart meter reset command is then -authorized by disclosing subsequent elements of this series. Unwinding the hash chain from the public key at the end of -the chain towards the private key at its beginning, at each step a receiver can validate the current command by checking -that it corresponds to the previously unknown input of the current step of the hash chain. 
Replay attacks are prevented -by recording the most recent valid command. Keys revocation is supported by designating the last key in the chain as a -\emph{revocation key} upon whose reception the client devices advance their local hash ratchet without taking further -action. This simple scheme does not afford much functionality but it results in very short messages and removes the -need for computationally expensive public key cryptography inside the smart meter. +Winternitz One-time Signature (OTS) schemes~\cite{lamport02,merkle01}. Informally, the private key is a random +bitstring. The public key is generated by recursively applying a hash function to this key a number of times. Each smart +meter reset command is then authorized by disclosing subsequent elements of this series. Unwinding the hash chain from +the public key at the end of the chain towards the private key at its beginning, at each step a receiver can validate +the current command by checking that it corresponds to the previously unknown input of the current step of the hash +chain. Replay attacks are prevented by the device memorizing the most recent valid command. Key revocation is supported +by designating the last key in the chain as a \emph{revocation key} upon whose reception the client devices advance +their local hash ratchet without taking further action. This simple scheme does not afford much functionality but it +results in very short messages and removes the need for computationally expensive public key cryptography inside the +smart meter. + +Formally, we can describe our simple cryptographic protocol as follows. Given an $m$-bit cryptographic hash function $H +: \{0,1\}^*\rightarrow\{0,1\}^m$ and a private key $k_0 \in \{0,1\}^m$, we construct the public key as +$k_{n_\text{total}} = H^{n_\text{total}}(k_0)$ where $H^n(x)$ denotes the $n$-times recursive application of $H$ to +$x$, i.e.\ $\underbrace{H(H(\hdots H(}_{n\;\text{times}}x)))$. 
$n_\text{total}$ is the total number of signatures that the system can +issue. $n_\text{total}$ must be chosen with adequate safety margin to account for unpredictable future use of the +system. The choice of $n_\text{total}$ is of no consequence when a device checks reset authorization, but key generation +time grows linearly with $n_\text{total}$ since $H$ needs to be applied $n_\text{total}$ times. In practice, given the +speed of modern computers, values of $n_\text{total} > 10^9$ should pose no problem during key generation. For public +key $k_{n_\text{total}}$, the system can authorize up to $n_\text{total}$ commands by successively disclosing the $k_i$ +starting at $i=n_\text{total}-1$ and counting down until finally disclosing $k_0$. Since we only want to transmit a single bit of +information, we do not need any payload. Instead, we simply send a message $m = (k_i)$ consisting solely of $k_i$. The +receiver of a message $m$ can check that the message is a legitimate command by checking $\exists i<q: H^i(m) = +k_\text{last}$ where $k_\text{last}$ is the last valid command that was received, or the public key if no command has +been received yet. $q$ is the maximum lookup depth that +the device will accept as valid. To conserve processing power, $q$ should be chosen to be much smaller than +$n_\text{total}$. If $q$ is chosen too small, a device might become out of sync with the transmitter when it is disconnected +from the electrical grid for a long enough time for at least $q$ commands to be issued in the meantime. In practice, +this should not be a concern since only a few commands should be issued over the lifetime of the system. + +During an emergency situation, not all safety reset controllers might be online at the same time. In case the electrical +grid is restored piece by piece with safety reset controllers coming back online in batches, a utility might repeatedly +transmit the same reset command. 
In our protocol, we handle this situation by memorizing the last valid received command +on the device side, and only acting \emph{once} when a new command is received. The transmission of one command thus +becomes idempotent, and the utility can repeat the command until sufficiently many devices have received the command and +e.g.\ performed a safety reset. + +In our protocol, we define two commands, \emph{reset} and \emph{disarm}. We assign \emph{reset} and \emph{disarm} to the +$k_i$ alternately. For odd $i$, $k_i$ is a \emph{reset} command and for even $i$, $k_i$ is a \emph{disarm} command. To +trigger a safety reset, the utility transmits the next unused $k_{2i+1}$. The utility may transmit this command repeatedly +to also reset devices that have come online only after earlier transmissions have started. After a sufficient number of +devices have performed a safety reset, the utility then transmits the next disarm command, $k_{2i}$. When devices +receive the disarm command, they still update the last received command, but they do not perform any other action. + +The reason for interleaving two commands in this way is to prevent a specific attack scenario in which an attacker first +observes a safety reset command being transmitted, and then at a later time gains access to a large load that could act +as a grid frequency modulation transmitter. Without a \emph{disarm} command, this attacker could then later trigger a +safety reset in any device that has not received the original reset command yet. The \emph{disarm} command gives the +utility the option to revoke a prior \emph{reset} command for any devices that were offline during the original reset, +without triggering them to reset. + % FIXME add more precise/formal description of crypto % FIXME add description of targeting/scope function? % FIXME somewhere above describe entire reset system architecture????!!! @@ -667,12 +713,12 @@ signal unmodulated noise on both ends. 
For our proof of concept, before settling on the commercial smart meter we first tried to use an \texttt{EVM430-F6779} smart meter evaluation kit made by Texas Instruments. This evaluation kit did not turn out well for two main reasons. -One, it shipped with half the case missing and no cover for the terminal blocks. Because of this some work was required -to get it electrically safe. Even after mounting it in an electrically safe manner the safety reset controller -prototype would also have to be galvanically isolated to not pose an electrical safety risk since the main MCU is not -isolated from the grid and the JTAG port is also galvanically coupled. The second issue we ran into was that the -development board is based around a specific microcontroller from TI's \texttt{MSP430} series that is incompatible with -common JTAG programmers. +One, it shipped with half the case missing and no cover for the high-voltage terminal blocks. Because of this some work +was required to get it electrically safe. Even after mounting it in an electrically safe manner the safety reset +controller prototype would also have to be galvanically isolated to not pose an electrical safety risk since the main +MCU is not isolated from the grid and the JTAG port is also galvanically coupled. The second issue we ran into was that +the development board is based around a specific microcontroller from TI's \texttt{MSP430} series that is incompatible +with common JTAG programmers. Our initial assumption that a development kit would be easier to program than a commercial meter did not prove to be true. 
Contrary to our expectations the commercial meter had JTAG enabled allowing us to easily read out its stock @@ -689,16 +735,6 @@ implementation has no issues processing data in real-time due to the low samplin \section{Conclusion} \label{sec_conclusion} -\subsection{Applicability to IoT devices} - -\subsection{Discussion} -During an emergency in the electrical grid, the ability to communicate to large numbers of end-point devices is a -valuable tool for restoring normal operation. When a resilient communcation channel is available, loads such as smart -meters and IoT devices can be equipped with a supervisor circuit that allows for a remote ``safety reset'' that puts the -device into a safe operating state. Using this safety reset, an attacker that uses compromised smart meters or IoT -devices to attack grid stability can be interrupted before the can conclude their attack. During recovery from an -outage, a safety reset can be used to reduce stress on the system during a black start by temporarily disabling -non-essential loads such as air conditioners. In this paper we have developed an end-to-end design for a safety reset system that provides these capabilities. Our novel broadcast data transmission system is based on intentional modulation of global grid frequency. Our system is @@ -716,10 +752,28 @@ frequency data to trigger a commercial microcontroller to perform a firmware res next step in our evaluation will be to conduct an experimental evaluation of our modulation scheme in collaboration with an utility and an operator of a multi-megawatt load. +\subsection{Discussion} + +During an emergency in the electrical grid, the ability to communicate to large numbers of end-point devices is a +valuable tool for restoring normal operation. 
When a resilient communication channel is available, loads such as smart +meters and IoT devices can be equipped with a supervisor circuit that allows for a remote ``safety reset'' that puts the +device into a safe operating state. Using this safety reset, an attacker that uses compromised smart meters or IoT +devices to attack grid stability can be interrupted before they can conclude their attack. During recovery from an +outage, a safety reset can be used to reduce stress on the system during a black start by temporarily disabling +non-essential loads such as air conditioners. + The safety reset controller does not require any peripherals except for an ADC. Thus we expect code size to be the main factor affecting per-unit cost in an in-field deployment of our concept. At around \SI{64}{\kilo\byte}, our demonstrator -firmware implementation is viable on low-end microcontrollers. Thus, we expect safety reset controllers to be -commercially viable. +firmware implementation is viable on low-end microcontrollers. Given that modern smart meters and IoT devices usually +use complex Systems on Chip (SoCs), a safety reset controller could be integrated into the main application processor +itself at little added complexity. In summary, we expect safety reset controllers to be commercially viable. + +Safety reset controllers can be adapted to most IoT device and smart meter designs. Because they are independent of +other public utilities such as the internet or cellular networks, we believe in their potential as a last line of +defense providing resilience under large-scale cyberattacks. The next steps towards a practical implementation will be +a demonstration of broadcast data transmission through grid frequency modulation using a megawatt-scale +controllable load, followed by further optimization of the modulation, the data encoding and the demodulator +implementation. 
Source code and EDA designs are available at the public repository listed at the end of this document. @@ -732,9 +786,6 @@ Source code and EDA designs are available at the public repository listed at the \center{ \footnotesize - \center{This is version \texttt{\input{version.tex}\unskip} of this paper, generated on \today. The git repository - can be found at:} - - \center{\url{https://git.jaseg.de/safety-reset.git}} + \center{This is version \texttt{\input{version.tex}\unskip} of this paper, generated on \today.} } \end{document} |
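
Reviewer sketch of the hash-chain authorization that this commit adds to the paper. This is a minimal illustration under stated assumptions, not the authors' firmware. The names `H`, `iterate`, `Transmitter` and `Receiver` are hypothetical. The hash function (SHA-256 truncated to 16 bytes) is an assumption, since the diff does not name one. The sketch also assumes that the receiver knows $n_\text{total}$ so that it can track the index $i$ of the last accepted $k_i$ and tell odd (reset) from even (disarm) commands. The lookup starts at one hash step, so re-transmissions of the last accepted command cause no further action.

```python
import hashlib


def H(x: bytes) -> bytes:
    """One hash step. SHA-256 truncated to 16 bytes is an assumption of this sketch."""
    return hashlib.sha256(x).digest()[:16]


def iterate(x: bytes, n: int) -> bytes:
    """Return H^n(x), i.e. H applied n times to x."""
    for _ in range(n):
        x = H(x)
    return x


class Transmitter:
    """Holds the private key k_0 and discloses k_i for i = n_total-1 down to 0."""

    def __init__(self, k0: bytes, n_total: int):
        self.k0 = k0
        self.next_i = n_total - 1

    def next_command(self) -> bytes:
        if self.next_i < 0:
            raise RuntimeError("hash chain exhausted")
        k_i = iterate(self.k0, self.next_i)
        self.next_i -= 1
        return k_i


class Receiver:
    """Stores the last valid command, initially the public key k_{n_total}."""

    def __init__(self, public_key: bytes, n_total: int, q: int):
        self.k_last = public_key
        self.i_last = n_total  # chain index of k_last
        self.q = q             # maximum lookup depth

    def receive(self, m: bytes):
        """Return "reset", "disarm", or None for invalid or repeated messages."""
        x = m
        for steps in range(1, self.q):  # check 0 < i < q hash steps
            x = H(x)
            if x == self.k_last:
                self.k_last = m
                self.i_last -= steps
                # Odd index: reset command. Even index: disarm command.
                return "reset" if self.i_last % 2 == 1 else "disarm"
        return None
```

With an even `n_total`, the first disclosed element has an odd index and therefore acts as a reset. A device that was offline for the reset and first hears the following disarm element advances its stored command without resetting, which is the revocation behavior described in the paper text.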