Diffstat (limited to 'ma')
-rwxr-xr-x | ma/resources/grid_meas_device_front.jpg | bin | 0 -> 123994 bytes |
-rwxr-xr-x | ma/resources/grid_meas_device_open.jpg | bin | 0 -> 242492 bytes |
-rwxr-xr-x | ma/resources/prototype.jpg | bin | 0 -> 183372 bytes |
-rw-r--r-- | ma/safety_reset.bib | 2 |
-rw-r--r-- | ma/safety_reset.tex | 259 |
5 files changed, 146 insertions, 115 deletions
diff --git a/ma/resources/grid_meas_device_front.jpg b/ma/resources/grid_meas_device_front.jpg
Binary files differ
new file mode 100755
index 0000000..0f03bd9
--- /dev/null
+++ b/ma/resources/grid_meas_device_front.jpg
diff --git a/ma/resources/grid_meas_device_open.jpg b/ma/resources/grid_meas_device_open.jpg
Binary files differ
new file mode 100755
index 0000000..4f01618
--- /dev/null
+++ b/ma/resources/grid_meas_device_open.jpg
diff --git a/ma/resources/prototype.jpg b/ma/resources/prototype.jpg
Binary files differ
new file mode 100755
index 0000000..cb67b8f
--- /dev/null
+++ b/ma/resources/prototype.jpg
diff --git a/ma/safety_reset.bib b/ma/safety_reset.bib
index 0302159..c6e3df4 100644
--- a/ma/safety_reset.bib
+++ b/ma/safety_reset.bib
@@ -1630,7 +1630,7 @@ @Online{eevblog01,
author = {Dave Jones},
date = {2013-01-08},
- title = {EEVblog #409 - EDMI - Smart Meter Teardown},
+ title = {EEVblog 409 - EDMI - Smart Meter Teardown},
url = {https://www.youtube.com/watch?v=dm-yZ1N3xmc},
urldate = {2020-06-03},
}
diff --git a/ma/safety_reset.tex b/ma/safety_reset.tex index ea11ff9..dc5bbad 100644 --- a/ma/safety_reset.tex +++ b/ma/safety_reset.tex @@ -119,7 +119,7 @@ %FIXME: sprinkle this section with citations. In the power grid as in other engineered systems we can observe an ongoing diffusion of information systems into -industrial control systems. Automation of these control systems has been practised for the better part of a century +industrial control systems. Automation of these control systems has been practiced for the better part of a century already. Throughout the 20th century this automation was mostly limited to core components of the grid. Generators in power stations are computer-controlled according to electromechanical and economic models. Switching in substations is automated to allow for fast failure recovery. Human operators are still vital to these systems, but their tasks have @@ -143,7 +143,7 @@ shift from a purely passive role to being active participants of the electricity To match this new landscape of decentralized generation and unpredictable renewable resources the utility industry has had to adapt itself in major ways. One aspect of this adaption that is particularly visible to ordinary people is the computerization of end-user energy metering. Despite the widespread use of industrial control systems inside the -electrical grid and the far-reaching diffusion of computers into people's everyday lifes the energy meter has long been +electrical grid and the far-reaching diffusion of computers into people's everyday lives the energy meter has long been one of the last remnants of an offline, analog time. Until the 2010s many households were still served through electromechanical Ferraris-style meters that have their origin in the late 19th century\cite{borlase01,ukgov04,bnetza02}. 
Today under the umbrella term \emph{Smart Metering} the shift towards fully @@ -152,7 +152,7 @@ smooth overall with some countries severely lagging behind other countries. As a metering technology is usually standardized on a per-country basis. This leads to an inhomogenous landscape with in some instances wildly incompatible systems. Often vendors only serve a single country or have separate models of a meter for each country. This complex standardization landscape and market situation has led to a proliferation of highly complex, -custom-coded microcontroller firwmare. The complexity and scale of this often network-connected firmware makes for a +custom-coded microcontroller firmware. The complexity and scale of this often network-connected firmware makes for a ripe substrate for bugs to surface. A remotely exploitable flaw inside a smart meter's firmware\footnote{ @@ -178,7 +178,7 @@ components of the smart meter that are threatened by remote compromise are equip \emph{safety reset controller} that listens for a reset command transmitted through the electrical grid's frequency and on reception forcibly resets the smart meter's entire firmware to a known-good state. Our safety reset controller receives commands through Direct Sequence Spread Spectrum (DSSS) modulation carried out on grid frequency through a -large controllable load such as an aluminium smelter. After forward error correction and cryptographic verification it +large controllable load such as an aluminum smelter. After forward error correction and cryptographic verification it re-flashes the meter's main microcontroller over the standard JTAG interface. In this thesis, starting from a high level architecture we have carried out extensive simulations of our proposal's @@ -195,7 +195,7 @@ modern power grids. 
\subsection{Structure of the electrical grid} -The electical grid is composed of a large number of systems such as distribution systems, power stations and substations +The electrical grid is composed of a large number of systems such as distribution systems, power stations and substations interconnected by long transmission lines. Mostly due to ohmic losses\footnote{ Power dissipation of a resistor of resistance $R [\Omega]$ given current $I [A]$ is $P_\text{loss} [W] = U_\text{drop} \cdot I = I^2 \cdot R$. Fixing power $P_\text{transmitted} [W] = U_\text{line} \cdot I$ this yields a @@ -208,7 +208,7 @@ the efficiency of transmission of electricity through long transmission lines in voltage\cite{crastan01,simon01}. % simon01: p. 425, 9.4.1.1, crastan p.55, 3.1 In practice economic considerations take into account a reduction of the considerable transmission losses (about \SI{6}{\percent} in case of Germany\cite{destatis01}) as well as the cost of equipment such as additional transformers -and the cost increase for the increased volatage rating of components such as transmission lines. Overall these +and the cost increase for the increased voltage rating of components such as transmission lines. Overall these considerations have led to a hierarchical structure where large amounts of energy are transmitted over very long distances (up to thousands of kilometers) at very high voltages (upwards of \SI{200}{\kilo\volt}) and voltages get lower the closer one gets to end-customer premises. In Germany at the local level a substation will distribute @@ -248,12 +248,12 @@ only\cite{crastan03}. \subsubsection{Generators} Traditionally all generators in the power grid were synchronous machines. A synchronous machine is a generator whose -copper coils are wound and connected in such a way that during normal operation its rotation is synchonous with the grid -frequency. Grid frequency and generator rotation speed are bidirectionally electromechanically coupled. 
If a generator's -angle of rotation would lag behind the grid it would receive electrical energy from the grid and convert it into -mechanical energy, acting as a motor--When the machine leads it acts as a generator and is braked. Small deviations -between rotational speed and grid frequency will be absorbed by the electromechanical coupling between both. Maintaining -optimal synchronization over time is the task of complex control systems inside power stations' speed +copper coils are wound and connected in such a way that during normal operation its rotation is synchronous with the +grid frequency. Grid frequency and generator rotation speed are bidirectionally electromechanically coupled. If a +generator's angle of rotation would lag behind the grid it would receive electrical energy from the grid and convert it +into mechanical energy, acting as a motor--When the machine leads it acts as a generator and is braked. Small +deviations between rotational speed and grid frequency will be absorbed by the electromechanical coupling between both. +Maintaining optimal synchronization over time is the task of complex control systems inside power stations' speed governors\cite{simon01,crastan01}. Nowadays besides traditional rotating generators the grid also contains a large amount of electronically controlled @@ -318,7 +318,7 @@ purposes. A frequent use is as a series inductor on one of the phases or the neu currents. In addition to this inductors are also used to tune LC circuits. One such use are Petersen coils, large inductors in series with the earth connection at a transformer's star point that are used to quickly extinguish arcs between phase and ground on a transmission line. The Petersen coil forms a parrallel LC resonant circuit with the -transmission line's earth capacitance. Tuning this circuit through adjusting the petersen coil reduces earth fault +transmission line's earth capacitance. 
Tuning this circuit through adjusting the Petersen coil reduces earth fault current to a level low enough to quickly extinguish the arc\cite{simon01}. \subsubsection{Power factor correction} @@ -338,7 +338,7 @@ difference between the resulting current waveform and the mains voltage waveform the inductive ballasts in old fluorescent lighting fixtures. The second potential issue are loads with a non-sinusoidal current waveform. There are many classes of these but the -most common one are the switching-mode power supplies (SMPS) used in most modern electronic devicese.. Most SMPS have an +most common one are the switching-mode power supplies (SMPS) used in most modern electronic devices.. Most SMPS have an input stage consisting of a bridge rectifier followed by a capacitor that provide high-voltage DC power to the following switch-mode convert circuit. This rectifier-capacitor input stage under normal load draws a high current only at the very peak of the input voltage sinusoid and draws almost zero current for most of the period. @@ -356,7 +356,7 @@ fast-acting reactive power compensation devices whose purpose is to maintain a c \subsubsection{Loads} Lastly, there is the loads that the electrical grid serves. Loads range from mains-powered indicator lights in devices -such as light switches or power strips weighing in at mere milliwatts to large smelters in industrial metal production +such as light switches or power strips weighing in at mere Milliwatts to large smelters in industrial metal production that can consume a fraction of a gigawatt all on their own. \subsection{Operational concerns} @@ -389,7 +389,7 @@ parameters to increase overall efficiency. \section{Smart meter technology} Smart meters were a concept pushed by utility companies throughout the early 21st century. Smart metering is one component of the -larger societal shift towards digitally interconnected technology. 
Old analog meters required that service pesonnel +larger societal shift towards digitally interconnected technology. Old analog meters required that service personnel physically come to read the meter. \emph{Smart} meters automatically transmit their readings through modern technologies. Utility companies were very interested in this move not only because of the cost savings for meter reading personnel: An always-connected meter also allows several entirely new use cases that have not been possible before. One @@ -399,15 +399,16 @@ over time but adapts to market conditions. Models such as prepayment electricity automatically disconnected until they pay their bill are significantly aided by a fully electronic system that can be controlled and monitored remotely\cite{anderson02}. A remotely controllable disconnect switch can also be used to coerce customers in situations where that was not previously economically possible\footnote{ - The swiss association of electrical utility companies in Section 7.2 Paragraph (2)a of their 2010 whitepaper on the + The Swiss association of electrical utility companies in Section 7.2 Paragraph (2)a of their 2010 white paper on the introduction of smart metering\cite{vseaes01} cynically writes that remotely controllable disconnect switches ``lead - a new tenant to swiftly register'' with the utility company. This whitepaper completely vanished from their website + a new tenant to swiftly register'' with the utility company. This white paper completely vanished from their website some time after publication, but the internet archive has a copy. }. Figure \ref{fig_smgw_schema} shows a schema of a smart metering installation in a typical household\cite{stuber01}. \begin{figure} \centering - \includegraphics{resources/smgw_usage_scenario} + \includegraphics[width=\textwidth]{resources/smgw_usage_scenario} + \vspace*{1cm} \caption{A typical usage scenario of a smart metering system in a typical home. 
This diagram shows a gateway connected to multiple smart meters through its local metrological network (LMN) and a multitude of devices on the customer's home area network (HAN). A solar inverter and an electric car are connected through a controllable local @@ -421,18 +422,18 @@ electricity prices adapting to the market situation along with this convenience electricity and to consume it in a way that is more amenable to utilities, both leading to reduced cost\cite{borlase01,bmwi03,anderson02}. -Traditional Ferraris counters with their distinctive rotating aluminium disc are simple electromechanical devices. Since +Traditional Ferraris counters with their distinctive rotating aluminum disc are simple electromechanical devices. Since they do not include any semiconductors or other high technology that might be prone to failure a cheap Ferraris-style meter can last decades. In contrast to this, smart meters are complex high technology. They are vastly more expensive to develop in the first place since they require the development and integration of large amounts of complex, custom -firwmare. Once deployed, their lifetime is limited by this complexity. Complex semiconductor devices tend to fail, and +firmware. Once deployed, their lifetime is limited by this complexity. Complex semiconductor devices tend to fail, and firmware that needs to communicate with the outside world tends to not age well\cite{borkar01}. This combination of higher unit cost and lower expected lifetime leads to increased costs per household. This cost is usually shared between utility and customer. As part of its smart metering rollout the German government in 2013 had a study conducted on the economies of smart meter installations. This study came to the conclusion that for the majority of households computerizing an existing -ferraris meter is uneconomical. For larger consumers or new installations the higher cost of installation over time is +Ferraris meter is uneconomical. 
For larger consumers or new installations the higher cost of installation over time is expected to be offset by the resulting savings in electricity cost\cite{bmwi03}. \subsection{Smart metering and Human-Computer Interaction} @@ -462,9 +463,9 @@ As \cite{pierce01} pointed out smart metering developments could benefit greatly HCI research certainly would not have overlooked entire central issues such as privacy as it happened in the dutch case\cite{cuijpers01}. The current corporate-driven approach to a technological advance forced through national standardization bears a risk of failing to meet its ostensible objectives for consumers. The role of consumers and the -complex sociotechnological environment posed by this new technology is not seriously considered in the standardization +complex socio-technological environment posed by this new technology is not seriously considered in the standardization process. While certainly no one will admit to outright ignoring consumers in smart meter standardization, their role is -largely limited to the occassional public consultation. At the same time the standards are written by technologists--it +largely limited to the occasional public consultation. At the same time the standards are written by technologists--it seems largely without input on their practicality or socio-technological implications from fields such as HCI. % TODO citation? too much burn? @@ -478,7 +479,7 @@ demonstration setup). Specialized SoCs usually contain a segment LCD driver alo analog-to-digital converters for the actual measurement functions. In many smart meter designs the metering SoC is connected to another full-featured SoC acting as the modem. At a casual glance this might seem to be a security measure, but it is be more likely that this is done to ease integration of one metering platform with several different -communication stacks (e.g.\ proprietary sub-gigahertz wireless, powerline communication (PLC) or ethernet). 
In these +communication stacks (e.g.\ proprietary sub-gigahertz wireless, power line communication (PLC) or Ethernet). In these architectures there is a clear line of functional demarcation between the metering SoC and the modem. As evidenced by over-the-air software update functionality (see e.g.\ \cite{honeywell01}) this does not however extend to an actual security boundary. @@ -593,7 +594,7 @@ properties (e.g.\ lack of forward-secrecy)\cite{khurana01,sato01}. In this section we will give an overview of the situation in a number of countries. This list of countries is not representative and notably does not include any developing countries and is geographically biased. We selected these countries for illustration only and based our selection in a large part on the availability of information in a language -we can read. We will conclude this section with a summarization of common themes. +we can read. We will conclude this section with a summary of common themes. \subsubsection{Germany} @@ -656,7 +657,7 @@ recently privatized. While the wholesale market and transmission network privati retail customers continued to use the incumbent distribution system operator ENEL as their supplier\cite{ec03}. This dominant position allowed ENEL to orchestrate the large-scale rollout of smart meters in Italy. Almost every meter in Italy had been replaced by a smart meter by 2018\cite{ec03}. An unique feature of the Italian smart metering -infrastructure is that it relies on Powerline Communication (PLC) to bridge distances between meters and cellular radio +infrastructure is that it relies on Power Line Communication (PLC) to bridge distances between meters and cellular radio gateways\cite{gungor01}. 
\subsubsection{Japan} @@ -673,7 +674,7 @@ A unique point in the Japanese utility metering landscape is that the current pr Japan residential utility meters are usually mounted outside the building on an exterior wall and every month someone with a mirror on a long stick will come and read the meter. The meter reader then makes a thermal paper print-out of the updated utility bill and puts it into the resident's post box. This practice gives consumers good control over their -consumption but does incur significant pesonnel overhead. % TODO decide on citation. Maybe the toshiba one? +consumption but does incur significant personnel overhead. % TODO decide on citation. Maybe the toshiba one? \subsubsection{The USA} @@ -802,7 +803,7 @@ are used both for highly-granular load measurement and (in some countries) load Smart electricity meters are effectively consumer devices. They are built down to a certain price point that is measured by the burden it puts on consumers. The cost of a smart meter is ultimately limited by it being a major factor in the economies of a smart meter rollout\cite{bmwi03}. Cost requirements preclude some hardware features such as the use of a -standard hardened software environment on a high-powerded embedded system (such as a hypervirtualized embedded linux +standard hardened software environment on a high powered embedded system (such as a hypervirtualized embedded linux setup) that would both increase resilience against attacks and simplify updates. Combined with the small market sizes in smart grid deployments\footnote{ Most vendors of smart electricity meters only serve a handful of markets. For the most part, smart meter development @@ -823,7 +824,7 @@ systems. 
A spectacular example of this difficulty is the recently-exposed flaw i bootloader\footnote{ Modern system-on-chips integrate one or several CPUs with a multitude of peripherals, from memory and DMA controllers over 3D graphics accelerators down to general-purpose IO modules for controlling things like indicator - LEDs. Most SoCs boot from one of several boot devices such as flash memory, ethernet or USB according to a + LEDs. Most SoCs boot from one of several boot devices such as flash memory, Ethernet or USB according to a configuration set by pin-strapping configuration IOs or through write-only fuse bits. Physically, one of the processing cores of the SoC (usually one of the main CPU cores) is connected such that it is @@ -834,7 +835,7 @@ bootloader\footnote{ Apple's ROM loader measures only a few hundred bytes. It performs authorization checks to ensure only software authorized by Apple is booted. The present flaw allows an attacker to circumvent these checks and boot their own - code on a USB-connected iPhone. This compromsies Apple's chain of trust from ROM loader to userland right at its + code on a USB-connected iPhone. This compromises Apple's chain of trust from ROM loader to userland right at its root. Since this is a flaw in the factory-programmed first stage read-only boot code of the SoC it cannot be patched in the field. }, that allows a full compromise of any iPhone before the iPhone X. 
iPhone 8, one of the affected models, was still @@ -853,7 +854,7 @@ they have trouble securing their secure embedded software stacks, what is a smar Since thorough formal verification of code is not yet within reach for either large-scale software development or code heavy in side-effects such as embedded firmware or industrial control software\cite{pariente01} the two most effective -measures for embedded security are reducing the amount of code on one hand, and labour-intensively reviewing and testing +measures for embedded security are reducing the amount of code on one hand, and labor-intensively reviewing and testing this code on the other hand. A smart meter manufacturer does not have a say in the former since it is bound by the official regulations it has to comply with, and will likely not have sufficient resources for the latter. We are left with an impasse: Manufacturers in this field likely do not have the security resources to keep up with complex standards @@ -902,7 +903,7 @@ damage\cite{lee01}. Despite their potentially large impact, these attacks are only moderately interesting from a scientific perspective. For one, their mitigation mostly consists of a straightforward application of decades-old security best practices. Though -there is room for the implementation of genuinely new, power sytems-specific security systems in this field, the general +there is room for the implementation of genuinely new, power systems-specific security systems in this field, the general state of the art is lacking behind other fields of embedded security. From this background low-hanging fruit should take priority\cite{heise02}. Given political will these systems can readily be fortified. There is only a comparatively small number of them and having a technician drive to every one of them in turn to install a firmware security update is @@ -913,8 +914,8 @@ feasible. 
Control function exploits are attacks on the mathematical control loops used by the centralized control system. One example of this type of attack are resonance attacks as described in \cite{wu01}. In this kind of attack, inputs from peripheral sensors indicating grid load to the centralized control system are carefully modified to cause a -disproportionally large oscillation in control system action. This type of attack relies on complex resonance effects -that arise when mechanical generators are electrically coupled. These resonances, coloquially called ``modes'', are +disproportionately large oscillation in control system action. This type of attack relies on complex resonance effects +that arise when mechanical generators are electrically coupled. These resonances, colloquially called ``modes'', are well-studied in power system engineering\cite{rogers01,grebe01,entsoe01,crastan03}. Even disregarding modern attack scenarios, for stability electrical grids are designed with measures in place to dampen any resonances inherent to grid structure. These resonances are hard to analyze since they require an accurate grid model and they are unlikely to be @@ -1008,7 +1009,7 @@ control structure of yesterday's ``dumb'' grid and the advent of centralization centralized infrastructure has been carefully designed to defend against malicious actors and all involved parties have an interest in keeping it secure but in centralized systems scaling attacks is inherently easier than in decentralized systems\cite{anderson02}. An attacker can employ centralized control to their advantage. From this perspective the -centralization of smart metering control sytems--sometimes up to a national level\cite{anderson01,anderson02}--poses a +centralization of smart metering control systems--sometimes up to a national level\cite{anderson01,anderson02}--poses a security risk. 
\chapter{Restoring endpoint safety in an age of smart devices} @@ -1071,7 +1072,7 @@ to remain secure even if we assume some number of physically compromised devices \subsection{Attack characteristics} The attacker model the two above conditions must hold under is as follows. We assume three angles of attack: Attacks by the customer themselves, attacks by an insider within the metering systems controlling utility company and lastly attacks -from third parties. Examples for these third parties are hobbyist hackers or outside cyber-criminals on the one hand, +from third parties. Examples for these third parties are hobbyist hackers or outside cybercriminals on the one hand, but also other companies participating in the smart grid infrastructure besides the utility company such as intermediary providers of meter-reading services. @@ -1081,7 +1082,7 @@ expected to be able to assume either of the other two roles as well e.g. through generalized attacker model in \cite{fraunholz01} the authors give a classification of attacker types and provide a nice taxonomy of attacker properties. In their threat/capability rating, criminals are still considered to have higher threat rating than state-sponsored attackers. The New York Times reported in 2016 that some states recruit their hacking -personnel in part from cyber-criminals. If this report is true, in a worst-case scenario we have to assume a +personnel in part from cybercriminals. If this report is true, in a worst-case scenario we have to assume a state-sponsored attacker to be the worst of both types. Comparing this against the other attacker types in \cite{fraunholz01}, this state-sponsored attacker is strictly worse than any other type in both variables. We are left with a highly-skilled, very well-funded, highly intentional and motivated attacker. @@ -1125,7 +1126,7 @@ model attacks on the electrical grid components between the reset authority and of service attacks. 
To ensure the \emph{safety} criterion from Section \ref{sec_criteria} holds we must make sure our cryptography is secure against man-in-the-middle attacks and we must try to harden the system against denial-of-service attacks by the attacker types listed above. Given our attacker model we cannot fully guard against this sort of attack -but we can at least choose a commmunication channel that is resilient under the above model. +but we can at least choose a communication channel that is resilient under the above model. Finally, we have to consider the issue of hardware security. We will solve the problem of physical attacks by simply not programming any secret information into devices. This also simplifies hardware production. We consider supply-chain @@ -1154,10 +1155,11 @@ have started appearing in general-purpose microcontrollers, most still lack even for computers or smartphones. One of the components lacking from most microcontrollers is strong memory protection or even a memory mapping unit as it -is found in all modern computer processors and SoCs for applications such as smartphones. Without an MPU or MMU many -memory safety mitigations cannot be implemented. This and the absence of virtualization tools such as ARM's TrustZone -make hardening microcontroller firmware a big task. It is very important to ensure memory safety in microcontroller -firmware through tools such as defensive coding, extensive testing and formal verification. +is found in all modern computer processors and SoCs for applications such as smartphones. Without an MPU (Memory +Protection Unit) or MMU (Memory Management Unit) many memory safety mitigations cannot be implemented. This and the +absence of virtualization tools such as ARM's TrustZone make hardening microcontroller firmware a big task. It is very +important to ensure memory safety in microcontroller firmware through tools such as defensive coding, extensive testing +and formal verification. 
In our design we achieve simplicity on two levels: One, we isolate the very complex metering firmware from our reset controller by having both run on separate microcontrollers. Two, we keep the reset controller firmware itself extremely @@ -1221,23 +1223,23 @@ interfaces established in energy metering applications and evaluate each of them There is a number of well-established technologies for communication on or along power lines. We can distinguish three basic system categories: Systems using separate wires (such as DSL over landline telephone wiring), wireless radio -systems (such as LTE) and \emph{powerline communication} (PLC) systems that reüse the existing mains wiring and +systems (such as LTE) and \emph{power line communication} (PLC) systems that reüse the existing mains wiring and superimpose data transmissions onto the 50 Hz mains sine\cite{gungor01,kabalci01}. For our scenario, we will ignore short-range communication systems. There exists a large number of \emph{wideband} -powerline communication systems that are popular with consumers for bridging ethernet segments between parts of an +power line communication systems that are popular with consumers for bridging Ethernet segments between parts of an apartment or house. These systems transmit up to several hundred megabits per second over distances up to several tens of meters\cite{kabalci01}. Technologically, these wideband PLC systems are very different from \emph{narrowband} systems used by utilities for load management among other applications and they are not relevant to our analysis. -\subsection{Powerline communication (PLC) systems and their use} +\subsection{Power line communication (PLC) systems and their use} In long-distance communications for applications such as load management, PLC systems are attractive since they allow re-using the existing wiring infrastructure and have been used as early as in the 1930s\cite{hovi01}. 
Narrowband PLC systems are a potentially low-cost solution to the problem of transmitting data at small bandwidth over distances of several hundred meters up to tens of kilometers. -Narrowband PLC systems transmit on the order of kilobits per second or slower. A common use of this sort of system are +Narrowband PLC systems transmit on the order of Kilobits per second or slower. A common use of this sort of system are \emph{ripple control} systems. These systems superimpose a low-frequency signal at some few hundred Hertz carrier frequency on top of the 50Hz mains sine. This low-frequency signal is used to encode switching commands for non-essential residential or industrial loads. Ripple control systems provide utilities with the ability to actively @@ -1250,14 +1252,14 @@ reading (AMR) in places such as Italy or France require repeaters within a few h \subsection{Landline and wireless IP-based systems} -Especially in automated meter reading (AMR) infrastructure the cost-benefit trade-off of powerline systems does not +Especially in automated meter reading (AMR) infrastructure the cost-benefit trade-off of power line systems does not always work out for utilities. A common alternative in these systems is to use the public internet for communication. Using the public internet has the advantage of low initial investment on the part of the utility company as well as quick commissioning. Disadvantages compared to a PLC system are potentially higher operational costs due to recurring fees to network providers as well as lower reliability. Being integrated into power grid infrastructure, a PLC system's failure modes are highly correlated with the overall grid. Put briefly, if the PLC interface is down, there is a good chance that power is out, too. In contrast general internet services exhibit a multitude of failures that are entirely -decorrelated from power grid stability. 
For purposes such as meter reading for billing purposes, this stability is
+uncorrelated to power grid stability. For purposes such as meter reading for billing purposes, this stability is
sufficient. However for systems that need to hold up in crisis situations such as the recovery system we are contemplating in this thesis, the public internet may not provide sufficient reliability.
@@ -1273,7 +1275,7 @@ certification. It can be customized to its specific application. In addition it
sharing infrastructure such as a cellular radio gateway between multiple devices. In other fields a lack of standardization has led to a proliferation of proprietary protocols and a fragmented protocol landscape. This is a large problem since the consumer cannot easily integrate products made by different manufacturers into one system. In advanced
-metering infrastructure this is unlikely to be a disadvantage since ususally there is only one distribution grid
+metering infrastructure this is unlikely to be a disadvantage since usually there is only one distribution grid
operator for an area. Shared resources such as a cellular radio gateway would most likely only be shared within a single building and usually they are all operated by the same provider.
@@ -1304,7 +1306,7 @@ been considered for large-scale application.
Advantages of using grid frequency for communication are low receiver hardware complexity as well as the fact that a single transmitter can cover an entire synchronous area. Though the transmitter has to be very large and powerful the setup of a single large transmitter faces lower bureaucratic hurdles than integration of hundreds of smaller ones into
-hundreds of local systems that each have autonomous goverance.
+hundreds of local systems that each have autonomous governance.
\subsubsection{The frequency dependency of grid frequency}
@@ -1334,7 +1336,7 @@ Europe synchronous area.
The ENTSO-E Operations Handbook Policy 1 chapter\cite{entsoe02} defines the activation threshold of primary control to be \SI{20}{\milli\hertz}. Ideally, a modulation system would stay well below this threshold to avoid fighting the
-primary control reserve. Modulation line rate should likely be on the order of a few hundred millibaud. Modulation at
+primary control reserve. Modulation line rate should likely be on the order of a few hundred Millibaud. Modulation at
these rates would outpace primary control action which is specified by ENTSO-E as acting within between ``a few seconds'' and \SI{15}{\second}.
@@ -1347,33 +1349,33 @@ ENTSO-E at around \SI{20}{\giga\watt\per\hertz}. This works out to an upper bo
In its most basic form a transmitter for grid frequency modulation would be a very large controllable load connected to the power grid at a suitable vantage point. A spool of wire submerged in a body of cooling liquid such as a small lake
-along with a thyristor rectifier bank would likely suffice to perform this function during occassional cybersecurity
+along with a thyristor rectifier bank would likely suffice to perform this function during occasional cybersecurity
incidents. We can however decrease hardware and maintenance investment even further compared to this rather uncultivated solution by repurposing regular large industrial loads as transmitters in an emergency situation. For some preliminary exploration we went through a list of energy-intensive industries in Europe\cite{ec01}. The most
-electricity-intensive industries in this list are primary aluminium and steel production. In primary production raw ore
+electricity-intensive industries in this list are primary aluminum and steel production. In primary production raw ore
is converted into raw metal for further refinement such as casting, rolling or extrusion. In steelmaking iron is
-smolten in an electric arc furnace. In aluminium smelting aluminium is electrolytically extracted from alumina.
Both
+smolten in an electric arc furnace. In aluminum smelting aluminum is electrolytically extracted from alumina. Both
processes involve large amounts of electricity with electricity making up \SI{40}{\percent} of production costs. Given
-these circumstances a steel mill or aluminium smelter would be good candidates as transmitters in a grid frequency
+these circumstances a steel mill or aluminum smelter would be good candidates as transmitters in a grid frequency
modulation system.
-In aluminium smelting high-voltage mains is transformed, rectified and fed into about 100 series-connected electrolytic
+In aluminum smelting high-voltage mains is transformed, rectified and fed into about 100 series-connected electrolytic
cells forming a \emph{potline}. Inside these pots alumina is dissolved in molten cryolite electrolyte at about \SI{1000}{\degreeCelsius} and electrolysis is performed using a current of tens or hundreds of Kiloampère. The resulting
-pure aluminium settles at the bottom of the cell and is tapped off for further processing.
+pure aluminum settles at the bottom of the cell and is tapped off for further processing.
-Like steelworks, aluminium smelters are operated night and day without interruption. Aside from metallurgical issues the
+Like steelworks, aluminum smelters are operated night and day without interruption. Aside from metallurgical issues the
large thermal mass and enormous heating power requirements do not permit power cycling. Due to the high costs of
-production inefficiencies or interruptions the behavior of aluminium smelters under power outages is a
+production inefficiencies or interruptions the behavior of aluminum smelters under power outages is a
well-characterized phenomenon in the industry. The recent move away from nuclear power and towards renewable energy has lead to an increase in fluctuations of electricity price throughout the day.
These electricity price fluctuations have
-provided enough economic incentive to aluminium smelters to develop techniques to modulate smelter power consumption
+provided enough economic incentive to aluminum smelters to develop techniques to modulate smelter power consumption
without affecting cell lifetime or product quality\cite{duessel01,eisma01}. Power outages of tens of minutes up to two
-hours reportedly do not cause problems in aluminium potlines and are in fact part of routine operation for purposes such
+hours reportedly do not cause problems in aluminum potlines and are in fact part of routine operation for purposes such
as electrode changes\cite{eisma01,oye01}.
-The power supply system of an aluminium plant is managed through a highly-integrated control system as keeping all cells
+The power supply system of an aluminum plant is managed through a highly-integrated control system as keeping all cells
of a potline under optimal operating conditions is challenging. Modern power supply systems employ large banks of diodes or SCRs\footnote{SCRs, also called thyristors, are electronic devices that are often used in high-power switching applications. They are normally-off devices that act like diodes when a current is fed into their control terminal.} to
@@ -1382,10 +1384,10 @@ continuously through a combination of a tap changer and a transductor. The indiv
changing the anode to cathode distance (ACD) by physically lowering or raising the anode. The potline power supply is connected to the high voltage input and to the potline through isolators and breakers.
-In an aluminium smelter most of the power is sunk into resistive losses and the electrolysis process.
As such an +aluminum smelter does not have any significant electromechanical inertia compared to the large rotating machines used in other industries. Depending on the capabilities of the rectifier controls high slew rates are possible, permitting -modulation at high\footnote{Aluminium smelter rectifiers are \emph{pulse rectifiers}. This means instead of simply +modulation at high\footnote{Aluminum smelter rectifiers are \emph{pulse rectifiers}. This means instead of simply rectifying the incoming three-phase voltage they use a special configuration of transformer secondaries and in some cases additional coils to produce a large number of equally spaced phases (e.g.\ six) from a standard three-phase input. Where a direct-connected three-phase rectifier would draw current in six pulses per mains voltage cycle a pulse @@ -1407,8 +1409,8 @@ particular frequency. \cite{kundur01} separates these modes into four categorie \begin{description} \item[Local modes] where a single power station oscillates in some parameter, - \item[Interarea modes] where subsections of the overall grid oscillate w.r.t.\ each other due to weak coupling - between them, + \item[Interarea modes] where subsections of the overall grid oscillate with respect to each other due to weak + coupling between them, \item[Control modes] caused by imperfectly tuned control systems and \item[Torsional modes] that originate from electromechanical oscillations in the generator itself. \end{description} @@ -1432,9 +1434,9 @@ controllable load: \item[Modulation amplitude.] Amplitude is proportionally related to modulation power. In a practical setup we might realize a modulation power up to a few hundred \si{\mega\watt} which would yield a few tens of \si{\milli\hertz} of frequency amplitude. - \item[Modulation pre-emphasis and slew-rate control.] Pre-emphasis might be necessary to ensure an adequate + \item[Modulation preemphasis and slew-rate control.] 
Preemphasis might be necessary to ensure an adequate
Signal-to-Noise ratio (SNR) at the receiver. Slew-rate control and other shaping measures might be necessary to
- reduce the impact of these sudden load changes on the transmitter's primary function (say, aluminium smelting)
+ reduce the impact of these sudden load changes on the transmitter's primary function (say, aluminum smelting)
and to prevent disturbances to other grid components.
\item[Modulation frequency.] For a practical implementation a careful study would be necessary to determine the optimal frequency band for operation. On one hand we need to prevent disturbances to the grid such as the
@@ -1486,7 +1488,7 @@ Spread spectrum covers a whole family of techniques that are comprehensively exp
In \cite{goiser01} a BPSK or similar modulation is assumed underlying the spread-spectrum technique. Our grid frequency modulation channel effectively behaves more like a DC-coupled wire than a traditional radio channel: Any change in
-excitation will cause a proportional change in the receiver's measurement. Using our fft-based measurement methodology
+excitation will cause a proportional change in the receiver's measurement. Using our FFT-based measurement methodology
we get a real-valued signed quantity. In this way grid frequency modulation is similar to a channel using coherent modulation. We can utilize both signal strength and polarity in our modulation.
@@ -1504,7 +1506,7 @@ works by directly modulating a long pseudo-random bit sequence onto the channel.
pseudo-random bit sequence and continuously calculates the correlation between the received signal and the pseudo-random template sequence mapped from binary $[0, 1]$ to bipolar $[1, -1]$. The pseudo-random sequence has an approximately equal number of $0$ and $1$ bits.
The positive contribution of the $+1$ terms of the correlation template approximately cancel
-out with the $-1$ terms when multiplied with an uncorrelated signal such as white gaussian noise.
+out with the $-1$ terms when multiplied with an uncorrelated signal such as white Gaussian noise.
By using a family of pseudo-random sequences with low cross-correlation channel capacity can be increased. Either the transmitter can encode data in the choice of sequence or multiple transmitters can use the same channel at once. The
@@ -1534,7 +1536,7 @@ transmission length so for our relatively long transmissions we would realistica
Error correcting codes are a very broad field with many options for specialization. Since we are implementing only an advanced prototype in this thesis we chose to spend only limited resources on optimization and settled on a basic
-reed-solomon code. We have no doubt that applying a more state-of-the-art code we could gain further improvements in
+Reed-Solomon code. We have no doubt that applying a more state-of-the-art code we could gain further improvements in
code overhead and decoding speed among others\cite{mackay01}. Since message length in our system limits system response time but we do not have a fixed target we can tolerate some degree of overhead. Decoding speed is of very low concern to us because our data rate is extremely low. We derived our implementation by adapting and optimizing an existing open
@@ -1595,7 +1597,7 @@ instruction to perform a safety reset. This is the only message we might ever wa
only one element. The information content of our message thus is 0 bit! All the information we want to transmit is already encoded \emph{in the fact that we are transmitting} and we do not require a further payload to be transmitted: We can omit the entirety of the message and just transmit whatever ``signature'' we produce.
This is useful to conserve
-transmission bits so our transmission does not take an exceeedingly long time over our extremely slow communication
+transmission bits so our transmission does not take an exceedingly long time over our extremely slow communication
channel. We can modify this construction to allow for a small number of bits of information content in our message (say two or
@@ -1616,7 +1618,7 @@ A possible scenario would be that an attacker first causes enough havoc for auth
attacker would record the trigger transmission. We can assume most meters were reset during the attack. Due to this the attacker cannot cause a significant number of additional resets immediately afterwards. However, the attacker could wait several years for a number of new meters to be installed that might not yet have updated firmware that includes the
-lastest transmission. This means the attacker could cause them to reset by replaying the original sequence.
+last transmission. This means the attacker could cause them to reset by replaying the original sequence.
% TODO mention why firmware has to be update with last transmission
A possible mitigation for this risk would be to introduce one bit of information into the trigger message that is
@@ -1659,7 +1661,7 @@ entry of $P$ secret.
k_{H(m)_i}$ of $S$ correctly evaluate to $p_{b, i} = H\left(s_{b, i}\right)$ from $P$ under $H$. The above scheme is a one-time signature scheme only. After one signature has been published for a given key, the
-corresponding key must not be reüsed for other signatures. This is intutively clear as we are effectively publishing
+corresponding key must not be reüsed for other signatures. This is intuitively clear as we are effectively publishing
part of the private key as the signature, and if we were to publish a signature for another message an attacker could derive additional signatures by ``mixing'' the two published signatures.
@@ -1717,7 +1719,7 @@ signature two resets can still be triggered directly after one another.
In practice it may be useful to have some control over which meters reset. An attack exploiting a particular network protocol implementation flaw might only affect one series of meters made by one manufacturer. Resetting \emph{all}
-meters may be too much in this case. A simple solution for this is to define adressable subsets of meters. ``All
+meters may be too much in this case. A simple solution for this is to define addressable subsets of meters. ``All
meters'' along with ``meters made by manufacturer $x$'' and ``meters of model $y$'' are good choices for such scopes. On the cryptographic level the protocol state is simply duplicated for each scope. This incurs memory and computation overhead linear in the number of scopes but device memory requirements are small at a few bytes only and computation is
@@ -1754,7 +1756,7 @@ particular application's requirements. Developing an universal solution is outsi
\includegraphics{resources/transmitter_scope_key_illustration}
\caption{
An illustration of a key management system using a common master key. First, the transmitter derives one secret
- key for each adressable group from the master key. Then public device keys are generated like in Figure
+ key for each addressable group from the master key. Then public device keys are generated like in Figure
\ref{fig:sig_key_chain}. Finally for each device the manufacturer picks the group public keys matching the device. In this example one device is a series A meter made by manufacturer B so it gets provisioned with the keys for the ``all devices'', ``manufacturer B'' and ``series A'' groups. The other device is also made by
@@ -1805,7 +1807,7 @@ systems there is a large amount of academic research on such algorithms\cite{nar
popular approach to these systems is to perform a Short-Time Fourier Transform (STFT) on ADC data sampled at high sampling rate (e.g.
\SI{10}{\kilo\hertz}) and then perform analysis on the frequency-domain data to precisely locate the peak at \SI{50}{\hertz}. A key observation here is that FFT bin size is going to be much larger than required frequency
-resolution. This fundamental limitiation follows from the Nyquist criterion %FIXME cite DSP text
+resolution. This fundamental limitation follows from the Nyquist criterion %FIXME cite DSP text
and if we had to process an \emph{arbitrary} signal this would severely limit our practical measurement accuracy
\footnote{
Some software packages providing FFT or STFT primitives such as scipy\cite{virtanen01} allow the user to
@@ -1819,7 +1821,7 @@ and if we had to process an \emph{arbitrary} signal this would severely limit ou
For this reason all approaches to grid frequency estimation are based on a model of the voltage waveform. Nominally this waveform is a perfect sine at $f=\SI{50}{\hertz}$. In practice it is a sine at $f\approx\SI{50}{\hertz}$ superimposed with some aperiodic noise (e.g. irregular spikes from inductive loads being energized) as well as harmonic
-distortion that is caused by topologically nearby devices with power factor $\cos \theta \neq 1.0$. Under a continous
+distortion that is caused by topologically nearby devices with power factor $\cos \theta \neq 1.0$. Under a continuous
fourier transform over a long period the frequency spectrum of a signal distorted like this will be a low noise floor depending mainly on aperiodic noise on which a comb of harmonics as well as some sub-harmonics of $f \approx f_\text{nom} = \SI{50}{\hertz}$ is riding. The main peak at $f \approx f_\text{nom}$ will be very strong with the
@@ -1835,18 +1837,18 @@ use a general approach to estimate the precise fundamental frequency of an arbit
experimental physicists Gasior and Gonzalez at CERN\cite{gasior01}. This approach assumes a general sinusoidal signal superimposed with harmonics and broadband noise.
Applicable to a wide spectrum of practical signal analysis tasks it is a reasonable first-degree approximation of the much more sophisticated estimation algorithms developed specifically for
-power systems. Some algorithms use components such as kalman filters\cite{narduzzi01} that require a phyiscal model.
+power systems. Some algorithms use components such as kalman filters\cite{narduzzi01} that require a physical model.
As a general algorithm \cite{gasior01} does not require this kind of application-specific tuning, eliminating one source of error. The Gasior and Gonzalez algorithm\cite{gasior01} passes the windowed input signal through a DFT, then interpolates the
-signal's fundamental frequency by fitting a wavelet such as a gaussian to the largest peak in the DFT results. The bias
+signal's fundamental frequency by fitting a wavelet such as a Gaussian to the largest peak in the DFT results. The bias
parameter of this curve fit is an accurate estimation of the signal's fundamental frequency. This algorithm is similar to the simpler interpolated DFT algorithm used as a reference in much of the synchrophasor estimation
-literature\cite{borkowski01}. The three-term variant of the maximum sidelobe decay window often used there is a blackman
-window with parameter $\alpha = \frac{1}{4}$. Analysis has shown\cite{belega01} that the interpolated DFT algorithm is
-worse than algorithms involving more complex models under some conditions but that there is \emph{no free lunch} meaning
-that more complex perform worse when the input signal deviates from their models.
+literature\cite{borkowski01}. The three-term variant of the maximum side lobe decay window often used there is a
+Blackman window with parameter $\alpha = \frac{1}{4}$.
Analysis has shown\cite{belega01} that the interpolated DFT
+algorithm is worse than algorithms involving more complex models under some conditions but that there is \emph{no free
+lunch} meaning that more complex perform worse when the input signal deviates from their models.
\subsection{Frequency sensor hardware design}
% FIXME: link to schematics in appendix
@@ -1912,7 +1914,7 @@ with a printed label and a few status lights on its front.
\subsection{Clock accuracy considerations}
Our measurement hardware will sample line voltage at some sampling rate $f_S$, e.g.\ \SI{1}{\kilo\hertz}. All downstream
-processsing is limited in accuracy by the accuracy of $f_S$\footnote{
+processing is limited in accuracy by the accuracy of $f_S$\footnote{
We are not considering the effect of clock jitter. We are highly oversampling the signal and the FFT done in our downstream processing will average out small jitter effects leaving only frequency stability to worry about. }. We generate our sampling clock in hardware by clocking the ADC from one of the microcontroller's timer blocks clocked from
@@ -2000,16 +2002,35 @@ with IO contention on the Raspberry PI/Linux side causing only 16 skipped sample
\subsection{Frequency sensor measurement results}
-Captured raw waveform data has been processed in the Jupyter Lab environment\cite{kluyver01} and grid frequency
-estimates are extracted as described in Section \ref{frequency_estimation} using the Gasior and Gonzalez\cite{gasior01}
-technique. The Jupyter notebook we used for frequency measurement is included with the supplementary materials to this
-thesis. In Figure \ref{freq_meas_feedback} we fed back to the frequency estimator its own output giving us an indication
-of its numerical performance. The result was \SI{1.3}{\milli\hertz} of RMS noise over a \SI{3600}{\second} simulation
-time. This indicates performance is good enough for our purposes.
In addition to this we validated our algorithm's
-performance by applying it to the test waveforms from \cite{wright01}. In this test we got errors of
-\SI{4.4}{\milli\hertz} for the \emph{noise} test waveform, \SI{0.027}{\milli\hertz} for the \emph{interharmonics} test
-waveform and \SI{46}{\milli\hertz} for the \emph{amplitude and phase step} test waveform. Full results can be found in
-Figure \ref{freq_meas_rocof_reference}.
+\begin{figure}
+ \centering
+ \begin{minipage}[c]{0.48\textwidth}
+ \includegraphics{resources/grid_meas_device_front.jpg}
+ \end{minipage}
+ \begin{minipage}[c]{0.48\textwidth}
+ \includegraphics{resources/grid_meas_device_open.jpg}
+ \end{minipage}
+ \vspace*{3mm}
+ \caption{
+ The finished grid frequency sensor device. The large yellow part on the bottom left is the crystal oven. The
+ large black part is the power supply module. The microcontroller is on the bottom right of the device and the
+ measurement circuit is in its middle. The device connects to the data recording computer via galvanically
+ isolated USB on the bottom and to a regular wall socket through the IEC connector on the top of the device.
+ }
+ \label{pic_freq_sensor}
+\end{figure}
+
+Our completed frequency sensor can be seen in Figure \ref{pic_freq_sensor}. The raw voltage waveform data we captured
+with it has been processed in the Jupyter Lab environment\cite{kluyver01} and grid frequency estimates are extracted as
+described in Section \ref{frequency_estimation} using the Gasior and Gonzalez\cite{gasior01} technique. The Jupyter
+notebook we used for frequency measurement is included with the supplementary materials to this thesis. In Figure
+\ref{freq_meas_feedback} we fed back to the frequency estimator its own output giving us an indication of its numerical
+performance. The result was \SI{1.3}{\milli\hertz} of RMS noise over a \SI{3600}{\second} simulation time. This
+indicates performance is good enough for our purposes.
In addition to this we validated our algorithm's performance by
+applying it to the test waveforms from \cite{wright01}. In this test we got errors of \SI{4.4}{\milli\hertz} for the
+\emph{noise} test waveform, \SI{0.027}{\milli\hertz} for the \emph{interharmonics} test waveform and
+\SI{46}{\milli\hertz} for the \emph{amplitude and phase step} test waveform. Full results can be found in Figure
+\ref{freq_meas_rocof_reference}.
Figures \ref{freq_meas_trace} and \ref{freq_meas_trace_mag} show our measurement results over a 24-hour and a 2-hour window respectively.
@@ -2023,7 +2044,7 @@ window respectively.
four graphs show a comparison of the original trace (blue) and the re-calculated trace (orange). The bottom trace shows the difference between the two. As we can tell both traces agree very well with an overall RMS deviation of about \SI{1.3}{\milli\hertz}. The bottom trace shows deviation growing over time. This is an effect
- of numerical errors in our ad-hoc waveform generator.
+ of numerical errors in our ad hoc waveform generator.
}
\label{freq_meas_feedback}
\end{figure}
@@ -2067,7 +2088,7 @@ window respectively.
\centering
\includegraphics{../lab-windows/fig_out/mains_voltage_spectrum}
\caption{Power spectral density of the mains voltage trace in Figure \ref{freq_meas_trace}. Data was captured using
- our frequency measurement sensor (\ref{sec-fsensor}) and FFT-processed after applying a blackman window. The
+ our frequency measurement sensor (\ref{sec-fsensor}) and FFT-processed after applying a Blackman window. The
vertical lines indicate \SI{50}{\hertz} and odd harmonics. We can see the expected peak at \SI{50}{\hertz} along with smaller peaks at odd harmonics. We can also see a number of spurious tones both between harmonics and at low frequencies. We can also see bands containing high noise energy around \SI{0.1}{\hertz}. This graph shows a high
@@ -2080,7 +2101,7 @@ window respectively.
\label{sec-ch-sim}
To validate all layers of our communication stack from modulation scheme to cryptography we built a prototype
-implementation in Python. Implementing all components in a high level language builds up familiartiy with the concepts
+implementation in Python. Implementing all components in a high level language builds up familiarity with the concepts
while taking away much of the implementation complexity. For our demonstrator we will not be able to use Python since our target platform is an inexpensive low-end microcontroller. Our demonstrator firmware will have to be written in a low-level language such as C or Rust. For prototyping these languages lack flexibility compared to Python.
@@ -2088,7 +2109,7 @@ low-level language such as C or Rust. For prototyping these languages lack flexi
To validate our modulation scheme we first performed a series of simulations on our Python demodulator prototype implementation. To simulate a modulated grid frequency signal we added noise to a synthetic modulation signal. For most simulations we used measured frequency data gathered with our frequency sensor. We only have a limited amount of capture
-data. Re-using segements of this data as background noise in multiple simulation runs could lead to our simulation
+data. Re-using segments of this data as background noise in multiple simulation runs could lead to our simulation
results depending on individual features of this particular capture that would be common between all runs. To estimate the impact of this problem we re-ran some of our simulations with artificial random noise synthesized with a power spectral density matching that of our capture. To do this, we first measured our capture's PSD, then fitted a
@@ -2096,7 +2117,7 @@ low-resolution spline to the PSD curve in log-log coördinates. We then generate
spline with the DFT of the synthetic noise and performed an iDFT on the result. The resulting time-domain signal is our synthetic grid frequency data.
Figure \ref{freq_meas_spectrum} shows the PSD of our measured grid frequency signal. The red line indicates the low-resolution log-log spline interpolation used for shaping our artificial noise. Figure
-\ref{simulated_noise_spectrum} shows the PSD of our simulated signal overlayed with the same spline as a red line and
+\ref{simulated_noise_spectrum} shows the PSD of our simulated signal overlaid with the same spline as a red line and
shows time-domain traces of both simulated (blue) and reference signals (orange) at various time scales. Visually both signals look very similar, suggesting that we have found a good synthetic approximation of our measurements.
@@ -2148,11 +2169,11 @@ ratio, margins in various parts of the demodulator decrease which statistically
Our simulations yield smooth, reproducible SER curves with adequately low error bounds. This shows SER is related monotonically to the signal-to-noise margins inside our demodulator prototype.
-\subsection{Sensitivity as a function of sequency length}
+\subsection{Sensitivity as a function of sequence length}
A basic parameter of our DSSS modulation is the length of the Gold codes used. The length of a Gold code is exponential in the code's bit count. Figure \ref{dsss_gold_nbits_overview} shows a plot of the symbol error rate of our demodulator
-prototype depending on amplitude for each of five, six, seven and eigth-bit Gold sequences. In regions where symbol
+prototype depending on amplitude for each of five, six, seven and eight bit Gold sequences. In regions where symbol
error rate is neither clipping at $0$ nor at $1$ we can see the expected dependency that a $n+1$ bit Gold sequence at roughly twice the length yields roughly one half the SER. We can also observe a saturation effect: At low amplitudes, increasing the correlation length does not yield much benefit in SER anymore. In particular at a signal amplitude of
@@ -2218,7 +2239,7 @@ $4.0$ to $5.5$.
Figure \ref{dsss_thf_amplitude_5678} contains plots of demodulator sensitivity like the one in Figure \ref{dsss_gold_nbits_overview}. This time there is one color-coded trace for each threshold factor between $1.5$ and
-$10.0$ in steps of $0.5$. We can see a clear dependency of demodulation performance from trheshold factor with both very
+$10.0$ in steps of $0.5$. We can see a clear dependency of demodulation performance from threshold factor with both very
low and very high values breaking the demodulator. The runaway traces that we can see at low threshold factors are artifacts of an implementation issue with our prototype code. We later fixed this issue in the demonstrator firmware in Section \ref{sec-demo-fw-impl}. For comparison purposes this issue do not matter.
@@ -2258,7 +2279,7 @@ duration is specified in grid frequency sampling periods to ease implementation
Figure \ref{chip_duration_sensitivity} shows the dependence of symbol error rate at a fixed good threshold factor from chip duration. The color bars indicate both chip duration translated to seconds real-time and the resulting symbol
-duration at the given Gold code length. In the lower graphs we show the trace of ampltude at $\text{SER}=0.5$ over chip
+duration at the given Gold code length. In the lower graphs we show the trace of amplitude at $\text{SER}=0.5$ over chip
duration like we did in Figure \ref{dsss_thf_sensitivity_all_bits} for threshold factor. In both graphs we can see a faint optimum for very short chips with a decrease of sensitivity for long chips. This effect is due to longer chips moving the signal band into noisier spectral regions (cf.\ Figure \ref{freq_meas_spectrum}).
@@ -2328,7 +2349,7 @@ the results for both are very close in absolute value.
Chip duration/sensitivity simulation results like in Figure \ref{chip_duration_sensitivity} compared between a simulation using measured frequency data like in the previous graphs and one using artificially generated noise.
There is little visible difference indicating that we have found a good model of reality in our noise - synthesizer, but also that real grid frequency behaves like a frequency-shaped gaussian noise process. + synthesizer, but also that real grid frequency behaves like a frequency-shaped Gaussian noise process. } \label{chip_duration_sensitivity_cmp} \end{figure} @@ -2355,7 +2376,7 @@ written a standards-compliant setup would consist of a comparatively feature-lim gateway (SMGW) containing all of the complex bidirectional protocol logic such as wireless or landline IP connectivity. The realistic target for a setup in this architecture would be the components of an SMGW such as its communication modem or main application processor. In the German architecture the smart meter does not even have to have a bi-directional -data link to the SMGW effectively mitigating any attack vector for remote compormise. +data link to the SMGW effectively mitigating any attack vector for remote compromise. Despite these considerations we still chose to reset the application MCU inside the smart meter for two reasons. One is that SMGWs are much rarer on the second-hand market. The other is that SMGWs are a particular feature of the German @@ -2464,7 +2485,7 @@ upgrade and a remotely accessible disconnect switch. We based our safety reset demonstrator firmware on the grid frequency sensor firmware we developed in Section \ref{sec-fsensor}. We implemented DSSS demodulation by translating the Python prototype code we developed in Section \ref{sec-ch-sim} to embedded C code. After validating the C translation in extensive simulations we integrated our code -with a reed-solomon implementation and a libsodium-based implementation of the cryptographic protocol we designed in +with a Reed-Solomon implementation and a libsodium-based implementation of the cryptographic protocol we designed in Section \ref{sec-crypto}. 
To reprogram the target \texttt{MSP430} microcontroller we ported the low-level bitbang JTAG driver of \texttt{mspdebug}\footnote{\url{https://github.com/dlbeer/mspdebug}}. See Figure \ref{fig_demo_sig_schema} for a schematic overview of signal processing in our demonstrator. @@ -2493,10 +2514,20 @@ the receiver by equalization with a matched filter. \section{Experimental results} +\begin{figure} + \centering + \includegraphics[width=0.6\textwidth]{resources/prototype.jpg} + \caption{The completed prototype setup. The board on the left is the safety reset microcontroller. It is connected + to the smart meter in the middle through an adapter board. The top left contains a USB hub with debug interfaces to + the reset microcontroller. The cables on the bottom left are the debug USB cable and the \SI{3.5}{\milli\meter} + audio cable for the simulated mains voltage input.} + \label{fig_proto_pic} +\end{figure} + After extensive simulations and testing of the individual modules of our solution we proceeded to conduct a real-world -experiment. We tried the demonstrator setup with an emulated noisy DSSS signal in real-time. Our experiment went without -any issues and the firmware implementation correctly reset the demonstrator's meter. We were happy to see that our -extensive testing paid off: The demonstrator setup worked on its first try. +experiment. We tried the demonstrator setup in Figure \ref{fig_proto_pic} using an emulated noisy DSSS signal in +real-time. Our experiment went without any issues and the firmware implementation correctly reset the demonstrator's +meter. We were happy to see that our extensive testing paid off: The demonstrator setup worked on its first try. % FIXME add pictures of the finished demo setup in action % FIXME maybe add an SER curve here? 
@@ -2634,7 +2665,7 @@ the two is simple and one-way, it can be validated to a high standard of securit Despite these security benefits, the cost of such a separate hardware device might prove high in a mass-market rollout. In this case, one might attempt to integrate the reset controller into the core microcontroller in some way. Primarily, there would be two ways to accomplish this. One is a solution that physically integrates an additional microcontroller -core into the main application microcontroller package either as a submodule on the same die or as a separate die in a +core into the main application microcontroller package either as a module on the same die or as a separate die in a multi-chip module (MCM) with the main application microcontroller. A custom solution integrating both on a single die might be a viable path for very large-scale deployments but will most likely be too expensive in tooling costs alone to justify its use. More likely for a medium- to large-scale deployment of millions of meters would be an MCM integrating an @@ -2678,7 +2709,7 @@ foundations of the process based on an established model of inertial grid freque shown the viability of our end-to-end design through extensive simulations. To properly base these simulations we have developed a grid frequency measurement methodology comprising a custom-designed hardware device for electrically safe data capture and a set of software tools to archive and process captured data. Our simulations show good behavior of our -broadcast communication system and give an indication that coöperating with a large consumer such as an aluminium +broadcast communication system and give an indication that coöperating with a large consumer such as an aluminum smelter would be a feasible way to set up a transmitter with very low hardware overhead. 
Based on our broadcast primitive we have developed a cryptographic protocol ready for embedded implementation in resource-constrained systems that allows triggering all or a selected subset of devices within a quick response time of less than 30 minutes.