\documentclass[12pt,a4paper,notitlepage]{report}
\usepackage[utf8]{inputenc}
\usepackage[a4paper,textwidth=17cm, top=2cm, bottom=3.5cm]{geometry}
\usepackage[T1]{fontenc}
\usepackage[
    backend=biber,
    style=numeric,
    natbib=true,
    url=true,
    doi=true,
    eprint=false
    ]{biblatex}
\addbibresource{safety_reset.bib}
\usepackage{amssymb,amsmath}
\usepackage{listings}
\usepackage{eurosym}
\usepackage{wasysym}
\usepackage{amsthm}
\usepackage{tabularx}
\usepackage{multirow}
\usepackage{multicol}
\usepackage{tikz}

\usetikzlibrary{arrows}
\usetikzlibrary{backgrounds}
\usetikzlibrary{calc}
\usetikzlibrary{decorations.markings}
\usetikzlibrary{decorations.pathreplacing}
\usetikzlibrary{fit}
\usetikzlibrary{patterns}
\usetikzlibrary{positioning}
\usetikzlibrary{shapes}

\usepackage{hyperref}
\usepackage{commath}
\usepackage{graphicx,color}
\usepackage{subcaption}
\usepackage{float}
\usepackage{footmisc}
\usepackage{array}
\usepackage[underline=false]{pgf-umlsd}
%\usepackage[pdftex]{graphicx,color}
%\usepackage{epstopdf}

% Needed for murks.tex
\usepackage{setspace}
\usepackage[draft=false,babel,tracking=true,kerning=true,spacing=true]{microtype} % optical margin alignment etc.

% For German quotation marks
\newcommand{\foonote}[1]{\footnote{#1}}
\newcommand{\degree}{\ensuremath{^\circ}}
\newcolumntype{P}[1]{>{\centering\arraybackslash}p{#1}}

\begin{document}

% Example usage of the title page template (please adapt):

\input{murks}
\titel{FIXME} % Title of the thesis
\typ{Masterarbeit} % Type of thesis: Diplomarbeit, Masterarbeit, Bachelorarbeit
\grad{Master of Science (M. Sc.)} % Academic degree awarded
% e.g.: Master of Science (M. Sc.), Master of Education (M. Ed.), Bachelor of Science (B. Sc.), Bachelor of Arts (B. A.), Diplominformatikerin
\autor{Jan Sebastian Götte}
\gebdatum{Not printed for data protection reasons} % Author's date of birth
\gebort{Not printed for data protection reasons} % Author's place of birth
\gutachter{Prof. Dr. Björn Scheuermann}{FIXME} % First and second reviewer of the thesis
\mitverteidigung % remove if there is no defense
\makeTitel
\selbstaendigkeitserklaerung{31.03.2020}
\newpage

% The actual thesis follows here (on a new sheet for double-sided printing):
\tableofcontents
\newpage

\chapter{Introduction}

\section{Structure and operation of the electrical grid}
\subsection{Structure of the electrical grid}
\subsubsection{Generators and loads}
\subsubsection{Transformers}
\subsubsection{Tie lines}

\subsection{Operational concerns}
\subsubsection{Modelling the electrical grid}
\subsubsection{Generator controls}
\subsubsection{Load shedding}
\subsubsection{System stability}
\subsubsection{Power System Stabilizers}
\subsubsection{Smart metering}

\section{Smart meter technology}
\subsubsection{Common components}
Smart meters are usually built around a standard microcontroller.
\label{sm-cpu}
\subsubsection{Cryptographic coprocessors}
\subsubsection{Physical structure}
\subsubsection{Physical installation}

\section{Regulatory frameworks around the world}
\subsection{International standards}
\subsection{Regulations in Europe}
\subsection{The regulatory situation in Germany}
\subsection{The regulatory situation in France}
\subsection{The regulatory situation in the UK}
\subsection{The regulatory situation in Italy}
\subsection{The regulatory situation in northern America}
\subsection{The regulatory situation in Japan}
\subsection{Common themes}

\section{Security in smart grids}

The smart grid in practice is nothing more or less than an aggregation of embedded control and measurement devices that are part of a large control system.
This implies that all the security concerns that apply to embedded systems in general also apply to most components of a smart grid in some way. Where programmers have been struggling for decades now with input validation\cite{leveson01}, the same potential issue raises security concerns in smart grid scenarios as well\cite{mo01, lee01}. In the smart grid, however, two complicating factors are present: Many components are embedded systems, and as such inherently hard to update. Also, the smart grid and its control algorithms act as a large (partially) distributed system, making measures such as input validation or authentication difficult to implement\cite{blaze01} and adding a host of distributed systems problems on top\cite{lamport01}.

Given that the electrical grid is a major piece of essential infrastructure in modern civilization, these problems amount to significant issues in practice. Attacks on the electrical grid may have grave consequences\cite{lee01}, while the long maintenance cycles of various components make the system slow to adapt. Thus, components for the smart grid need to be built to a much higher standard of security than most consumer devices to ensure they stand up to well-funded attackers even decades down the road. This requirement intensifies the challenges of embedded security and distributed systems security, among others, that are inherent in any modern complex technological system.

A point we will not consider in much depth is theft of electricity. A large part of the motivation for the introduction of smart meters seems to be
% TODO weak statement
to reduce the level of fraud by consumers. Academic papers tend to either focus on other benefits such as generation efficiency gains through better forecasting or try to rationalize the fundamentally anti-consumer nature of smart metering with strenuous claims of ``enormous social benefits''\cite{mcdaniel01}.
We will focus entirely on grid stability and disregard electricity theft in the context of this thesis for two reasons: One, billing inaccuracies of electricity companies are of very low urgency compared to grid stability, and a stable grid is a precondition for billing in the first place. Two, utility companies can already put strong bounds on the amount of theft by simply cross-referencing meter readings against trusted readings from upstream sections of the grid. This capability works even without smart meters and only gains speed from them. Conversely, smart meters do nothing to prevent the old exploit of bypassing the meter with a section of wire. Due to these bounds on its volume, electricity theft using smart meter hacking would not scale. Hackers would simply be rooted out one by one with no damage to consumers and very limited damage to utility companies. Damage in these scenarios would be a far cry from the efficiency of an exponentially growing botnet.

\subsection{Smart grid components as embedded devices}

A fundamental challenge in smart grid implementations is the central role smart electricity meters play. Smart meters are used both for highly granular load measurement and (in some countries) load switching\cite{zheng01}. Smart electricity meters are effectively consumer devices. They are built down to a certain price point that is measured by the burden it puts on consumers and that is generally fixed by regulatory authorities.
% FIXME cite
This requirement precludes some hardware features such as the use of a standard hardened software environment on a high-powered embedded system (such as a hypervirtualized embedded Linux setup) that would both increase resilience against attacks and simplify updates. Combined with the small market sizes in smart grid deployments\footnote{
    Most vendors of smart electricity meters only serve a handful of markets. For the most part, smart meter development cost lies in the meter's software. % TODO cite?
    There exist multiple competing standards applicable to various parts of a smart electricity meter. In addition, most countries have their own certification regimen\cite{cenelec01}. This complexity creates a large development burden for new market entrants.
}, this produces a high cost pressure on the software development process for smart electricity meters.

\subsection{The state of the art in embedded security}

Embedded security is generally much harder than the security of higher-level systems. This is due to a combination of the unique constraints of embedded devices (hard to update, usually small quantity) and their lack of capabilities (processing power, memory protection functions, user interface devices). Even very well-funded companies continue to have serious problems securing their embedded systems. A spectacular example of this difficulty is the recently exposed flaw in Apple's iPhone SoC first-stage ROM bootloader\footnote{
    Modern systems-on-chip (SoCs) integrate one or several CPUs with a multitude of peripherals, from memory and DMA controllers through 3D graphics accelerators to general-purpose IO modules for controlling things like indicator LEDs. Most SoCs boot from one of several boot devices such as flash memory, Ethernet or USB according to a configuration set e.g. by connecting some SoC pins a certain way or by device-internal write-once fuse bits.

    Physically, one of the processing cores of the SoC (usually one of the main CPU cores) is connected such that it is taken out of reset before all other devices, and is tasked with switching on and configuring all other devices of the SoC. In order to run later initialization code or more advanced bootloaders, this core on startup runs a very small piece of code hard-burned into the SoC in the factory. This ROM loader initializes the most basic peripherals such as internal SRAM memory and selects a boot device for the next bootloader stage.

    Apple's ROM loader performs some authorization checks to ensure no unauthorized software is loaded. The present flaw allows an attacker to circumvent these checks, booting code not authorized by Apple on a USB-connected iPhone, compromising Apple's chain of trust from ROM loader to userland right at its root.
}, which allows a full compromise of any iPhone up to and including the iPhone X. The iPhone 8, one of the affected models, is still being manufactured and sold by Apple today\footnote{
    i.e. at the time this paragraph was written, on %FIXME
}. In another instance, Samsung shipped a flaw in the secure-world firmware used for protection of sensitive credentials in their mobile phone SoCs. % FIXME add year

If both of these very large companies have trouble securing parts of their secure embedded software stacks measuring a mere few hundred bytes in Apple's case or a few kilobytes in Samsung's, what is a smart electricity meter manufacturer to do? For their mass-market phones, these two companies have R\&D budgets that dwarf some countries' national budgets.
% FIXME hyperbole?
% FIXME cite

Since thorough formal verification of code is not yet within reach for either large-scale software development or code heavy in side effects such as embedded firmware or industrial control software\cite{pariente01}, the two most effective measures for embedded security are reducing the amount of code on the one hand, and labour-intensively checking and double-checking this code on the other. A smart electricity meter manufacturer does not have a say in the former since it is bound by the official regulations it has to comply with, and will almost certainly not have sufficient resources for the latter.
% FIXME expand?
% FIXME cite some figures on code size in smart meter firmware?

\subsection{Attack avenues in the smart grid}

If we model the smart grid as a control system responding to changes in inputs by regulating outputs, on a very high level we can see two general categories of attacks: Attacks that directly change the state of the outputs, and attacks that try to influence the outputs indirectly by changing the system's view of its inputs. An example of the former is an attack that shuts down a power plant to decrease generation capacity. An example of the latter is an attack that forges grid frequency measurements where they enter a power plant's control systems, provoking the control systems into driving an increasing oscillation in the amount of power generated by the plant.
% FIXME cite
% FIXME expand

\subsubsection{Communication channel attacks}

Communication channel attacks are attacks on the communication links between smart grid components. These could be attacks on IP-connected parts of the core network or attacks on shared buses between smart meters and IP gateways in substations. Generally, these attacks can be mitigated by securing the aforementioned communication links using modern cryptography. IP links can be protected using TLS, and lower-level buses can be protected using more lightweight Noise-based protocols.
% FIXME cite
Cryptographic security reduces an attacker's ability to manipulate communication contents to a mere denial-of-service attack. Thus, in addition to cryptographic security, safety under DoS conditions must be guaranteed so that the system continues to perform under attack. This safety property is identical to the safety required to withstand random outages of components, such as communication link outages due to physical damage from storms, flooding etc.
% FIXME cite papers on attack impact, on countermeasures and on attack realization
In general, attacks at the meter level may be hard to weaponize % may be -> weak statement?
since meters are used mostly for billing and forecasting purposes % FIXME cite
and for more critical grid control purposes there exist several additional layers of sensors above smart meters that limit how much an attacker can falsify smart meter readings without the manipulation being obvious. In order for an attack to have more far-reaching consequences, the attacker would need to compromise additional grid infrastructure\cite{kim01,kosut01}.

\subsubsection{Exploiting centralized control systems}

The type of smart grid attack most often cited in popular discourse, and to the author's knowledge % FIXME verify, cite
the only type that has so far been conducted in practice, is a direct attack on centralized control systems. In this attack, computer components of control systems are compromised by the same techniques used to compromise any other kind of computer system, such as exploiting insecure services running on internet-exposed ports and using one compromised system to compromise other systems connected to it through an ostensibly secure internal network. These attacks are very powerful as they yield the attacker direct control over whatever outputs the control systems are controlling. If an attacker manages to compromise a power station's control computers, they may be able to influence generation output or even cause an emergency shutdown.
% FIXME

Despite their potentially large impact, these attacks are only moderately interesting from a scientific perspective. For one, their mitigation mostly consists of a straightforward application of security practices that have been well known for decades. Though there is room for the implementation of genuinely new, application-specific security systems in this field, the general state of the art is lagging behind the rest of the computer industry such that the low-hanging fruit should take priority.
% FIXME cite this bold claim very properly
In addition, given political will these systems can readily be secured since there is only a comparatively small number of them and driving a technician to every one of them in turn to install some security update is perfectly feasible.

\subsubsection{Control function exploits}

Control function exploits are attacks on the mathematical control loops used by the centralized control system. One example of such an attack would be resonance attacks as described in \textcite{wu01}. In this kind of attack, inputs from peripheral sensors indicating grid load to the centralized control system are carefully modified to cause a disproportionately large oscillation in control system action. This type of attack relies on complex resonance effects that arise when mechanical generators are electrically coupled. These resonances, colloquially called ``modes'', are well studied in power system engineering\cite{rogers01,grebe01,entsoe01}.
% FIXME: refer to section on stability control above here
Even disregarding modern attack scenarios, electrical grids are designed for stability with measures in place to dampen any resonances inherent to the grid structure. Still, since their analysis requires an accurate grid model, these resonances are hard to analyze and unlikely to be noticed under normal operating conditions.

Mitigation of these attacks is most easily done by on the one hand ensuring unmodified sensor inputs to the control systems in the first place, and on the other hand carefully designing control systems not to exhibit exploitable behavior such as oscillations.
% FIXME cite mitigation approaches

\subsubsection{Endpoint exploits}

One rather interesting attack on smart grid systems is one exploiting the grid's endpoint devices such as smart electricity meters\footnote{
    Though potentially this could also aim at other kinds of devices distributed on a large scale such as sensors in unmanned substations.
    % FIXME cite verify
}. These meters are deployed on a massive scale, with several thousand meters deployed for every substation.
% FIXME cite (this should be straightforward)
Thus, once meters are compromised, restoring them to an uncompromised state can be very difficult if it requires physical access to thousands of devices hidden away inaccessibly in private homes.

By compromising smart electricity meters, an attacker can trivially forge the distributed energy measurements these devices perform. In a best-case scenario, this might only affect billing and lead to customers being under- or over-charged if the attack is not noticed in time. However, in a less ideal scenario the energy measurements taken by these devices might be used to inform the grid's centralized control systems % FIXME cite
and a falsification of these measurements might lead to inefficiency.

In some countries and for some customers, these smart meters have one additional function that is highly useful to an attacker: They contain high-current load switches to disconnect the entire household or business in case electricity bills are left unpaid for a certain period. In countries that use these kinds of systems, the load disconnect is often simply hooked up to one of the general-purpose IO pins of the smart meter's central microcontroller, allowing anyone compromising this microcontroller's firmware to actuate the load switch at will.
% FIXME validate cite add pictures

Given control over a large number of network-connected smart meters, an attacker might thus be able to cause large-scale disruptions of power consumption by repeatedly disconnecting and re-connecting a large number of consumers.
% FIXME cite some analysis of this
Combined with an attack method such as the resonance attack from \textcite{wu01} that was mentioned above, this scenario poses a serious danger to grid stability.
% FIXME add small-scale load shedding for heaters etc.
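
To give a rough sense of the scale involved, the following back-of-the-envelope estimate sketches the load step an attacker could produce by switching many meters at once. The symbols $N$, $\bar{P}$ and $T$ as well as the example values are illustrative assumptions; they are not taken from measurements or from \textcite{wu01}. Assume that $N$ compromised meters, each switching an average household load $\bar{P}$, are toggled synchronously with period $T$ and a duty cycle of one half. The switched load is then a square wave between $0$ and
\begin{equation*}
    \Delta P = N \bar{P},
\end{equation*}
whose Fourier series has a mean of $\Delta P / 2$ and a fundamental component of amplitude
\begin{equation*}
    \hat{P}_1 = \frac{2}{\pi} \Delta P \approx 0.64 \, \Delta P
\end{equation*}
at frequency $f_1 = 1/T$. For an assumed $N = 10^6$ and $\bar{P} = 500\,\mathrm{W}$ this yields $\Delta P = 500\,\mathrm{MW}$ and $\hat{P}_1 \approx 318\,\mathrm{MW}$. Under these assumptions, an attacker who chooses $T$ such that $f_1$ lies close to a poorly damped grid mode would place the largest spectral component of the switched load at that mode's frequency. How large the resulting oscillation becomes depends on the damping of the particular grid and is not captured by this estimate.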

\subsection{Attacker models in the smart grid}
\subsection{Practical attacks}
\subsection{Practical threats}
\subsection{Conclusion, or why we are doomed}

We can conclude that a compromise of a large number of smart electricity meters cannot be ruled out. The complexity of network-connected smart meter firmware makes it exceedingly unlikely that it is in fact flawless. Large-scale deployments of these devices, under some circumstances such as where they are used with load disconnect relays, make them an attractive target for attackers interested in causing grid instability. The attacker model for these devices very definitely includes hostile nation states, who have considerable resources at their disposal.

For a reasonable guarantee that no large-scale compromises of hard- and software built today will happen over a span of some decades, we would have to radically simplify its design and limit its attack surface. Unfortunately, the complexity of smart electricity meter implementations mostly stems from the large list of requirements these devices have to conform to. Additionally, standards have already been written and changes that reduce scope or functionality have become exceedingly unlikely at this point.

A general observation with smart grid systems of any kind is that they constitute a zealous departure from the decentralized control structure of yesterday's dumb grid and the advent of centralization at an enormous scale. This modern, centralized infrastructure has been carefully designed to defend against malicious actors%FIXME cite
and all involved parties have an interest in keeping it secure. Still, as in any other system, this centralization also makes for a very attractive target since an attacker can likewise employ this centralized control toward their own goals. Fundamentally, decentralized systems tend to make attacks of any kind a lot more costly, and one might question whether security has truly been gained during smart grid rollout.
% FIXME hot take maybe

\chapter{Restoring endpoint safety in an age of smart devices}

If, as laid out in the previous section, we cannot rule out a large-scale compromise of smart energy meters, we have to rephrase our claim to security. If we cannot rule out exploitation, we have to limit its impact. If we assume that we cannot strip any functionality from smart meters since it may be required by standards or for enormous social benefits\cite{mcdaniel01} % FIXME is sarcasm ok here?
all we can do is to flush out an attacker once they are in.

In a worst-case scenario an attacker would gain unconstrained code execution e.g. by exploiting a flaw in a network protocol implementation. Since smart meters use standard microcontrollers that do not have advanced memory protection functions (see page~\pageref{sm-cpu}), at this point we can assume the attacker has full control over the main microcontroller. With this control they can actuate the load switch if present, transmit data through the device's communication interfaces or use the user interface components such as LEDs and the LCD. Using the self-programming capabilities of modern flash microcontrollers, an attacker may even gain persistence without much trouble. Note that in systems separating cryptographic functions into some form of cryptographic module such as systems used in Germany % TODO list other countries as well? FIXME cite BSI standard requiring this
we can be optimistic and assume the attacker has not in fact compromised this cryptographic co-processor yet and does not have access to any cryptographic secrets yet.

Given that the attacker has complete control over the meter's core microcontroller and given that due to cost constraints we are bound to use whatever microcontroller the meter OEM has chosen for their design, we cannot rely on software running on the core microcontroller to restore system integrity. Our solution to this problem is to add another, very small microcontroller to the smart meter design.
This microcontroller will contain a small piece of software to receive cryptographically authenticated commands from utility companies and on demand reset the meter's core microcontroller to a known-good state.

We have to assume the code in the core controller's flash memory has been compromised, so our only option to flush out an attacker is to re-program the core microcontroller in its entirety. We propose using JTAG to re-program the core microcontroller % TODO get terminology consistent. Is "core microcontroller" a good term here?
with a known-good firmware image read from a sufficiently large SPI flash connected to the reset controller. JTAG is supported by most microcontrollers complex enough to end up in a smart meter design % TODO colloquialism
and given adequate documentation JTAG programming functionality can be ported to new microcontrollers with relatively little work.

On the microcontroller side, our solution requires the JTAG interface to be enabled (i.e. not fused shut), and the core microcontroller firmware must not be able to permanently disable the JTAG interface from within. In microcontrollers that do not yet provide this functionality, this is a minor change that could be added to a custom microcontroller variant at low cost. On most microcontrollers keeping JTAG open should not interfere with code readout protection. Code secrecy should be of no concern\cite{schneier01} here but, security aside, manufacturers have strong preferences about this due to fear of copyright infringement.

\section{The theory of endpoint safety}

In order to gain anything by adding our reset controller to the smart meter's already complex design we must satisfy two conditions.
\begin{enumerate}
    \item \textbf{security} means our reset controller itself does not have any exploitable flaws
    \item \textbf{safety} means our reset controller will perform its job as intended
\end{enumerate}
% FIXME expand

\subsection{Attack characteristics}
\subsection{Complex microcontroller firmware}
\subsection{Modern microcontroller hardware}
\subsection{Regulatory and economic constraints}
\subsection{Safety vs. Security: Opting for restoration instead of prevention}
\subsection{Technical outline of a safety reset}

\section{Communication channels on the grid}
\subsection{Powerline communication systems and their use}
\subsection{Proprietary wireless systems}
\subsection{Landline IP}
\subsection{IP-based wireless systems}
\subsection{Frequency modulation as a communication channel}
\subsubsection{The frequency dependence of grid frequency}
\subsubsection{Control systems coupled to grid frequency}
\subsubsection{Avoiding dangerous modes}
\subsubsection{Overall system parameters}
\subsubsection{An outline of practical implementation}

\section{From grid frequency to a reliable communications channel}
\subsection{Channel properties}
\subsection{Modulation and its parameters}
\subsection{Error-correcting codes}
\subsection{Cryptographic security}

\chapter{Practical implementation}
\section{Cryptographic validation}
\section{Data collection for channel validation}
\subsection{Frequency sensor hardware design}
\subsection{Frequency sensor measurement results}
\section{Channel simulation and parameter validation}
\section{Implementation of a demonstrator unit}
\section{Experimental results}
\section{Lessons learned}

\chapter{Future work}
\section{Technical standardization}
\section{Regulatory adoption}
\section{Practical implementation}
\section{Zones of trust}

In our design, we opted for a safety reset controller % FIXME is "safety reset" the proper name here? We need some sort of branding, but is this here really about "safety"?
in the form of a microcontroller entirely separate from whatever application microcontroller the smart meter design is already using. This design nicely separates the meter into an untrusted application (the core microcontroller) and the trusted reset controller. Since the interface between the two is simple and logically one-way, it can be validated to a high standard of security.

Despite these security benefits, the cost of such a separate hardware device might prove high in a mass-market rollout. In this case, one might attempt to integrate the reset controller into the core microcontroller in some way. Primarily, there would be two ways to accomplish this.
% separate die/submodule
% trustzone

\newpage
\appendix
\chapter{Acknowledgements}
\newpage

\chapter{References}
\nocite{*}
\printbibliography
\newpage

\chapter{Demonstrator schematics and code}

\chapter{Economic viability of countermeasures}
\section{Attack cost}
\section{Countermeasure cost}
\section{Conclusion}

\chapter{The ethics and security implications of centralized crackdown on energy theft}

\end{document}