update sybil post

author: jaseg <git@jaseg.net> 2020-09-10 12:32:58 +0200
committer: jaseg <git@jaseg.net> 2020-09-10 12:32:58 +0200
commit: 9fc934f9d2830bbb897477d5211f9cb52fd2dc6b (patch)
tree: 939340c52ef6b77db347b04bc154cffe3b433bd8
parent: e13c0259dd94d22eebb6890953084e21103b5cca (diff)
download: blog-9fc934f9d2830bbb897477d5211f9cb52fd2dc6b.tar.gz
blog-9fc934f9d2830bbb897477d5211f9cb52fd2dc6b.tar.bz2
blog-9fc934f9d2830bbb897477d5211f9cb52fd2dc6b.zip
1 files changed, 70 insertions, 54 deletions
diff --git a/content/posts/sybil-resistance-identity/index.rst b/content/posts/sybil-resistance-identity/index.rst
index 869a782..1a18d78 100644
--- a/content/posts/sybil-resistance-identity/index.rst
+++ b/content/posts/sybil-resistance-identity/index.rst
@@ -1,5 +1,5 @@
 ---
-title: "Sybil Resistance and Digital Identity"
+title: "Theia Attack Resistance and Digital Identity"
 date: 2020-09-09T15:00:00+02:00
 ---
 
@@ -11,7 +11,7 @@ date: 2020-09-09T15:00:00+02:00
     </figure> 
 
 
-Sybil in Cyberspace
+Theia in Cyberspace
 ===================
 
 In informatics, the term *distributed system* is used to describe the aggregate behavior of a complex network made up of
@@ -19,33 +19,44 @@ individual computers. For decades, computer scientists to some success have been
 individual computers that make up such a distributed system need to be programmed for the resulting amalgamation to
 behave in a predictable, maybe even a desirable way. Though seemingly simple on its surface, this problem has a
 surprising depth to it that has yielded research questions for a whole field for several decades now. One particular
-as-of-yet unsolved problem is resistance against so-called *sybil attacks*. Named after the 1973 book by Flora Rheta
-Schreiber on dissociative identity disorder, in distributed systems a sybil attack is an attack where one computer
-acts to the rest of the network as if it were multiple, independent systems. The core insight is that there cannot be
-any technological way of preventing such an attack, and any practical countermeasure must be grounded in some authority
-or ground truth that is external to the systems—bridging from technology to its social or political context.
+as-of-yet unsolved problem is resistance against *theia attacks* (or "sybil" attacks in older terminology)*.
+
+   Named after the 1973 book by Flora Rheta Schreiber on dissociative identity disorder, a sybil attack is an
+   attack where one computer in a distributed system pretends to be multiple computers to gain an advantage. From my
+   standpoint, naming a type of computer security attack after a medical condition was an unfortunate choice. For this
+   reason this post uses the term *Theia attack* to refer to the same concept. This is named after a Greek goddess of
+   light and glitter and alludes to the attacker performing something like an optical illusion, causing the attacked to
+   perceive multiple distinct images that in the end are all only reflections of the same attacker.
+
+The core insight of computer science research on theia attacks is that there cannot be any technological way of
+preventing such an attack, and any practical countermeasure must be grounded in some authority or ground truth that is
+external to the systems—bridging from technology to its social or political context.
 
 Looking around, we can see a parallel between this question ("which computer is a real computer?") and a social issue
 that recently has been growing in importance: Just like computers can pretend to be other computers, they can also
 pretend to be humans. As can humans. Be it within the context of election manipulation or down-to-earth astroturfing_
-the recurring issue is that in todays online communities, it is hard for an individual to tell who of their online
+the recurring issue is that in today's online communities, it is hard for an individual to tell who of their online
 acquaintances are who they seem to be. Different platforms attempt different solutions to this problem, and all fail in
 some way or another. Facebook employs good old snitching, turning people against each other and asking them "Do you know
-this person?". Twitter is more laid-back and instead of such Stasi_ methodology simply opts to require a working mobile
-phone number from its subjects, essentially short-circuiting identity verification to the phone company's check of their
+this person?". Twitter is more laid-back and avoids this Stasi_ methodology in favor of requiring a working mobile phone
+number from its subjects, essentially short-circuiting identity verification to the phone company's check of their
 subscriber's national passport.
 
+.. the preceding is a simplified representation of these platforms' practices. In particular, Facebook uses several
+   methods depending on the case. I think this abbreviated discussion should be ok for the sake of the argument. I am
+   not 100% certain of the accuracy of the statement though. Does fb still do the snitching thing? Is
+   twitter usually content with a phone number?
+
 Trusting Crypto-Anarchist Authorities
 =====================================
 
-Beyond these centralistic solutions to the problem, crypto-anarchists and anarcho-capitalists have been brewing some
-interesting novel approaches to this issue based on *blockchain* distributed ledger technology. Distributed ledgers,
-often colloquially called "blockchains", are a distributed systems design pattern that yields a system that works like
-an append-only logbook. Participants with the right permissions can create new entries in this logbook, but
-noone—neither the original author, nor other participants—can retroactively change a logbook entry once it has been
-committed to the log. In the blockchain model, past entries are essentially written into stone. This near-perfect
-immutability is the property that opens them for a number of use cases from cryptographic pseudo-currencies
-[#cryptocurrency]_.
+Beyond these centralistic solutions to the problem, crypto-anarchists and anarcho-capitalists have been brewing on some
+interesting novel approaches to online identity based on *blockchain* distributed ledger technology. Distributed
+ledgers are a distributed systems design pattern that yields a system that works like an append-only logbook.
+Participants can create new entries in this logbook, but no one—neither the original author, nor other participants—can
+retroactively change a logbook entry once it has been written. In the blockchain model, past entries are essentially
+written into stone. This near-perfect immutability is what opens them up to a number of use cases such as cryptographic
+pseudo-currencies [#cryptocurrency]_.
 
 An overview of a variety of these unconventional blockchain identity verification approaches can be found in `this
 unpublished 2020 survey by Siddarth, Ivliev, Siri and Berman <https://arxiv.org/ftp/arxiv/papers/2008/2008.05300.pdf>`_.
@@ -61,16 +72,18 @@ social contacts. These computers then run an algorithm derived from the SybilGua
 of random-walk based algorithms. These algorithms assume that authentic social graphs are small world graphs: Everyone
 knows everyone else through a friend's friend's friend. They also assume that there is an upper bound on how many
 connections with authentic users an attacker can forge: Anyone who is not embedded into the graph well enough is cut
-out. Disregarding the catastrophic privacy issues of storing large amounts of data on social relationships on someone
-else's computer, this second assumption is where this model unfortunately breaks down. Applying common sense, it is
-completely realistic for an attacker to forge a large number of social connections: This is precisely what most of
-social media marketing is about! A more malicious angle on this would be to consider how in meatspace [#meatspacefn]_
-multi-level marketing schemes are successful in coaxing people to abuse their social graphs to disastrous consequences
-to the well-being of themselves and others. Similar schemes would certainly be possible in cyberspace as well.
-
-An additional point to consider is that the upper limit SybilGuard_ and others place on the number of fake identities
-one can have is simply not that strict at all. An attacker could still get away with a reasonable number of false
-identities before getting caught by any such algorithm.
+out. This way, they put an upper limit on the number of theia identities an attacker can assume given a certain number
+of connections to real people.
+
+Disregarding the catastrophic privacy issues of storing large amounts of data on social relationships on someone else's
+computer, this second assumption is where this model unfortunately breaks down. Applying common sense, it is completely
+realistic for an attacker to forge a large number of social connections: This is precisely what most of social media
+marketing is about! A more malicious angle on this would be to consider how in meatspace [#meatspacefn]_ multi-level
+marketing schemes are successful in coaxing people to abuse their social graphs to disastrous consequences to the
+well-being of themselves and others. Similar schemes would certainly be possible in cyberspace as well.  An additional
+point to consider is that the upper limit SybilGuard_ and others place on the number of fake identities one can have is
+simply not that strict at all. An attacker could still get away with a reasonable number of false identities before
+getting caught by any such algorithm.
 
 .. Duniter
 
@@ -79,17 +92,20 @@ them, and who is at most a few degrees removed from one of several pre-determine
 vulnerable to conmen and other scammers, this system has the glaring flaw of roundly refusing to recognize any person
 who is not willing or able to engage with multiple of its members. Along with the system's informal requirement for
 members to only vouch for people they have physically met this leads to a nonstarter in a cyberspace that has grown
-specifically *because* it transcends national borders and physical distance.
+specifically *because* it transcends national borders and physical distance, two of the most serious obstacles to in-person
+communication.
 
 .. Idena Network
 
-The last scheme I will outline in this post is based around a set of `Turing tests`_, that is, quizzes that are designed
+The last scheme I will outline in this post is based around a set of `Turing tests`_; that is, quizzes that are designed
 to tell apart man and machine. In this system, all participants have to simultaneously undergo a Turing test once in a
-fortnight. The system uses a particular type of picture classification-based Turing test and does not seem to be
-designed with the blind or mentally disabled in mind with accessibility concerns nowhere to be found in the so-called
-"manifesto" published by its creators. But even ignoring that, the system obviously fails at an even more basic level:
-The idea that everyone takes a Turing test at the same time only works in a world without time zones. Or jobs for that
-matter. Also, it assumes that an attacker cannot simply hire a small army of people someplace else to fool the system.
+fortnight. The idea is that this limits the number of theia identities an attacker can assume since they can only solve
+that many Turing tests at the same time. The system uses a particular type of picture classification-based Turing test
+and does not seem to be designed with the blind or mentally disabled in mind, with accessibility concerns nowhere to be
+found in the so-called "manifesto" published by its creators. But even ignoring that, the system obviously fails at an
+even more basic level: The idea that everyone takes a Turing test at the same time only works in a world without time
+zones. Or jobs for that matter. Also, it assumes that an attacker cannot simply hire a small army of people someplace
+else to fool the system.
 
 .. _SybilLimit: https://www.comp.nus.edu.sg/~yuhf/yuh-sybillimit.pdf
 .. _SybilGuard: http://www.math.cmu.edu/~adf/research/SybilGuard.pdf
@@ -108,29 +124,29 @@ the important question remains unasked:
 
 Departing from all the systems outlined above, I want to make a suggestion on how we can approach this topic in a more
 practical, less discriminatory [#discriminatory]_ manner. I think both using people's social connections and proxying
-the decisions of external authorities such as the state are bad systems to decide who is a person and who is not. Let us
-now illustrate this point a bit. Let us think about how many digital identities a human beign might have. Let us first
+the decisions of external authorities such as the state are bad systems to decide who is a person and who is not. I will
+now illustrate this point a bit. Let us think about how many digital identities a human being might have. First,
 consider the case of n=0, someone who simply wants no business with the system at all. For simplicity, let us assume
 that we have solved this issue of consent, i.e. every person who is identified by the system consents to this practice.
 For n=1, the approaches outlined above all provide some approximate solution. States may not grant every human
-sufficient ID (e.g. children, mentally disabled or prisoners might be left out), and the social systems might fail to
-catch people who simply do not have any friends, but otherwise their approximations hold. Maybe. But what about n=2,
+sufficient ID (e.g. children, the mentally disabled or prisoners might be left out), and the social systems might fail
+to catch people who simply do not have any friends, but otherwise their approximations hold. Maybe. But what about n=2,
 n=3, ...?  None of these systems adequately consider cases where a human being might legitimately wish to hold multiple
 identities, non-maliciously.
 
-Consider the case of a lesbian, conservative politician. An active social media presence is a core component of a modern
-politician's carreer. At the same time, "conservative homophobe" is still well within the realm of tautology and it
-would be legitimate for this politician to wish to not disclose this aspect of their private life to the world at large,
-and have a separate online identity for matters related to it.  For this politician, the social relationship-based
-systems referenced above would either have outing them as a design feature, or they would force them to choose either of
-these identities: Requiring them to choose between private life and carreer. When deferring to the state as the decider
-over personhood, at least the platform's operator would know about the outrageously sensitive link between the
-politician's online identities. Clearly, none of these systems are socially just.
+Consider a hypothetical lesbian, conservative politician. An active social media presence is a core component of a
+modern politician's career. At the same time, "conservative homophobe" is still well within the realm of tautology and
+it would be legitimate for this politician to wish to not disclose a large fraction of their private life to the world
+at large. They might have a separate online identity for matters related to it.  For this politician, the social
+relationship-based systems referenced above would either incorporate outing as a design feature, or they would force
+the politician to choose either of their two identities: To choose between private life and career. When deferring to
+the state as the decider over personhood, at least the platform's operator would know about the outrageously sensitive
+link between the politician's online identities. Clearly, no such solution can be considered socially just.
 
 Let us try not to be caught up on saving the world at this point. The issue of conservative homophobia is out of the
-scope of our consideration, and it is not one that anyone can solve in the near future. Least of all can true change be
-forced through contracts, legislation or other rules. There is a case for legitimate uses of multiple, separate digital
-identities, and we do not have a technical or political answer to it. All hope is not lost yet, though. We can easily
+scope of our consideration, and it is not one that anyone can solve in the near future. Magical realism aside, least of
+all can some technological fix bring about this change. There is a case for legitimate uses of multiple, separate digital
+identities, and we do not have a technical or political answer to it. All hope is not lost yet, though.  We can easily
 undo this Gordian knot by acknowledging an unspoken assumption that underlies any social relationships between real
 people, past the procrustean bed of computer systems or organizational structures these relationships are cast into.
 
@@ -140,10 +156,10 @@ people, past the procrustean bed of computer systems or organizational structure
 Thinking beyond the straw man politician above, this is evident in more subtle ways in almost all our everyday
 relationships: Some people may know me by my legal name, some by my online nickname. To some I may be a computer
 scientist, to some a flatmate. None of my friends and acquaintances have ever wanted to see my passport, or asked to
-take my DNA to ascertain that I am in fact a differnet human than the others they know. It would simply be exceedingly
-weird for someone I know to snoop around the other people I know, trying to build a map of where these people know me
-from and whether they think the same about me. Yet, this concept of a consistent, global identity is exactly what up to
-now all technological solutions to the identity problem are about.
+take my DNA to ascertain that I am a distinct human being from the other humans they know. Also, it would simply be
+exceedingly weird for someone I know to snoop around the other people I know, trying to build a map of where these
+people know me from and whether they think the same about me. Yet, this concept of a single, consistent, global, true
+identity is exactly what up to now all technological solutions to the identity problem are trying to achieve.
 
 Building Bridges
 ================