Diffstat (limited to 'content/blog/sybil-resistance-identity/index-old.rst')
-rw-r--r-- content/blog/sybil-resistance-identity/index-old.rst 246
1 files changed, 0 insertions, 246 deletions
diff --git a/content/blog/sybil-resistance-identity/index-old.rst b/content/blog/sybil-resistance-identity/index-old.rst
deleted file mode 100644
index 734cc5a..0000000
--- a/content/blog/sybil-resistance-identity/index-old.rst
+++ /dev/null
@@ -1,246 +0,0 @@
----
-title: "Theia Attack Resistance and Digital Identity"
-date: 2020-09-09T15:00:00+02:00
-draft: true
----
-
-.. raw:: html
-
- <figure class="header">
- <img src="images/succulents.jpg">
- <figcaption>Photo by <a href="https://unsplash.com/@timbennettcreative">Tim Bennett</a> on
- <a href="https://unsplash.com/">Unsplash</a></figcaption>
- </figure>
-
-
-Theia in Cyberspace
-===================
-
-In informatics, the term *distributed system* is used to describe the aggregate behavior of a complex network made up of
-individual computers. For decades, computer scientists have been trying, with some success, to figure out how exactly the
-individual computers that make up such a distributed system need to be programmed for the resulting amalgamation to
-behave in a predictable, maybe even a desirable way. Though seemingly simple on its surface, this problem has a
-surprising depth to it that has yielded research questions for a whole field for several decades now. One particular
-as-of-yet unsolved problem is resistance against *theia attacks* (or "sybil" attacks in older terminology).
-
- Named after the 1973 book by Flora Rheta Schreiber on dissociative identity disorder, a sybil attack is an
- attack where one computer in a distributed system pretends to be multiple computers to gain an advantage. From your
- author's standpoint, naming a type of computer security attack after a medical condition was an unfortunate choice.
- For this reason, this post uses the term *Theia attack* to refer to the same concept. Theia is a Greek goddess of light
- and glitter, and the name alludes to the attacker performing something akin to an optical illusion, causing the attacked
- to perceive multiple distinct images that in the end are all only reflections of the same attacker.
-
-The core insight of computer science research on theia attacks is that there cannot be any technological way of
-preventing such an attack, and any practical countermeasure must be grounded in some authority or ground truth that is
-external to the system, bridging from technology to its social or political context.
-
-Looking around, we can see a parallel between this question ("which computer is a real computer?") and a social issue
-that recently has been growing in importance: Just like computers can pretend to be other computers, they can also
-pretend to be humans. As can humans. Be it within the context of election manipulation or down-to-earth astroturfing_,
-the recurring issue is that in today's online communities, it is hard for an individual to tell which of their online
-acquaintances are who they seem to be. Different platforms attempt different solutions to this problem, and all fail in
-some way or another. Facebook employs good old snitching, turning people against each other and asking them "Do you know
-this person?". Twitter is more laid-back and avoids this Stasi_ methodology in favor of requiring a working mobile phone
-number from its subjects, essentially short-circuiting identity verification to the phone company's check of its
-subscribers' national passports.
-
-.. the preceding is a simplified representation of these platforms' practices. In particular, Facebook uses several
- methods depending on the case. I think this abbreviated discussion should be ok for the sake of the argument. I am
- not 100% certain about the accuracy of the statement though. Does Facebook still do the snitching thing? Is
- Twitter usually content with a phone number?
-
-Trusting Crypto-Anarchist Authorities
-=====================================
-
-Beyond these centralized solutions to the problem, crypto-anarchists and anarcho-capitalists have been brewing some
-interesting novel approaches to online identity based on *blockchain* distributed ledger technology. Distributed
-ledgers are a distributed systems design pattern that yields a system that works like an append-only logbook.
-Participants can create new entries in this logbook, but no one—neither the original author, nor other participants—can
-retroactively change a logbook entry once it has been written. In the blockchain model, past entries are essentially
-written into stone. This near-perfect immutability is what opens them up to a number of use cases, such as cryptographic
-pseudo-currencies [#cryptocurrency]_.
-
-An overview of a variety of these unconventional blockchain identity verification approaches can be found in `this
-unpublished 2020 survey by Siddarth, Ivliev, Siri and Berman <https://arxiv.org/ftp/arxiv/papers/2008/2008.05300.pdf>`_.
-They walk their readers through a number of different projects that try to solve the question "Is this human who they
-pretend to be?" using joint socio-technological approaches. In the following few sections, you will find a short outline
-of a small selection of them. The conclusion of this post will be a commentary on these approaches, and on the underlying
-problem of identity in a digital world.
-
-.. BrightID
-
-In one scheme, identity is determined by "notary" computers that aggregate large amounts of information on a user's
-social contacts. These computers then run an algorithm derived from the SybilGuard_, SybilLimit_ and SybilInfer_ lineage
-of random-walk based algorithms. These algorithms assume that authentic social graphs are small world graphs: Everyone
-knows everyone else through a friend's friend's friend. They also assume that there is an upper bound on how many
-connections with authentic users an attacker can forge: Anyone who is not embedded into the graph well enough is cut
-out. In this way, they put an upper limit on the number of theia identities an attacker can assume given a certain number
-of connections to real people.
-
-Disregarding the catastrophic privacy issues of storing large amounts of data on social relationships on someone else's
-computer, this second assumption is where this model unfortunately breaks down. Applying common sense, it is completely
-realistic for an attacker to forge a large number of social connections: This is precisely what most of social media
-marketing is about! A more malicious angle on this would be to consider how in meatspace [#meatspacefn]_ multi-level
-marketing schemes are successful in coaxing people to abuse their social graphs, with disastrous consequences for the
-well-being of themselves and others. Similar schemes would certainly be possible in cyberspace as well. An additional
-point to consider is that the upper limit SybilGuard_ and others place on the number of fake identities one can have is
-simply not that strict. An attacker could still get away with a reasonable number of false identities before
-getting caught by any such algorithm.
-
-.. Duniter
-
-In another scheme, identity is awarded to anyone who can convince several people already in the network to vouch for
-them, and who is at most a few degrees removed from one of several pre-determined celebrities. Apart from again being
-vulnerable to conmen and other scammers, this system has the glaring flaw of roundly refusing to recognize any person
-who is not willing or able to engage with several of its members. Along with the system's informal requirement for
-members to only vouch for people they have physically met, this leads to a nonstarter in a cyberspace that has grown
-specifically *because* it transcends national borders and physical distance, two of the most serious obstacles to
-in-person communication.
-
-.. Idena Network
-
-The last scheme I will outline in this post is based around a set of `Turing tests`_; that is, quizzes that are designed
-to tell apart man and machine. In this system, all participants have to simultaneously undergo a Turing test once a
-fortnight. The idea is that this limits the number of theia identities an attacker can assume since they can only solve
-so many Turing tests at the same time. The system uses a particular type of picture classification-based Turing test
-and does not seem to be designed with the blind or mentally disabled in mind, with accessibility concerns nowhere to be
-found in the so-called "manifesto" published by its creators. But even ignoring that, the system obviously fails at an
-even more basic level: The idea that everyone takes a Turing test at the same time only works in a world without time
-zones. Or jobs for that matter. Also, it assumes that an attacker cannot simply hire a small army of people someplace
-else to fool the system.
-
-.. _SybilLimit: https://www.comp.nus.edu.sg/~yuhf/yuh-sybillimit.pdf
-.. _SybilGuard: http://www.math.cmu.edu/~adf/research/SybilGuard.pdf
-.. _SybilInfer: https://www.princeton.edu/~pmittal/publications/sybilinfer-ndss09.pdf
-.. _`Turing Tests`: https://en.wikipedia.org/wiki/Turing_test
-
-Identity between Cyberspace and Meatspace
-=========================================
-
-A common thread in these solutions, from the Facebook-esque Stasi_ methods to the crypto-anarchist challenge-response
-utopias, is that they all approach digital identity as a question of Objective Truth™ that can unambiguously be decided at
-a system level, or that can be externalized to the next larger system such as the state. Alas, the important question
-remains unasked:
-
- What *is* identity?
-
-The answer to this question certainly depends on the system being examined. For example, an important reason the
-capitalist corporations mentioned above require knowledge about their users' identity is to generate plausible
-statistics for the advertisers that form their customer base, similar to how a farmer will keep statistics on yield and
-quality for the buyers of his crop. With this background, a full decoupling of platform accounts from a notion of legal
-identity seems at odds with the platform's business model—and we will have to adjust our expectations for reform
-accordingly.
-
-A common thread among all systems mentioned above is that they all have a social component to them. For this common use
-case of social systems, I want to make a suggestion on how we can approach digital identity in a more practical, less
-discriminatory [#discriminatory]_ manner than any of the methods we discussed above. I think both using people's social
-connections and proxying the decisions of external authorities such as the state are bad systems to decide who is a
-person and who is not. I will now illustrate this point a bit. Let us think about how many digital identities a human
-being might have. First, consider the case of n=0, someone who simply wants no business with the system at all. For
-simplicity, let us assume that we have solved this issue of consent, i.e. every person who is identified by the system
-consents to this practice. For n=1, the approaches outlined above all provide some approximate solution. States may not
-grant every human sufficient ID (e.g. children, the mentally disabled or prisoners might be left out), and the social
-systems might fail to catch people who simply do not have any friends, but otherwise their approximations hold. Maybe.
-But what about n=2, n=3, ...? None of these systems adequately considers cases where a human being might legitimately
-wish to hold multiple digital identities, non-maliciously.
-
-Consider a hypothetical lesbian, conservative politician. An active social media presence is a core component of a
-modern politician's career. At the same time, "conservative homophobe" is still well within the realm of tautology and
-it would be legitimate for this politician to wish to not disclose a large fraction of their private life to the world
-at large. They might have a separate online identity for matters related to it. For this politician, the social
-relationship-based systems referenced above would either incorporate outing as a design feature, or they would force
-the politician to choose one of their two identities: to choose between private life and career. When deferring to
-the state as the decider over personhood, at least the platform's operator would know about the outrageously sensitive
-link between the politician's online identities. Clearly, no such solution can be considered socially just.
-
-Let us try not to get caught up in saving the world at this point. The issue of conservative homophobia is out of the
-scope of our consideration, and it is not one that anyone can solve in the near future. Magical realism aside, least of
-all can some technological thing bring about this change. There is a case for legitimate uses of multiple, separate digital
-identities, and we do not have a technical or political answer to it. All hope is not lost yet, though. We can easily
-undo this Gordian knot by acknowledging an unspoken assumption that underlies any social relationships between real
-people, past the Procrustean bed of computer systems or organizational structures these relationships are cast into.
-
- As a function of social interaction, digital identities conform to roles_ in sociological terminology, and are not
- at all the same as personhood_. Roles are subjective and arise from a relationship between people, and a single
- person might legitimately perform different roles depending on context.
-
-When computer scientists or programmers are creating new systems, there always is an (often implicit) modelling stage.
-Formally, during this stage a domain expert and a modeller with a computer science background come together, each
-contributing their knowledge to form a model that is both appropriate for real-world use and practical from an
-engineering point of view. In practice, these two roles are often necessarily fulfilled by the same person, who is often
-also the programmer of the thing. This leads to many computer systems using poor models. Typical examples of this issue
-are systems requiring a person's name that use three input fields labelled "First Name", "Middle Initial" and "Last
-Name". These systems are often created by US-American programmers, who are used to this naming schema from their lived
-experience. Unfortunately, this schema breaks down for those few billion people who use their last name first, who have
-more than one middle name, or who have multiple given names and do not normally use the first one of those.
-
-Once a system creator's implicit assumptions have been encoded into the system like this, it is often very hard to get
-out of that situation. A pattern to use during careful modelling is to keep the model flexible to account for unforeseen
-corner cases. For example, when modelling a system requiring a person's name, one would have to ask what the name is
-used for. It may be the most sensible decision to simply ask the user for their name twice: Once in first name/last name
-format for e.g. tax purposes, and once with a free-form text field for e.g. displaying on their account page.
-
-While for names, many systems already use some form of flexible model by e.g. having a *handle* or *nickname* separate
-from the *display name*, "social" systems are often still stuck with an identity model based around a concept of a
-single, rigid identity. In practice, people perform different roles_ in different circumstances. When asking for a
-person's identity, one would get wildly different answers from different people. A person's identity as perceived by
-others is coupled to their relationship more than to some underlying, biological or administrative truth. Thinking back
-to the straw man politician above, this is evident in subtle ways in almost all our everyday relationships: Some people
-may know me by my legal name, some by my online nickname. To some I may be a computer scientist, to some a flatmate.
-None of my friends and acquaintances have ever wanted to see my passport, or asked to take my DNA to ascertain that I am
-a distinct human being from the other humans they know. Likewise, identifying me by my social connections is impractical
-as it would require an exceedingly weird amount of what can only be described as snooping. Yet, this concept of a
-single, consistent, global, true identity is exactly what up to now all technological solutions to the identity problem
-are trying to achieve.
-
-Building Bridges
-================
-
-I think I can offer you one main takeaway from the discussion above.
-
- When modelling social systems, focus on relationships, not identity.
-
-Rephrased into more actionable points, as someone designing a social digital system, do the following:
-
-0. Early in the design stages, take the time to consider fundamental modelling issues like this one. If you don't, you
- will likely get stuck with a sub-optimal model that will be hard to get rid of.
-1. Where possible, be flexible. Allow people to choose their own identifier. Don't require them to use their real names:
- they may not wish to disclose those, or the names may not be in a format that is useful to you (they may be too long, too
- short, too ubiquitous, in foreign characters etc.). A free-form text field with a reasonable length limit is a good
- approach here.
-2. Do not use credit cards or phone numbers to identify people. There are many people who do not have either, and
- scammers can simply buy this data in bulk on the darknet.
-3. Allow people to create multiple identities [#accountswitchopsec]_, and acknowledge the role of social relationships in
- your interaction features. People have very legitimate reasons to separate areas of their lives, and it is not for
- you or your computer to decide who is who to whom. If your thing requires a global search function, re-consider the
- data protection aspects of your system. If you want to encourage social functions in the face of bots and trolls,
- make it easy for people to share their identities out-of-band, such as through a QR code or a copy-and-pasteable
- short link. If you require someone's legal name or address for billing purposes, unify these identities behind the
- scenes, if at all, and allow them to act as if fully independent in public.
-
-This change of perspective comes with its share of user experience challenges, but also with a promise of a more
-human, more dignified online experience. Perhaps we can find a way to adapt cyberspace to humans, instead of continuing
-to try it the other way around.
-
-.. _astroturfing: https://en.wikipedia.org/wiki/Astroturfing
-.. _Stasi: https://en.wikipedia.org/wiki/Stasi
-
-.. [#cryptocurrency] Pseudo-currencies in that, while they provide some aspects of a regular currency such as ownership
- and transactions, they lack most others. Traditional currencies are backed by states, regulated by central banks
- tasked with maintaining their stability and ultimately provide accountability through law enforcement, courts
- and political elections.
-
-.. [#discriminatory] Discriminatory as in discriminating against minorities, but also as in deciding what is and what is
- not.
-
-.. [#accountswitchopsec] This does mean that you should not actively prevent people from creating multiple accounts. It
- does not necessarily entail building a proper user interface around this practice. If you do the latter, e.g. by
- offering a "switch identity" button or an identity drop-down menu on a post submission form, you can easily
- encourage slip-ups that might disclose the connection between two identities, and you make it possible for
- someone hacking a single login to learn about this connection as well.
-
-.. [#meatspacefn] Meatspace_ is where people physically are, as opposed to cyberspace.
-
-.. _Meatspace: https://dictionary.cambridge.org/dictionary/english/meatspace
-.. _roles: https://en.wikipedia.org/wiki/Role
-.. _personhood: https://en.wikipedia.org/wiki/Personhood