author: jaseg <git-bigdata-wsl-arch@jaseg.de> 2020-09-09 22:49:38 +0200
committer: jaseg <git-bigdata-wsl-arch@jaseg.de> 2020-09-09 22:49:38 +0200
commit: e13c0259dd94d22eebb6890953084e21103b5cca
tree: c500c29199bcdbaabfc064835d5227d744ff0ba4 /content/posts/sybil-resistance-identity
parent: 7d906100d3efab9061e287f1abe852a9ad0bd53c
Add sybil identity draft
Diffstat (limited to 'content/posts/sybil-resistance-identity')
-rwxr-xr-x content/posts/sybil-resistance-identity/images/succulents.jpg bin 0 -> 587685 bytes
-rw-r--r-- content/posts/sybil-resistance-identity/index.rst 192
2 files changed, 192 insertions, 0 deletions
diff --git a/content/posts/sybil-resistance-identity/images/succulents.jpg b/content/posts/sybil-resistance-identity/images/succulents.jpg
new file mode 100755
index 0000000..938bffd
--- /dev/null
+++ b/content/posts/sybil-resistance-identity/images/succulents.jpg
Binary files differ
diff --git a/content/posts/sybil-resistance-identity/index.rst b/content/posts/sybil-resistance-identity/index.rst
new file mode 100644
index 0000000..869a782
--- /dev/null
+++ b/content/posts/sybil-resistance-identity/index.rst
@@ -0,0 +1,192 @@
+---
+title: "Sybil Resistance and Digital Identity"
+date: 2020-09-09T15:00:00+02:00
+---
+
+.. raw:: html
+
+ <figure class="header">
+ <img src="images/succulents.jpg">
+ <figcaption>Photo by <a href="https://unsplash.com/@timbennettcreative">Tim Bennett</a> on <a href="https://unsplash.com/">Unsplash</a></figcaption>
+ </figure>
+
+
+Sybil in Cyberspace
+===================
+
+In informatics, the term *distributed system* is used to describe the aggregate behavior of a complex network made up of
+individual computers. For decades, computer scientists have been trying, with some success, to figure out how exactly the
+individual computers that make up such a distributed system need to be programmed for the resulting amalgamation to
+behave in a predictable, maybe even a desirable way. Though seemingly simple on its surface, this problem has a
+surprising depth to it that has yielded research questions for a whole field for several decades now. One particular
+as-of-yet unsolved problem is resistance against so-called *sybil attacks*. Named after the 1973 book by Flora Rheta
+Schreiber on dissociative identity disorder, in distributed systems a sybil attack is an attack where one computer
+acts to the rest of the network as if it were multiple, independent systems. The core insight is that there cannot be
+any technological way of preventing such an attack, and any practical countermeasure must be grounded in some authority
+or ground truth that is external to the systems—bridging from technology to its social or political context.
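
A minimal sketch of why this matters, assuming a hypothetical network that decides by giving each *identity* one vote. The names `honest_votes`, `with_sybils` and `tally` are made up for illustration and are not taken from any real system:

```python
from collections import Counter


def honest_votes(count=5):
    """Each honest machine holds exactly one identity and votes "A"."""
    return {f"honest-{i}": "A" for i in range(count)}


def with_sybils(votes, fake_count):
    """One attacker machine registers `fake_count` identities voting "B"."""
    merged = dict(votes)
    merged.update({f"sybil-{i}": "B" for i in range(fake_count)})
    return merged


def tally(votes):
    """Count one vote per identity and return the most common choice."""
    return Counter(votes.values()).most_common(1)[0][0]


if __name__ == "__main__":
    votes = honest_votes()
    print(tally(votes))                  # five honest identities agree
    print(tally(with_sybils(votes, 6)))  # one machine, six identities
```

Nothing in the tally can tell the six fake identities apart from six real machines; that distinction has to come from outside the system.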
+
+Looking around, we can see a parallel between this question ("which computer is a real computer?") and a social issue
+that recently has been growing in importance: Just like computers can pretend to be other computers, they can also
+pretend to be humans. As can humans. Be it within the context of election manipulation or down-to-earth astroturfing_,
+the recurring issue is that in today's online communities, it is hard for an individual to tell which of their online
+acquaintances are who they seem to be. Different platforms attempt different solutions to this problem, and all fail in
+some way or another. Facebook employs good old snitching, turning people against each other and asking them "Do you know
+this person?". Twitter is more laid-back and instead of such Stasi_ methodology simply opts to require a working mobile
+phone number from its subjects, essentially short-circuiting identity verification to the phone company's check of their
+subscriber's national passport.
+
+Trusting Crypto-Anarchist Authorities
+=====================================
+
+Beyond these centralistic solutions to the problem, crypto-anarchists and anarcho-capitalists have been brewing some
+interesting novel approaches to this issue based on *blockchain* distributed ledger technology. Distributed ledgers,
+often colloquially called "blockchains", are a distributed systems design pattern that yields a system that works like
+an append-only logbook. Participants with the right permissions can create new entries in this logbook, but
+no one, neither the original author nor other participants, can retroactively change a logbook entry once it has been
+committed to the log. In the blockchain model, past entries are essentially set in stone. This near-perfect
+immutability is the property that opens them up to a number of use cases such as cryptographic pseudo-currencies
+[#cryptocurrency]_.
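
The append-only logbook idea can be sketched as a hash chain, in which every entry commits to the hash of its predecessor. This is only an illustration under simplifying assumptions: the names `append`, `verify` and `GENESIS` are hypothetical, and the sketch covers tamper-evidence on a single machine, not the consensus protocol that makes a real ledger distributed.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append(log, payload):
    """Add a new entry that commits to the current end of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    })


def verify(log):
    """Recompute every hash; any edit to a past entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing the payload of an old entry makes `verify` fail unless every later hash is recomputed as well, which is what the other participants of a real ledger would refuse to accept.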
+
+An overview of a variety of these unconventional blockchain identity verification approaches can be found in `this
+unpublished 2020 survey by Siddarth, Ivliev, Siri and Berman <https://arxiv.org/ftp/arxiv/papers/2008/2008.05300.pdf>`_.
+They walk their readers through a number of different projects that try to solve the question "Is this human who they
+pretend to be?" using joint socio-technological approaches. In the following few sections, you will find a short outline
+of a small selection of them. The conclusion of this post will be a commentary on these approaches, and on the underlying
+problem of identity in a digital world.
+
+.. BrightID
+
+In one scheme, identity is determined by "notary" computers that aggregate large amounts of information on a user's
+social contacts. These computers then run an algorithm derived from the SybilGuard_, SybilLimit_ and SybilInfer_ lineage
+of random-walk based algorithms. These algorithms assume that authentic social graphs are small world graphs: Everyone
+knows everyone else through a friend's friend's friend. They also assume that there is an upper bound on how many
+connections with authentic users an attacker can forge: Anyone who is not embedded into the graph well enough is cut
+out. Disregarding the catastrophic privacy issues of storing large amounts of data on social relationships on someone
+else's computer, this second assumption is where this model unfortunately breaks down. Applying common sense, it is
+completely realistic for an attacker to forge a large number of social connections: This is precisely what most of
+social media marketing is about! A more malicious angle on this would be to consider how in meatspace [#meatspacefn]_
+multi-level marketing schemes are successful in coaxing people to abuse their social graphs to disastrous consequences
+to the well-being of themselves and others. Similar schemes would certainly be possible in cyberspace as well.
+
+An additional point to consider is that the upper limit SybilGuard_ and others place on the number of fake identities
+one can have is simply not that strict at all. An attacker could still get away with a reasonable number of false
+identities before getting caught by any such algorithm.
+
+.. Duniter
+
+In another scheme, identity is awarded to anyone who can convince several people already in the network to vouch for
+them, and who is at most a few degrees removed from one of several pre-determined celebrities. Apart from again being
+vulnerable to conmen and other scammers, this system has the glaring flaw of roundly refusing to recognize any person
+who is not willing or able to engage with several of its members. Along with the system's informal requirement for
+members to only vouch for people they have physically met, this makes the scheme a nonstarter in a cyberspace that has grown
+specifically *because* it transcends national borders and physical distance.
+
+.. Idena Network
+
+The last scheme I will outline in this post is based around a set of `Turing tests`_, that is, quizzes that are designed
+to tell apart man and machine. In this system, all participants have to simultaneously undergo a Turing test once in a
+fortnight. The system uses a particular type of picture classification-based Turing test and does not seem to be
+designed with blind or mentally disabled people in mind: accessibility concerns are nowhere to be found in the so-called
+"manifesto" published by its creators. But even ignoring that, the system obviously fails at an even more basic level:
+The idea that everyone takes a Turing test at the same time only works in a world without time zones. Or jobs for that
+matter. Also, it assumes that an attacker cannot simply hire a small army of people someplace else to fool the system.
+
+.. _SybilLimit: https://www.comp.nus.edu.sg/~yuhf/yuh-sybillimit.pdf
+.. _SybilGuard: http://www.math.cmu.edu/~adf/research/SybilGuard.pdf
+.. _SybilInfer: https://www.princeton.edu/~pmittal/publications/sybilinfer-ndss09.pdf
+.. _`Turing Tests`: https://en.wikipedia.org/wiki/Turing_test
+
+Identity between Cyberspace and Meatspace
+=========================================
+
+A common thread in all of these solutions, be it the Facebook-esque Stasi_ methods or the crypto-anarchist
+challenge-response utopias, is that they all approach digital identity as a question of Objective Truth™ that can
+unanimously be decided at a system level—or that can be externalized to the next larger system such as the state. Alas,
+the important question remains unasked:
+
+ What *is* identity?
+
+Departing from all the systems outlined above, I want to make a suggestion on how we can approach this topic in a more
+practical, less discriminatory [#discriminatory]_ manner. I think both using people's social connections and proxying
+the decisions of external authorities such as the state are bad systems to decide who is a person and who is not. Let us
+now illustrate this point a bit. Let us think about how many digital identities a human being might have. Let us first
+consider the case of n=0, someone who simply wants no business with the system at all. For simplicity, let us assume
+that we have solved this issue of consent, i.e. every person who is identified by the system consents to this practice.
+For n=1, the approaches outlined above all provide some approximate solution. States may not grant every human
+sufficient ID (e.g. children, mentally disabled or prisoners might be left out), and the social systems might fail to
+catch people who simply do not have any friends, but otherwise their approximations hold. Maybe. But what about n=2,
+n=3, ...? None of these systems adequately considers cases where a human being might legitimately wish to hold multiple
+identities, non-maliciously.
+
+Consider the case of a lesbian, conservative politician. An active social media presence is a core component of a modern
+politician's career. At the same time, "conservative homophobe" is still well within the realm of tautology and it
+would be legitimate for this politician to wish to not disclose this aspect of their private life to the world at large,
+and have a separate online identity for matters related to it. For this politician, the social relationship-based
+systems referenced above would either have outing them as a design feature, or they would force them to pick only one of
+these identities, requiring them to choose between private life and career. When deferring to the state as the decider
+over personhood, at least the platform's operator would know about the outrageously sensitive link between the
+politician's online identities. Clearly, none of these systems are socially just.
+
+Let us try not to be caught up on saving the world at this point. The issue of conservative homophobia is out of the
+scope of our consideration, and it is not one that anyone can solve in the near future. Least of all can true change be
+forced through contracts, legislation or other rules. There is a case for legitimate uses of multiple, separate digital
+identities, and we do not have a technical or political answer to it. All hope is not lost yet, though. We can easily
+undo this Gordian knot by acknowledging an unspoken assumption that underlies any social relationships between real
+people, past the Procrustean bed of computer systems or organizational structures these relationships are cast into.
+
+ Identity is subjective. Identity arises from a relationship between people, and the same person might legitimately
+ have multiple identities to different people.
+
+Thinking beyond the straw man politician above, this is evident in more subtle ways in almost all our everyday
+relationships: Some people may know me by my legal name, some by my online nickname. To some I may be a computer
+scientist, to some a flatmate. None of my friends and acquaintances have ever wanted to see my passport, or asked to
+take my DNA to ascertain that I am in fact a different human than the others they know. It would simply be exceedingly
+weird for someone I know to snoop around the other people I know, trying to build a map of where these people know me
+from and whether they think the same about me. Yet, this concept of a consistent, global identity is exactly what up to
+now all technological solutions to the identity problem are about.
+
+Building Bridges
+================
+
+I think I can offer you one main takeaway from the discussion above.
+
+ Focus on relationships, not identity.
+
+Rephrased into more actionable points, as someone designing a digital system, do the following:
+
+1. Allow people to choose their own identifier. Don't require them to use their real names: they may not wish to
+ disclose those, or the names may not be in a format that is useful to you (they may be too long, too short, too
+ ubiquitous, in foreign characters etc.). A free-form text field with a reasonable length limit is a good
+ approach here.
+2. Do not use credit cards or phone numbers to identify people. There are many people who do not have either, and
+ scammers can simply buy this data in bulk on the darknet.
+3. Allow people to create multiple accounts [#accountswitchopsec]_, and acknowledge the role of social relationships in
+ your interaction features. People have very legitimate reasons to separate areas of their lives, and it is not for
+ you or your computer to decide who is who to whom. If your thing requires a global search function, re-consider the
+ data protection aspects of your system. If you want to encourage social functions in the face of bots and trolls,
+ make it easy for people to share their identities out-of-band, such as through a QR code or a copy-and-pasteable
+ short link.
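
Points 1 and 3 could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the 64-character limit, the `example.invalid` base URL and the function names are not taken from any existing system.

```python
import secrets
import unicodedata

MAX_IDENTIFIER_LENGTH = 64  # assumed limit; pick whatever fits your system


def normalize_identifier(raw):
    """Accept any free-form, self-chosen name in any script.

    Only an empty name, an over-long name or control characters are rejected.
    """
    text = unicodedata.normalize("NFC", raw).strip()
    if not text:
        raise ValueError("identifier must not be empty")
    if len(text) > MAX_IDENTIFIER_LENGTH:
        raise ValueError("identifier is too long")
    if any(unicodedata.category(ch) == "Cc" for ch in text):
        raise ValueError("identifier must not contain control characters")
    return text


def new_share_link(base_url="https://example.invalid/u/"):
    """Create an unguessable short link a person can hand out out-of-band.

    The link is random, so it reveals nothing about the account's other
    identities and cannot be found through a global search.
    """
    return base_url + secrets.token_urlsafe(12)
```

The link is deliberately not derived from the chosen identifier, so two accounts held by the same person share nothing that a third party could correlate.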
+
+This change of perspective comes with its share of user experience challenges, but also with a promise for a more
+human, more dignified online experience. Perhaps we can find a way to adapt cyberspace to humans, instead of continuing
+to try it the other way around.
+
+.. _astroturfing: https://en.wikipedia.org/wiki/Astroturfing
+.. _Stasi: https://en.wikipedia.org/wiki/Stasi
+
+.. [#cryptocurrency] Pseudo-currencies in that while they provide some aspects of a regular currency such as ownership and
+ transactions, they lack most others. Traditional currencies are backed by states, regulated by central banks
+ tasked with maintaining their stability and ultimately provide accountability through law enforcement, courts and
+ political elections.
+
+.. [#discriminatory] Discriminatory as in discriminating against minorities, but also as in deciding what is and what is not.
+
+.. [#accountswitchopsec] This does mean that you should not actively prevent people from creating multiple accounts. It
+ does not necessarily entail building a proper user interface around this practice. If you do the latter, e.g. by
+ offering a "switch identity" button or an identity drop-down menu on a post submission form, you can easily
+ encourage slip-ups that might disclose the connection between two identities, and you make it possible for
+ someone hacking a single login to learn about this connection as well.
+
+.. [#meatspacefn] Meatspace_ is where people physically are, as opposed to cyberspace.
+
+.. _Meatspace: https://dictionary.cambridge.org/dictionary/english/meatspace