Trusting Functionality

One of the major challenges we face in the design of our new linguistic command-line project is trust. As Zittrain observes in The Future of the Internet, this is the fundamental problem of generative systems, and also their most valuable asset: the ability for a user to run arbitrary code is at once what gives the personal computer its revolutionary power and what makes it so vulnerable.

At present, because our project is still in the prototyping stage, we’re opting for expressive freedom and experimentation over security. That means that all the verbs we write, while authored in JavaScript, are executed with full chrome privileges: they’re capable of doing whatever they want to the end-user’s computer.

So the particular dilemma that needs to be solved here is: how can an end-user trust that a verb won’t do anything harmful to their data or privacy, whether intentionally or accidentally, while we still keep the barrier to entry low for aspiring authors who want to write and distribute their own verbs?

We’ve considered some technical options so far. One is the idea of “security manifests” that accompany verbs, specifying what a verb is capable of doing. For example, the “email” verb mentioned in my last post could declare in its manifest that it needs access to the user’s email service and their current selection. This information could then be presented to the user when they choose to install the verb. At an implementation level, the code could run in a specially constructed sandbox to ensure that it never steps outside the bounds prescribed by its manifest. Alternatively, or in addition, an object-capability subset of JavaScript like Caja could be used. Such mechanisms ensure that untrusted code can only go as far as the end-user lets it, which unfortunately also puts a burden on said end-user. While I don’t mind carrying that burden myself, I know I wouldn’t want to put it on my friends and family.
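
To make this concrete, here’s a rough sketch of what such a manifest might look like for the email verb. The field names and capability strings are invented for illustration; they aren’t a final design:

```javascript
// Hypothetical security manifest for the "email" verb.
// Field names and capability strings are illustrative only.
const emailVerbManifest = {
  name: "email",
  author: "someone@example.org",
  capabilities: [
    "read-selection",          // needs the user's current selection
    "network:mail.google.com", // may only contact the user's mail service
  ],
};

// At install time, the browser would show this list to the user;
// at run time, a sandbox would refuse any call outside these bounds.
```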

Digital certificates are another component of potential solutions, but they have problems of their own. While they’re easy for centralized corporations to deal with, they’re problematic for more distributed operations, and the monetary cost of obtaining one significantly raises the barrier to entry for individual software authors. And even signed code doesn’t prevent the more privacy-invasive, but not outright malicious, classes of software like spyware.

As I’ve indicated earlier, this general issue of trust isn’t a new problem, nor is it important only to an experimental Firefox add-on. It’s what Zittrain believes is at the core of the future of the internet and the PC, and the solutions we create, or fail to create, will determine whether that future is sterile or generative. Windows Vista, responding to Windows’ status as the most frequently exploited operating system (a natural result of its widespread use), is the harbinger of a future that relies entirely on technological and corporate trust hierarchies without taking any kind of social mechanism into account; the result is a notoriously hard-to-use interface that places an enormous burden on the end-user to constantly make informed security decisions about their computing experience. I don’t think the answer is merely a “less extreme” version of Vista, which is the model most other operating systems and extensible applications seem to follow; a more effective solution is primarily a social one, supported by technological tools.

I have a particular solution in mind that I’ll be writing about soon. That said, I’d still love to hear any thoughts that anyone has on this topic.

18 Replies to “Trusting Functionality”

  1. I’ve always wondered about the feasibility of a web-of-trust-style system: http://en.wikipedia.org/wiki/Web_of_trust

    I could add certain people that I trust, similar to installing addons from non-mozilla sites.

    This would also open up a competitive marketplace for arbiters of trust. Mozilla could be a trusted intermediary, Lifehacker.com could be a trusted intermediary, and Ubuntu or SourceForge could enter the market by signing open-source verbs.

    You could even have micro-schemes: last.fm signing verbs that support its service.

    You could also have the inverse: I could trust Spybot to tell me who /not/ to trust.
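
    In code, verifying a verb against a user’s chosen intermediaries might look roughly like this (key handling is simplified, and all names are placeholders):

    ```javascript
    // Sketch: accept a verb only if it carries a valid signature from
    // an intermediary this user already trusts. Names are placeholders.
    const crypto = require("crypto");

    // name -> public key (PEM) of intermediaries the user trusts:
    // Mozilla, Lifehacker.com, last.fm, a friend, ...
    const trustedSigners = new Map();

    function checkVerbSignature(verbSource, signatureBase64) {
      for (const [name, publicKey] of trustedSigners) {
        const valid = crypto.verify(
          "sha256",
          Buffer.from(verbSource),
          publicKey,
          Buffer.from(signatureBase64, "base64")
        );
        if (valid) return { trusted: true, signedBy: name };
      }
      return { trusted: false }; // unsigned, or signed by an unknown party
    }
    ```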

    Payment could come from various sources, or the service could be free:
    * Free services
    * Ad-supported services
    * Developer-pays services
    * User-pays services
    * Bundled services

    …and so on.

  2. Sam,
    GreaseMonkey hasn’t reached beyond people like us yet. We are the discerning users.

    My mum doesn’t use GreaseMonkey, and if she did, she wouldn’t know how to tell a trustworthy script from an untrustworthy one.

    If we tried to broaden the appeal of GreaseMonkey to these people we would have to start thinking about security and trust in a more structured way.

  3. I think this is the type of problem that social networking has already solved, by using reputations/karma. Unfortunately, this solution requires an external authority (or group of them) to be trusted.

    Anyway, my idea is:

    A user finds a new Verb on some random website. Given that it’s a valid Verb, it has the following properties:
    * It has a unique URL (see http://azarask.in/blog/post/sharing-streamable-functionality/)
    * It has a unique signature: a hash of the file contents (e.g., SHA-1). This signature works as a version number. The Last-Modified HTTP header can’t be used because it’s very easily faked (it isn’t secure, though updates can still be pulled based on it).

    The browser discovers this new Verb the user wishes to use, and sends a request to an authority server (or group of them), giving the URL and signature of the Verb. The server returns a reputation score, for that specific version of the Verb.

    Before the user can use the Verb, he/she is told its reputation, and can accept or deny the use of the Verb. A Verb can also be blacklisted if it’s known to be malicious. Once the user has used the Verb, he/she is offered the chance to give it their own rating, which will contribute to its reputation on all the authority servers the browser knows about. Special testers (who are known to be trusted, like AMO editors) can also test Verbs and submit their results to authority servers, raising the reputation significantly (or adding a “tested” flag).

    This system sort of works like AMO, except Verbs are hosted externally (and don’t even need to be listed). So while the user gets trust-related feedback, it doesn’t take several months for a Verb to be publicly listed. The only thing that’s centralised is reputation (and even that can be distributed among different authority servers). This type of system also has the lowest possible barrier for the Verb author: the only thing they need to do is put the Verb online somewhere (it need not even be public).
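
    In rough JavaScript, that lookup might go something like this (assuming a runtime with both fetch and Node-style crypto; the authority server’s URL and response fields are made up):

    ```javascript
    // Sketch: hash a verb's source and ask a (hypothetical) authority
    // server for the reputation of that exact version.
    const crypto = require("crypto");

    async function getVerbReputation(verbUrl) {
      // The hash of the contents doubles as a version identifier.
      const source = await (await fetch(verbUrl)).text();
      const signature = crypto.createHash("sha1").update(source).digest("hex");

      // Ask an authority server about this URL + signature pair.
      const query = "https://authority.example.org/reputation" +
        "?url=" + encodeURIComponent(verbUrl) + "&sig=" + signature;
      const report = await (await fetch(query)).json();

      // e.g. { score: 0.92, tested: true, blacklisted: false }
      return { signature, ...report };
    }
    ```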

    I’d appreciate any feedback/criticisms anyone has on my above comments.

  4. Blair,

    I see a couple of issues with that:

    1) It feels easy to game: hire a bunch of cheap click-workers and you could inflate the score of a malicious script.

    2) Normal people rating? Based on what?

    It would take /a lot/ of education to get people to rate based on security and actual quality rather than on how pretty the software is. All that eBuddy/Gator-style software would be rated highly if the rating were taken as a consensus of the people who downloaded it.

    Another issue with any system is how you handle inherited trust. Does every new version start as untrusted? This seems a bit harsh, and would discourage rapid release schedules.

    Is it the author who earns the trust? Again, potentially problematic, and it could be gamed: release a good piece of software, then make your next piece malicious. (Far-fetched? Various companies have suddenly added dubious adware to previously good software.)

    Not that it solves this problem, but an interesting approach is the macro architecture of TiddlyWiki. A macro is a tiddler, which can be imported from the hosting wiki into your own. The tiddler can even be linked to its source so that updates are automatically ported across.

  5. It would be easy enough to use sandboxing software to grant almost no system privileges to whatever software you are using, for example Firefox. That way, it’s nearly impossible for a black-hat command to do damage to your system. I’m not aware of many vulnerabilities in sandboxing, so perhaps that would help?

  6. I look at reputation ratings as the common user’s view of how well a verb and its underlying code do their job. Basically, mass rating/ranking.

    From a security standpoint, how do OpenID, Thawte, or VeriSign work, for that matter? We as customers trust these authentication providers to validate that a person or system is who or what it claims to be, certifying the authentication between ourselves and a third party.

    There must be security frameworks/methods out there that can be used to provide the necessary protection.

  7. Well, there’s a problem with sandboxing: I WANT to trust certain sources. The first thing I did was a save-to-web verb, which works on a list of sites that I own. Now whenever I use edit, I upload the tweak to the site over an SSH pipe. I use this for minor tweaking, though; I had an older system that did the same, but the command line works better for my very varied needs and is simpler to maintain. Originally I just wanted this as a Greasemonkey script.

    But see, I can’t do it there. It’s nothing hard, but still: this is the perfect platform, sitting right alongside Greasemonkey, which can’t do it because it doesn’t trust me myself. Sure, I can change Greasemonkey, but then I need to maintain it at all times (that’s what I did).

    So the point is that general sandboxing kills usefulness. I’d rather see a .NET-like sandbox, where the author defines a fairly elaborate list of what the thing needs, is granted those rights, and needs a full review if those bounds ever change.

    You might wonder if this is a valid use case. Well, I worked for a university, and there were lots of pages. The most frequent grumble was a typo here and an incorrectness there. The problem was that you couldn’t correct it where you were: you had to switch out of the browser, start other apps, find the spot, and so on. That led to “I’ll do it later.” Guess whether they ever did.

  8. I think the option of using a capability model would be the most flexible, but there remains the problem of how the user would decide what is safe at install time.

    During installation, the app being installed would have to request some capabilities, such as:
    “list and focus tabs”
    “read browser history”
    “exchange data with http://service.com”
    “read and modify the current selection”
    “read and modify the entire page”.

    The question is how granular the exposed capabilities should be, and whether end-users would be able to handle those decisions.
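
    As a sketch of how that might be enforced (the capability names and API are invented, not a real Ubiquity interface), the sandbox could hand a verb only the objects its manifest requested, so everything else is simply out of reach:

    ```javascript
    // Sketch: build a per-verb API containing only granted capabilities.
    const CAPABILITIES = {
      "read-selection":   (env) => ({ getSelection: () => env.selection }),
      "modify-selection": (env) => ({ setSelection: (t) => { env.selection = t; } }),
      "list-tabs":        (env) => ({ listTabs: () => [...env.tabs] }),
    };

    function makeSandboxApi(manifest, env) {
      // Anything the verb didn't request simply doesn't exist in its scope.
      return manifest.capabilities.reduce(
        (api, cap) => Object.assign(api, CAPABILITIES[cap](env)),
        {}
      );
    }

    const env = { selection: "hello", tabs: ["a", "b"] };
    const api = makeSandboxApi({ capabilities: ["read-selection"] }, env);
    api.getSelection();      // "hello"
    typeof api.setSelection; // "undefined": that capability was never granted
    ```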

  9. The bottom line here is that no system (which interacts with the outside world) is completely safe. The question, then, is: how do we deal with compromises and security breaches when they inevitably happen? This question should inform the design process. What can we do to combat the attacks when they come? One approach could be to heuristically scan new verbs against an evolving database of malicious exploits. Then a warning dialog could inform the user of the risk. A built-in reporting system would be helpful too. The Herd is the logical starting point for such a system. What about including some kind of a ‘panic button’ for the user to send back information if things go wrong? Could such a feature be incorporated into Safe Mode? And could Safe Mode be made a bit easier for the average user to access? Perhaps there could be a system of restore points so that a user could roll back to the last good configuration?
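
    A first pass at that heuristic scan could be as simple as pattern-matching against known-bad constructs (the signature list here is invented and far from exhaustive):

    ```javascript
    // Sketch: scan a verb's source against an evolving list of
    // suspicious patterns before warning the user. Illustrative only.
    const exploitSignatures = [
      { name: "raw chrome/XPCOM access", pattern: /Components\.classes/ },
      { name: "eval of dynamic code",    pattern: /eval\s*\(/ },
    ];

    function scanVerb(source) {
      const hits = exploitSignatures.filter((s) => s.pattern.test(source));
      return { risky: hits.length > 0, matches: hits.map((h) => h.name) };
    }

    // A warning dialog could present scanVerb(source).matches, and a
    // "panic button" could report new samples back to something like the Herd.
    ```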

  10. I think it’s a good idea to implement multiple paradigms, all with a social trust infrastructure. I envision a system whereby you have a network of trusted or untrusted peers, each with a varying level of trust. For instance, my friend Mike is a serious JS developer, so I would trust every script that he trusts and everyone whom he trusts. I would trust nothing just because my grandmother trusts it, but I would still be her peer and suggest scripts to her. I would trust all scripts published by most of my friends, but not all, and I would trust many of their friends.

    I think it would also be a good idea to let non-developers comment on and rank individual segments of code, so that if someone I trust, who is good with JS, ranks a specific function in a script as low-security, I wouldn’t trust that script; whereas if someone I don’t know at all ranked the same function, I wouldn’t take it into consideration nearly as much.

  11. As to the sandboxing and granular privileges, I think that is a necessary ingredient. Otherwise, a barrier will be created against new developers once a few exploits happen. A new developer benefits from the ability of users to try their code without having to take much risk.

    But reputation and trust are crucial too. I believe the right model is a *transitive* one. Reputation is calculated from the point of view of the end user. They trust a certain set of third parties (certain groups and vendors, their friends, certain experts). These entities trust others, and so on. If there is a path composed of high-trust links, then the end node is trusted. Otherwise, it is not (regardless of how many people may have downloaded the software or rated it).

    Such a web of trust is an ecosystem. Non-leaf nodes in the ecosystem will need a persistent presence. The trust links will have to be standardized, e.g. with something similar to microformats and/or some form of a query system. The ecosystem will be distributed – it should *not* be controlled by a few players. If it is controlled by one or a few, global adoption is impossible.

    Different types of users can have varying sophistication in their trust UI. An end-user might have a default set that comes with Firefox or Ubiquity, to which they can add additional ones. They will just say “I trust X”. A sophisticated user can rate on multiple dimensions with varying weights – e.g. “I trust Ubuntu regarding stability with weight 0.8”, “I trust Gatorsoft regarding privacy with weight 0.2”.
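
    A toy version of that transitive calculation (the names, weights, and threshold are all invented):

    ```javascript
    // Sketch: trust is transitive, computed from the end user outward.
    // Each link carries a weight in [0, 1]; path strength multiplies.
    const trustLinks = {
      me:      { mozilla: 0.9, mike: 0.8 },
      mozilla: { "verb:email": 0.9 },
      mike:    { "verb:translate": 0.7 },
    };

    function trustScore(from, target, seen = new Set()) {
      if (from === target) return 1;
      if (seen.has(from)) return 0; // don't loop through the graph
      seen.add(from);
      let best = 0;
      for (const [next, weight] of Object.entries(trustLinks[from] ?? {})) {
        best = Math.max(best, weight * trustScore(next, target, seen));
      }
      return best;
    }

    const isTrusted = (t) => trustScore("me", t) >= 0.5;
    isTrusted("verb:email");   // true: 0.9 * 0.9 = 0.81 via mozilla
    isTrusted("verb:spyware"); // false: no trust path exists
    ```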

    This is a project I wanted to take on for a few years now.

  12. When considering the race between new code, attacks, and security warnings and patches, we should consciously slow down the mass deployment of new code to mainstream users by a few days or weeks.

    Many Linux distributions are an example of this process (take Debian, with its unstable, testing, stable, and security repositories). Of course, individual verbs are much simpler and more independent of each other, so turnover can be much faster than in Debian.

    But a short delay, keeping new code in a “testing/alpha/unconfirmed” category, gives the community the upper hand against bugs and attackers:

    On the one hand, an exploit must first pass through a more critical community of better-educated testers, with the risk of being detected. An automatic index (i.e., a verb search engine) could focus attention on verbs that request high capabilities, further adding risk for an attacker.

    On the other hand, any slowdown reduces the immediate reward and practicability of many attacks: “hit-and-run” is a lot easier than “hit-wait-hide-and-run”.

  13. Remember that trust isn’t necessarily binary. You can 1) not know anything about a script, 2) trust the script, or 3) actively DISTRUST the script.

    If someone you trust marks a script as untrustworthy, it should count much more than a “trust” vote.

    Userscripts.org is binary, for instance. You either assume a script is trustworthy or actively mark it “spam/harmful/vague”.
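
    In other words, something like this weighting (the 3x distrust penalty is an arbitrary choice for illustration):

    ```javascript
    // Sketch: three-valued trust, where an explicit distrust vote from
    // a trusted voter outweighs several ordinary trust votes.
    function scriptScore(votes) {
      // votes: [{ weight, verdict: "trust" | "distrust" | "unknown" }]
      return votes.reduce((sum, v) => {
        if (v.verdict === "trust")    return sum + v.weight;
        if (v.verdict === "distrust") return sum - 3 * v.weight; // counts more
        return sum; // "unknown" contributes nothing
      }, 0);
    }

    scriptScore([
      { weight: 1.0, verdict: "trust" },
      { weight: 1.0, verdict: "trust" },
      { weight: 0.9, verdict: "distrust" },
    ]); // -0.7: one strong distrust overrides two trust votes
    ```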
