Managing Authorisation and Authentication UIs in a Wayland-Based Linux

1. Introduction

After Martin published his article on the security on Wayland, we received plenty of feedback, and among it emerged a discussion on the difficulty of preventing the spoofing of authentication and authorisation dialogs (the former often being used as a by-product for the latter). Such dialogs appear either when you require a privilege escalation (gksu-like) or access to a restricted/privileged interface controlled by the compositor/desktop environment. In the system we envision, applications have restricted privileges and some are awarded special ones (such as the ability to record the screen, receive special keyboard input, etc.). When an app needs a privilege it does not naturally have, it must ask for it through an authorisation protocol. Besides, we also need to provide a means of authentication that resists spoofing, for the few cases where authentication remains necessary. In this article, I explore the threat model, security requirements and design options for usable and secure authorisation and authentication on modern Linux.

Errata: this article is not about when to use authorisation, but about how to design it. I fully concur with the view that the best permission request is the one that does not involve disturbing the user! The ideas discussed here apply to those few edge cases where we may not be able to design authorisation requests away (updated on 2014-03-28).

2. State of the Art

Linux

Authentication: Most of the time, a user will be asked to authenticate by a polkit authentication agent, when trying to perform an operation that requires another account’s credentials. Polkit is quite paradoxical to me, being described as providing an “authorisation API” yet only knowing how to ask users to authenticate rather than authorise. Other forms of authentication include graphical front-ends to su (KdeSu and gksu) which allow running commands with a different identity than the one the user already has.

Authorisation: At the moment, very few situations trigger proper authorisation dialogs on Linux systems. polkitd seems to be the authorisation API of choice, and it maps requested privileged operations to a user’s UNIX permissions and a system-wide policy. Hence, polkitd will either directly authorise an operation, or ask the user to authenticate as someone else who has the requested privileges. With Martin’s proposal on Wayland security, though, we seek to introduce some forms of capabilities in userland, which would create process authorisation use cases.

Here are some examples of commonly-faced authentication dialogs on Linux nowadays (and the only authorisation dialog I could find). As far as I could bother to check, BSD flavours also use polkitd.

Microsoft Windows

Windows has used a single interface for authorisation since Vista, named User Account Control (UAC). Despite its bad reputation and a couple of glitches and security flaws, a bit of scrutiny into the issues surrounding privilege authorisations convinced me that Microsoft’s decision to set up UAC is pretty good (the implementation / security benefits not so, as pointed out here or there – read the comments too). I’m hoping that we can do even better than them by learning from the attacks on UAC and from the reasons that push Windows users to disable it.

UAC is an API that applications must use in order to declare which privileges they require or to ask for extra privileges. Applications start with relatively low privileges, even when the user running them is an administrator. Only some apps signed by Microsoft itself can run with administrator privileges by default. Most apps run with “medium” privileges and they can even drop some if they feel they are at risk of being exploited (hence reducing the impact they can have on others). This is commonly done by Web browsers.

When apps require a user intervention to acquire a new privilege or to run, the user is either prompted to authorise the request or to authenticate as an administrator with sufficient permissions (note also the configuration option that requires an administrator to re-authenticate before the authorisation is granted). The interface for all these requests consists of a dim background being applied to the current desktop, and a modal dialog – completely isolated and protected from other applications – appearing and waiting for the user to input her information/decision. The dialog presents information on the identity of the app (sometimes as little as the executable name, which is not very informative) and the name of its publisher (with different decorations and icons being used to emphasize whether the user should trust that publisher information or not, from signed Microsoft software to unknown publishers’ software).

The topic of this post is Linux so I won’t discuss this further, but note that any app can apparently imitate this UAC dialog’s look (example with KeePass).

Apple OS X

Albeit very interesting, OS X’s authorisation system is not discussed because of the outrageous terms of use of their website, covering the documentation I wanted to cite (see point 3.b of this document).

Let us just note the following:

apps need to get the user to re-authenticate through a third-party daemon to perform privileged operations
apps can check whether the user will need to authorise them for certain operations, so they can reflect their lack of authority in their GUI (useful for system settings)
spoofing attacks on the content of the authorisation dialog are (or were at least in the past) greatly facilitated by the API

3. Privileges in a Wayland-Based Linux

Martin has presented a list of restricted privileged interfaces that could require an application to request an authorisation in his Wayland article. My list is a bit different because I’d include privileges unrelated to Wayland / windowed applications. The reason is that consistency of UIs is a very important element of the security mental model of the user: if they are used to seeing the same UI all the time, they will be more suspicious of spoofing attacks (cf. Section 4). Some typical privileges that require authorisation:

Screenshot taking and screen recording
Screen sharing (VNC, Skype) – possibly identical to the one above
Virtual keyboards and pointing devices
Audio (microphone, amplifier) and video (webcam, CCTV) capture devices
Binding to specific keyboard shortcuts one does not own
Clipboard management
Access to data in app-specific password stores in the user’s keyring
Ability to run without a GUI (off-topic, coming in a later article)

I don’t include interfaces that actually require authentication rather than just authorisation, though I’ll try to discuss the cohabitation between the two mechanisms. These interfaces show the need for both a within-session authentication UI (e.g., gksu) and a cross-sessions authentication UI (the DM’s greeter). Generally speaking, one may either need to provide credentials for an administrator’s account, the root account, or her own account (only when it’s legitimate to include physical adversaries in the threat model):

Installing applications (auth as admin)
Managing users (auth as admin)
Configuring system-wide settings (auth as admin)
Changing one’s own password (requires re-auth, security best practice)
Session locking (requires re-auth from within the greeter)

To sum up, we want to manage three categories of interactions: a user authorising a process, a user authenticating as someone else to borrow their privileges, and a user re-authenticating. It’s important to come up with a single way to graphically do all those things, because consistency is a key to making users differentiate the system’s UI from those of malicious applications. I would love to see a system presenting the following simple motto to the user: “We will ask you to type your own password only when you change it. We will ask you to type an administrator’s password only when you manage system-wide settings.”

Errata: I would like to clarify that privilege granting, in my view, should be done through three sequential processes and not systematically through authorisation UIs:

default system-wide lists of privileged apps (maintained by DEs and by distributors), which can be customised by users
evidence of user intent (e.g., Security by Designation or User-Driven AC)
when none of the above works, authorisation prompts (to provide a situated way to manage security for untrusted apps or uncommon use cases)

(updated on 2014-03-28).
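
The three sequential processes above can be read as a fall-through decision. The following is only a minimal sketch of that ordering, not an existing API: the `PrivilegePolicy` class, the `Source` values and the app and privilege names are all hypothetical, and the `prompt` callback stands in for whatever compositor-controlled authorisation UI ends up being used.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Set, Tuple

Grant = Tuple[str, str]  # (app identifier, privilege name), both hypothetical


class Source(Enum):
    DEFAULT_LIST = "default list"
    USER_INTENT = "user intent"
    PROMPT = "prompt"
    DENIED = "denied"


@dataclass
class PrivilegePolicy:
    # Stage 1: grants shipped by the DE/distributor, customisable by the user.
    default_grants: Set[Grant] = field(default_factory=set)
    # Stage 2: grants backed by evidence of user intent (designation).
    designated: Set[Grant] = field(default_factory=set)

    def decide(self, app: str, privilege: str,
               prompt: Callable[[str, str], bool]) -> Source:
        request = (app, privilege)
        if request in self.default_grants:
            return Source.DEFAULT_LIST
        if request in self.designated:
            return Source.USER_INTENT
        # Stage 3: only now is the user disturbed with a prompt.
        return Source.PROMPT if prompt(app, privilege) else Source.DENIED
```

The point of the ordering is that the prompt callback is never invoked when one of the first two stages already answers the question.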

4. Threat Model

A normal threat model would include a justification of why certain adversaries exist and a clear view of their capabilities. When designing general-purpose operating systems, we can only consider general adversaries with general capabilities. We need to decide which adversaries we design against and which are not our responsibility but that of the people deploying our system.

Here, we consider any adversary able to remotely execute code with the user’s privileges. For instance, an application may turn out to be malicious, or it may be partially or entirely controlled by an adversary through some crafted input fed into it by the user. An adversary may be very interested in either obtaining restricted privileges (for whatever reason) or stealing authentication credentials typically used to grant such privileges.

Snooping/Spying on authentication dialogs (theft of credentials)

Very very few people are aware of this, but the current display server X11 does not provide any isolation between the input events of various applications (which in itself is a sufficient argument in favour of the development and adoption of Wayland). You can easily find tutorials on how to snoop passwords from other windowed applications including su/sudo utilities inside a Terminal app. This allows stealing credentials from any authentication dialog that the user runs. As explained by Martin, this will be prevented by design in Wayland compositors.

Injecting input into auth dialogs (theft of privileges)

Another thing that is possible with X11 is to inject keyboard/mouse input into other windowed applications (example). Attacks on utilities like gksu or kdesu are very easy to perform and can be sophisticated to the point of being barely noticeable by attentive users.

One may for instance perform a timing attack to inject their own binary name instead of the one typed in by the user. This can be slightly mitigated by displaying the full path of the command to be executed and letting the user read it before authenticating. It can also be entirely mitigated by not letting unprivileged apps inject keyboard events into others’ windows.
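
The partial mitigation mentioned above (showing the full path before the user authenticates) could look roughly like the sketch below. It is an illustration only: `resolve_for_display` is a hypothetical helper, not part of gksu or kdesu, and it relies solely on Python's standard `shutil.which` and `os.path.realpath`. It does nothing against the injection itself; it merely makes /tmp/myscript.sh visibly different from /usr/bin/myscript.sh in the dialog.

```python
import os
import shutil
from typing import List, Optional


def resolve_for_display(argv: List[str],
                        search_path: Optional[str] = None) -> List[str]:
    """Return argv with the command replaced by its absolute, symlink-free path.

    search_path is a PATH-style string; None means the caller's own PATH.
    """
    if not argv:
        raise ValueError("empty command line")
    found = shutil.which(argv[0], path=search_path)
    if found is None:
        raise FileNotFoundError(argv[0])
    # realpath also resolves symlinks, so the dialog shows the real target.
    return [os.path.realpath(found)] + list(argv[1:])
```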

Attackers may also invoke any authorisation API themselves and inject mouse events to click on the “Authorise” button of the authorisation dialog on behalf of the user. There are some hacks to protect against this such as randomizing the starting position of the mouse cursor, dialog and dialog contents, etc. However, the only proper solution to this problem is making sure that no unprivileged application can inject mouse events.

I wrote a simple proof-of-concept that injects a prefix to the path of a command when invoking gksu. To use it, you need to time it so that the events are inserted after the user typed the command and before they type Return, leading to the execution of malicious /tmp/myscript.sh rather than benign /usr/bin/myscript.sh. Note that this is not a gksu vulnerability but an X11 one. If the user called gksu myscript.sh instead, I’d just need to move the cursor in between gksu and its argument and then inject the prefix that runs my own malware. If I don’t know the name of the invoked binary, I could replace it rather than prepend it.

These attacks are also prevented by design in Wayland.

Confused deputy (theft of privileges)

I’m just giving some examples on Windows Vista and 7 here because it’s a bit of a larger issue than the graphic stack’s role in UI design.

Windows 7 UAC: the user could feed their own library to a system utility, which was white-listed for UAC; the library could then perform authorisation queries with the identity of that system utility and skip authentication (link)
Windows Vista UIPI: apps can list windows currently open to find which privileged processes to contact for confused deputy attacks (link)

Conclusion: we must keep track of which apps possess privileges and hide them from unprivileged apps, in a systematic way. The matter is not discussed in this article, but comments and ideas are very welcome.

Spoofing dialog UIs (theft of credentials)

This is in my view the hardest attack to prevent. Spoofing occurs when a third-party application imitates the appearance of an auth dialog in order to cause the user to interact with it as they would with the real dialog. If you spoof an authorisation dialog, then you will obtain nothing as the user “authorising” your request on a spoofed dialog will not lead you to receiving the corresponding privilege from the system. Authorisation spoofing is at best annoying noise for the user, nothing that concerns the Wayland protocol.

However, authentication spoofing has very dramatic consequences: if a spoof gets the user to type in their real credentials, those can be used to log into the user’s session or even elsewhere (because of our propensity to reuse credentials whenever possible). Protecting against spoofing is only possible by crafting a UI that cannot be entirely imitated (I’ll present Martin’s ideas on that below).

However, what really matters is the ability of the user to systematically and immediately distinguish any fake from the true UI and to associate the fake with a strong feeling of insecurity. Otherwise, spoofing may very well still occur. The solution to this is to have the authentication dialog authenticate itself to the user by presenting a secret/credential shared with the user (thanks to GaMa for inspiring this requirement). The secret should be used for that purpose, and not be one that is used to authenticate the user as this would allow shoulder surfing from physical adversaries. This means we need a way to generate such a secret when an account is created, to update or modify it, and to store it securely.

Other attacks

Spoofing session greeters: Wayland should impose restrictions on the capabilities of unprivileged applications to leave some design space for greeter designers to make their UIs distinguishable from normal apps’ windows. For instance, unprivileged fullscreen windows shouldn’t be modal, and greeters could be allowed to display authentication dialog secrets to users. Any interfaces related to knowing whether the user is active or inactive or related to (especially automatic) session locking and greeter preferences are good candidates for privileged operations as they would allow an attacker to time the spawning of a fake greeter and prevent the real one from being invoked.

Environmental attacks may also arise: if a distribution allows user-installed locale files, a malicious app may replace the descriptions of authorisations in order to fool the user into believing it is asking for more benign privileges than it actually does. Likewise, some theme engines may give theme designers the opportunity to customise specific fields of a UI that may be used to design a dialog hiding away security details such as the app name and injecting textual content instead (something easily feasible with CSS 3 for instance).

Management of interpreters: When an application is expected to run user-supplied untrusted code, it should not qualify for privilege granting (or only for disposable ones). This concerns interpreters such as Python which can for instance cause the GNOME Keyring to not correctly identify the requesting app (mistaking it for the /usr/bin/python binary). We might need some interpreter-specific black magic or hacks to identify apps within Python and this is well outside my domain knowledge, so I’ll leave this issue aside for now and would welcome any contribution to our design!
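
To make the interpreter problem concrete, here is a deliberately naive sketch of the check a privilege manager might start from. On Linux the executable path of a process could be read from /proc/<pid>/exe; everything else here is an assumption made for illustration, including the list of interpreter names and the two helper functions. It only detects that the requester is an interpreter and does not solve the harder problem of identifying the app running inside it.

```python
import os
import re

# Hypothetical, incomplete list of interpreter names, with optional
# version suffixes such as "python3.11".
_INTERPRETER = re.compile(r"(python|perl|ruby|node|bash|sh)[\d.]*")


def is_interpreter(exe_path: str) -> bool:
    """True if the executable's file name looks like a known interpreter."""
    return _INTERPRETER.fullmatch(os.path.basename(exe_path)) is not None


def may_hold_privileges(exe_path: str) -> bool:
    """Conservative rule: interpreters never qualify for privilege granting."""
    return not is_interpreter(exe_path)
```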

App identity spoofing: In a very similar fashion to the interpreter problem, Windows Vista shipped a binary that allowed running Control Panel plugins with a Microsoft-signed utility’s identity (link here), hence preventing users from knowing which app required authorisations. In OS X, spoofing the app’s name, the description of the desired permission and a bunch of other things was also possible at least in 2009, though I didn’t check how reproducible that issue is now. The great flexibility of their permission requesting API surely made it very easy for malware writers to lie to users about their intentions and what it was they were asking for (click here for more). Even better, some UIs don’t even attempt to show the app’s identity and just leave the user clueless.

Linking to or injecting code into privileged apps: What worries me more is when a genuine app that is privileged by default (e.g., your virtual keyboard software) or can acquire privileges through the user (a Skype call in which you temporarily authorise screen sharing) is exploited into running malicious code. There are a number of obvious techniques for that such as LD_PRELOAD code injections that would trigger some malicious code in genuine authorised applications (examples here and here), or hooking into a running privileged program and injecting code using ptrace. These attacks are very tough to defend against and will be examined in a future article (probably featuring PID namespaces).
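
One narrow piece of this problem, injection through environment variables such as LD_PRELOAD, can at least be illustrated. The sketch below assumes that whoever launches a privileged app builds its environment from an allowlist instead of inheriting the caller's; the `scrubbed_env` helper, the variable names in `SAFE_VARIABLES` and the fixed PATH value are illustrative assumptions, and nothing here addresses ptrace-based attacks.

```python
from typing import Dict, Iterable, Mapping

# Hypothetical allowlist; a real launcher would have to audit each entry.
SAFE_VARIABLES = ("HOME", "XDG_RUNTIME_DIR", "WAYLAND_DISPLAY")


def scrubbed_env(env: Mapping[str, str],
                 allowed: Iterable[str] = SAFE_VARIABLES) -> Dict[str, str]:
    """Keep only allowlisted variables and pin PATH to system directories."""
    allowed = set(allowed)
    clean = {name: value for name, value in env.items() if name in allowed}
    # LD_PRELOAD, LD_LIBRARY_PATH and friends are dropped because they are
    # not on the allowlist; PATH is overwritten rather than inherited.
    clean["PATH"] = "/usr/bin:/bin"
    return clean
```

A launcher could then pass the result as the `env` argument of `subprocess.Popen` instead of letting the child inherit `os.environ`.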

5. Security Requirements

The identified attacks already give us an idea of what requirements must be used to design appropriate auth UIs:

Unprivileged applications should not be allowed to read/modify the input received by other applications’ windows
Applications holding any kind of privileges must be protected from all forms of code injection / debugging.
Applications should receive privileges (and more so be privileged by default) only if they cannot be invoked / controlled by other unprivileged ones
Interpreters of any kind should never receive a privilege, unless the piece of code being interpreted can be safely identified and the privilege cannot be reused/shared
Authorisations make much more sense in a system where apps are sandboxed and access to file systems is limited. If you can’t take a screenshot but can call the screenshot app and then read the screenshot file, then screenshot-taking privileges are useless
There should be a GUI debugging mode where developers can record the auth UI, perform automatic testing, etc.
The UI should always spawn through a trusted path, with an environment entirely controlled by the compositor; if the user can use a previous environment, it should be emphasized that this is an attack vector, and there should be no way for any third-party to enable this option prior to the UI being called
Operations leading to authorisations should be documented and limited, convey a clear meaning; Do not allow custom authorisations (else who would verify the description of the authorisation is clear to the user?)
Apps must be identified clearly by the compositor (names taken from .desktop files in /usr, absolutely never from something modifiable by a user-run process)
Authentication dialogs should authenticate themselves in a way very obvious and non-time-consuming for the user
In an ideal world, there would be one window per process and the user would know which window rather than which process is authorised to do something (the explanation behind this one is quite off-topic and will come in a later article)
In a utopian world, the user knows which data can be affected by an authorisation (e.g., whether their bank website currently on screen will appear on a screenshot, which files’ content can be leaked to an app, etc.) so s/he can make a ‘blink of an eye’ decision; the effects of authorisations should be tangible
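
The requirement that app names come from .desktop files under /usr, and never from a user-writable location, can be sketched as follows. This is an assumption-laden illustration rather than how any compositor does it: `trusted_app_name` and `TRUSTED_ROOTS` are hypothetical names, the parsing uses Python's generic `configparser` rather than a dedicated Desktop Entry parser, and only the unlocalised `Name` key is read.

```python
import configparser
from pathlib import Path
from typing import Iterable

# Assumed location of distribution-installed entries.
TRUSTED_ROOTS = (Path("/usr/share/applications"),)


def trusted_app_name(desktop_file, trusted_roots: Iterable[Path] = TRUSTED_ROOTS) -> str:
    """Return the Name of a .desktop entry, refusing files outside trusted roots."""
    path = Path(desktop_file).resolve()  # resolves symlinks and ".." components
    roots = [Path(root).resolve() for root in trusted_roots]
    if not any(root in path.parents for root in roots):
        raise PermissionError(f"{path} is not under a trusted directory")
    parser = configparser.ConfigParser(
        interpolation=None,   # "%U" in Exec lines must not be interpolated
        delimiters=("=",),
        strict=False,
    )
    parser.optionxform = str  # keys in .desktop files are case-sensitive
    parser.read_string(path.read_text(encoding="utf-8"))
    return parser["Desktop Entry"]["Name"]
```

Resolving the path before checking it matters: otherwise a symlink placed in a trusted-looking location could point back into the user's home directory.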

6. Authorisation UIs

Because I’m not so convinced that we’ve yet found a UX that makes spoofing intractable by design, I believe it’s important to separate authentication from authorisation so that spoofing does not compromise valuable tokens (i.e., authentication credentials). Authentication has, for long, been used as a proxy for authorisation on information systems, assuming maybe that with the all-too-flexible APIs an app can use to impersonate the user who runs it, asking a user for a secret was the only way to distinguish her from the app. Since we’re speaking about window isolation in Wayland, we can finally start to put some trust in GUI interactions with the user conveying an authentic rather than fabricated meaning. Hence, GUI operations may become a viable proxy for authorisation tokens. An authorisation token is typically a one-time use object generated by a trusted authority (the compositor) and used by the system controlling access to privileged interfaces (the WSM). Such tokens can be distributed by having the user interact with an authorisation UI controlled by the compositor.
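
As a rough illustration of what "one-time use" means here, the sketch below models the compositor side issuing a token once the user has approved a request, and the access-control side redeeming it. The `TokenAuthority` class and its method names are hypothetical, both roles live in one process for brevity, and burning the token on any redemption attempt (even a mismatched one) is a design assumption of this sketch.

```python
import secrets
from typing import Dict, Tuple


class TokenAuthority:
    """Toy model of one-time authorisation tokens."""

    def __init__(self) -> None:
        self._pending: Dict[str, Tuple[str, str]] = {}

    def issue(self, app_id: str, privilege: str) -> str:
        """Called after the user approved the request in the trusted UI."""
        token = secrets.token_urlsafe(32)  # unguessable random string
        self._pending[token] = (app_id, privilege)
        return token

    def redeem(self, token: str, app_id: str, privilege: str) -> bool:
        """Valid once, and only for the app and privilege it was issued for."""
        granted = self._pending.pop(token, None)  # the token is consumed here
        return granted == (app_id, privilege)
```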

Asking for privileges

Essentially, authorisation UIs require that a user receives information about a request (the identity of the requester and what is being asked for) and makes a decision (an “Authorise” and a “Deny” button, or variants in formulation). Additional information can be given such as the history of authorisations for the requester, the duration of the authorisation or whether the system has any trust in the application (if possible). Anything provided by the app itself should be left out of the UI as attackers will make sure to exploit it – typically one should not let the app explain why the authorisation is being requested, as users’ decisions are influenced by such information. Besides, it is a well known fact among HCI practitioners that people get habituated to computer prompts and tend to ignore their contents when they are frequent enough. Security prompts often offer no benefit for the fulfillment of users’ primary task and so are just treated as an unavoidable disturbance. There often is no noticeable immediate consequence to a wrong security decision, and so users will be more likely to authorise systematically than if they could monitor how the authorisation is being used. Hence, I do not assume that the user does realise:

which app is asking for a privilege
what privilege is being requested
how long it will be granted for

I’m interested in strengthening the basic authorisation dialog so as to obtain stronger evidence that the above properties hold. When it comes to the privilege being properly identified in a blink-of-an-eye, I can only think of having a very effective visual representation of each privilege, such as displaying a large icon (possibly animated if it helps to come up with a representation, e.g., data flows) on the dialog. Images are recognised better than words (though not always recalled better especially if hard to label, which means we should provide the label with the image). Recognition is superior probably because they contain richer information than short sequences of words. For the same reason, images can be made highly distinguishable from one another for each privilege and hence help users notice a new privilege and take the time to read its description. Below are some quick and dirty examples of such visualisations (showing as well my design iterations).

Knowing Who’s Asking

As for app identification, I don’t think that displaying a short name or icon prominently is sufficient. This data cannot be trusted especially for applications not installed through one’s distribution repositories. The user should see which running application is requesting a privilege rather than just be given a name. Apps without a window, panel plugin or other GUI element can hardly fulfill this requirement, because users have nothing to hold on to to identify whether that app is running or to shut it down.

Besides, app names and icons identify an application rather than a running instance of it – a specific window or other tangible entity the user can interact with. Tangibility plays an important role in facilitating users’ understanding of a technological phenomenon (examples on network infrastructures and on file sharing mechanisms), hence it would be desirable to provide a relationship between the UI and the app that makes the user feel which application is receiving a privilege. There are such relationships of a spatial nature:

Authorisation UI within the app’s window (when a window exists) – zero-step cost
drag and drop or copy and paste an authorization token (made tangible) to the app’s GUI – one-step cost
using techniques like in “Your Attention Please” – zero to many-steps cost

In this model, apps would lose their privileges when their GUI is shut (regardless of whether the underlying process still runs) and be restricted from acquiring new ones. Applications without a GUI could obtain a special privilege (“Performing privileged operations in the background/without telling you”) to bypass this restriction. Below are some examples of authorisation icon mockups I made (with one very obvious trademark violation that cannot be used). Ideas and critiques are welcome, quite obviously.

7. Authentication UIs

Authentication is much more sensitive to spoofing than authorisation, as previously explained. Let us review three defence mechanisms we came up with for this task: unspoofable UI, Windows’s secure attention sequence, and UI authentication to the user.

Stuff Only the Compositor Can Do

Martin proposed that an unspoofable UI uses abilities that only the compositor has. For instance, a compositor can modify the position, size and display of all windows. When an authorisation UI is launched, windows that were already open could have a wobbly animation applied to them (until the UI is closed). Some animations are even particularly effective at causing epilepsy attacks! :-) If animations cannot be applied on a system (legacy GPUs, a11y issues, etc.), simple modifications such as an Expose-like display of windows could indicate that the compositor runs the authorisation UI’s code.

The most compelling issue with manipulating only windows is that it requires windows to be open in the first place. Other approaches could include taskbars, systrays or even the desktop wallpaper, knowing that in each case the information to be used must be hidden from all desktop apps the user runs and that it must vary or be routinely customised by users. The idea is to display/transform elements of the desktop that exist regardless of the app requesting an authorisation, and to make sure that a normal app cannot display exactly the same thing. It also matters that the transformation being applied is very consistent, so the user can be habituated to it and notice differences more easily. Indeed, an attacker may try to apply animations with generic windows placed randomly, or a generic task bar, hoping that the user will not pay attention to the information displayed in the background. This is especially true if such a UI is deployed in a system where the DE’s config files and the wallpaper can be read by any application. The attacker may also try to run a simple dialog with no animations/transformations if those are not obvious. In any case, security remains mostly the responsibility of the user.

As far as I’m concerned, I doubt users make the link between the presence of certain visual cues in the background rather than others and the fact that a UI is not a fake but controlled by the compositor. They probably just expect a window to declare who it is run by – system or apps (links to serious surveys/studies on the topic much appreciated), and I would assume that as long as the spoof looks similar enough to the real UI, attacks will work. Let’s not throw the baby out with the bath water, though. Such ideas may make it a tiny bit harder to abuse the user, at relatively little development cost. Besides, this measure costs nothing to the user in terms of mandatory extra steps to take in a decision process. This means the usage of this defence mechanism is optional and depends on the user’s willingness to waste time, rather than imposed on her/him.

Applying Windows’ Secure Attention Sequence

Input filtering in Wayland allows us to catch and process specific keyboard key sequences that are not exposed to applications. Windows uses the infamous Ctrl+Alt+Del sequence (because it was virtually unused by applications at the time Vista was being developed) prior to displaying an authentication dialog to its users. Indeed, users are expected to notice authentication UI spoofs because these would fail to react to them performing the Ctrl+Alt+Del sequence. The name for this sequence is Secure Attention Sequence (SAS later on).

Timothee proposed to recycle this idea in a slightly different way. In his model, rather than the auth UI asking the user to perform the SAS, it is the fact of typing the SAS that would allow the auth UI to spawn and allow the currently focused application to request a privilege from the user. Apps would then have to ask the user to type a SAS in whatever way they prefer, which allows users to do nothing if they’re not willing to authorise the application. This would alleviate some of the exasperation Windows users had with Windows User Account Control, at the expense of some clarity on when the user’s expected to authorise/authenticate.
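
A minimal sketch of this SAS-first flow, under assumptions of my own: apps register a pending request with the compositor, nothing is shown until the SAS is typed, and only the focused app's request is surfaced. The `SasDispatcher` class is hypothetical, and returning `None` when the focused app has nothing pending is just one possible answer to a question the model leaves open.

```python
from typing import Dict, Optional, Tuple


class SasDispatcher:
    """Toy model: the auth UI spawns only in response to the SAS."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}  # app id -> requested privilege

    def request(self, app_id: str, privilege: str) -> None:
        """An app queues a request; no UI is shown at this point."""
        self._pending[app_id] = privilege

    def on_sas(self, focused_app: str) -> Optional[Tuple[str, str]]:
        """Called by the compositor when the user types the SAS.

        Returns the (app, privilege) pair the auth UI should present, or
        None if the focused app has no pending request.
        """
        privilege = self._pending.pop(focused_app, None)
        if privilege is None:
            return None
        return (focused_app, privilege)
```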

There are many potential attacks and reasons for confusion here. What should happen when the user presses SAS but no application is requesting privilege? What if an app asks the user to press SAS, and attempts to spawn an authentication dialog before the user does so (listen to Ctrl+Alt sequences to improve your odds)? Would they key in their password? Even weirder is the case where an app spawns a spoof dialog right after a successful authentication with the compositor’s UI: the user would probably consider this a glitch/bug and re-type their password.

An app could also ask the user to press a SAS by giving it a very credible justification, and then ask the system for an entirely different privilege, hoping that the user would not double-check the justification given in the compositor-controlled UI. As I’ve said before, users are quite sensitive to justifications and would probably be less on their guard after they typed in their SAS since they’ve essentially already made the decision to authenticate.

Some apps could even try to get the user to authenticate without even bothering with asking for a SAS to be typed. After all, major Web browsers already use keyrings with custom master passwords, and there probably are a bunch of other applications asking for users to type passwords on a regular basis. I’m actually interested in hearing the opinions of developers of such applications on replacing their authentication mechanisms with a system-provided per-app keyring that only requires (secure) authorisation. The keyring could store a decryption key for those like Google Chrome who want to synchronise passwords with a third-party server, while still allowing users to use authentication-free keyrings and hence reducing the extent of harmful authentication habituation.

All in all the design sounds interesting but is not without consequences. The main issue for me is that plausible attacks result in credential theft, in a system that does come with a systematic cost to the user. We should only consider SAS mechanisms if we cannot find better for the same interaction cost.

Authenticating to the User

Apart from the aforementioned anti-spoofing measures, we’ve identified one key requirement for secure authentication UIs: they must authenticate themselves to the user. Obviously such a dialog should be modal and protected from any form of recording, including applications with screen recording or sharing privileges. The idea that was originally proposed by GaMa on LinuxFR was to display a secret image chosen by the user at account creation time. The reason I like the idea of an image is that it is easier to recognise than a word or piece of text. Though, one must also consider accessibility issues associated with visual content, and so it should be made possible for a secret to also take the form of a passphrase. Time’s running and so I won’t be making mock-ups now for those dialogs.

A secret could be generated at two different moments of an account’s life: when the account is created in a GUI environment (Live OS installer or account creation from an existing system), or when the user enters the session for the first time (a bit intrusive though). The latter is also necessary should the user’s secret be erased (which may happen when a disk dies, for instance). Distributions could ship a database of ~120 different thumbnails that are dissimilar to one another. These should of course be displayed in a completely random order, to guarantee diversity between accounts for those users who don’t bother to pick one and just click “Next” (hopefully these users will learn to identify and recognise their secret image over time, before they get attacked). When there is evidence the user cannot view images (running the high-contrast theme, having checked a box in the installer indicating accessibility (a11y) issues, running a11y software, etc.), the thumbnails could be replaced by a database of quotations from authors, or could be accompanied by a description to allow switching back and forth between the normal and a11y modes of the desktop environment.
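The random display order mentioned above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the thumbnail identifiers, the function names enrolment_order and default_secret, and the rule that a user who just clicks “Next” ends up with the first image shown are all hypothetical and not taken from any existing installer.

```python
import secrets


def enrolment_order(thumbnail_ids):
    """Return the thumbnails in a fresh, unpredictable display order."""
    rng = secrets.SystemRandom()  # draws from the operating system's CSPRNG
    order = list(thumbnail_ids)   # copy, so the shipped database is not modified
    rng.shuffle(order)
    return order


def default_secret(order):
    """Assumed default for users who just click "Next": the first image shown."""
    return order[0]


# Hypothetical database of 120 thumbnail identifiers.
thumbnails = [f"thumb-{i:03d}" for i in range(120)]

order = enrolment_order(thumbnails)
secret = default_secret(order)
```

Because the order is drawn afresh for every enrolment, two users who both accept the default are unlikely to end up with the same secret image, which is the diversity property the paragraph above asks for.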

When it comes to storing this secret, we could either harness mandatory access control systems (SELinux, TOMOYO Linux, etc.) or create private filesystems for each process. Martin thinks it should be feasible with Linux filesystem (mount) namespaces, which systemd already uses to provide private tmp directories to services. I will be looking into options for process isolation (including FS) in the next few months anyway.

From this I conclude that the main issue with unspoofable authentication UIs is accepting the idea of adding a step to user enrollment on Linux systems, which is not an easy one. However, in a world where we still force authentication after authentication down the user’s throat, I believe the threat of spoofing is too big to be left unaddressed.

8. Conclusion & Acknowledgments

First of all, I would like to thank Martin Peres for our lengthy discussions of the threat model and solutions, and for coming up with some of the ideas presented here. Likewise, Timothée Ravier has helped shape part of this article, and pseudonymous linuxfr contributor GaMa hinted at a very useful design idea for authentication UIs.

In this paper, I’ve discussed common attacks against auth UIs, summarized the needs and security/usability requirements for the tasks of authorisation and authentication, and proposed initial interaction designs that would bring what I view as an acceptable compromise between usability, user experience and security. I want to insist on the importance of keeping a clear semantic separation between authorisation and authentication: the two tasks carry very different security risks, and the cost of authorisation can be greatly reduced if we avoid replacing it with the more interaction-heavy task of authentication.

Besides, credential spoofing would be harder if all the legitimate authentications users are exposed to were performed through a single interface, so that they grow used to seeing exactly the same thing. So whatever solution we design for Wayland privileges must be re-usable by other FOSS projects that need to perform authorisation or authentication (e.g., password stores). Rather than reinventing the wheel, I think one should look to extend/adapt polkitd’s API (to distinguish between authorisation and authentication) and then constrain the APIs for polkitd authorisation and authentication agents to reflect our identified requirements.
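To make that distinction concrete, here is a minimal Python sketch of a request that carries an explicit kind. None of these names (AuthKind, AuthRequest, agent_needs_credentials) exist in polkit. They are hypothetical and only illustrate what an extended API would need to express, as are the application identifier and privilege name in the example.

```python
from dataclasses import dataclass
from enum import Enum


class AuthKind(Enum):
    """Hypothetical request kinds; not part of polkit's actual API."""
    AUTHORISATION = "authorisation"    # the user consents, no credentials typed
    AUTHENTICATION = "authentication"  # the user proves their identity with a secret


@dataclass(frozen=True)
class AuthRequest:
    """Hypothetical request an app would hand over to the trusted agent."""
    kind: AuthKind
    app_id: str
    privilege: str
    justification: str


def agent_needs_credentials(request):
    """Only authentication requests should ever lead to a credential prompt."""
    return request.kind is AuthKind.AUTHENTICATION


# Example: a screen recorder asking for a privilege it does not naturally have.
request = AuthRequest(
    kind=AuthKind.AUTHORISATION,
    app_id="org.example.Recorder",   # made-up application identifier
    privilege="screen-recording",    # made-up privilege name
    justification="Record a screencast of the current window",
)
```

With the kind stated explicitly, an agent can refuse to show a password field for anything that is merely an authorisation, which keeps the cheaper interaction from drifting into the more expensive one.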

Wayland compositors could then use polkitd and their own auth agents to expose the Wayland-defined privileges and the others I discuss in this article, should they want to. I believe the Wayland project is an excellent place to first acknowledge the need for better polished auth UIs and to provide the necessary infrastructure laid out above. I hope to have demonstrated that building safe auth UIs goes far beyond the scope of just a desktop environment or just the graphics stack. The corollary to this is that compositor developers, distributors and ultimately app developers could/should be issued with recommendations on what the next steps are, so that we ultimately build a more secure and consistent experience for Linux desktop environment users. Hopefully others will agree with me and I will be able to turn this article into a FreeDesktop spec. If you too think fixing Linux’s security is worth the effort, please comment below!

Are you a student?

If you’re knowledgeable about usability evaluation (or ergonomics/interaction design/UX), I’m looking for someone to evaluate the various designs above (with an academic publication in mind). This could be run as a UCL MSc project supervised by me and Prof. Angela Sasse, and I’m keen to explore available options for non-UCL students willing to collaborate with us.