All Window Managers Suck
My take on today’s desktop computer interfaces


1 Introduction

I hate inefficient computer interfaces. I run programs on my computer so I can use them, not manage their layout. Managing windows is mindless overhead, so the computer should help take care of that overhead. Fortunately, there is some pretty neat software that does exactly that, but I think these programs conflict with each other and create a confusing, unintuitive, hostile environment, even for enthusiasts!

1.1 Motivations

I’m always confused when I have to switch contexts. Computers, in particular, inflict multiple different paradigms onto me, all the time, at the same time. I believe that a significant amount of context switching can be avoided by defining whose responsibility it is to manage windows. The job of window management is truly overloaded: there should only be one program that does it well [?].

Firefox and Emacs are among the worst offenders, but they came into existence due to either a lack of features in the (X11) window manager, or a lack of a window manager altogether. Emacs, by itself, is sufficient for almost all day-to-day tasks: you can fetch and send your email, browse the web, message your friends, and edit text. However, living entirely in Emacs rules out the majority of applications: those that are not built on top of Emacs and rely on widget toolkits (GTK, Qt, etc.).

1.2 Goals

To create a desktop environment. I want to create a window manager, and define a set of graphical tools that go well together. Many of these tools might need to communicate with the window manager so I will need to build some of these tools; thanks to the open source community (especially the suckless community), I can start with high quality, modular, graphical software (st, surf, etc.) and piece them together. I also plan on taking some bloated graphical software (Emacs, Vim, etc.) and stripping them down via plugins and/or patches.

To support touch and mouse. See 5.2.

To make it abstract. The behavioral code should be separate from the driver code, just like in the Linux kernel. This makes the code simpler and easier to hack. See 4.1.

2 The problems with modern window management

2.1 Window managers are everywhere

More precisely, there are too many window managers contending for your attention. In this section, I will describe a couple of window managers.

Firefox is a window manager with support for both workspaces and tabs. It is a special window manager, in that it can only display one window at a time.



keystroke         action
Alt-<num>         switch to tab <num>
Ctrl-Tab          switch to next tab
Ctrl-Shift-Tab    switch to previous tab
Ctrl-`            switch to next workspace
Ctrl-Shift-`      switch to previous workspace


Most terminal emulators are also window managers. These are the default tab-related keybindings for gnome-terminal:



keystroke        action
Alt-<num>        switch to tab <num>
Shift-<right>    switch to next tab
Shift-<left>     switch to previous tab


Some urxvt and xterm users keep Screen/Tmux running in their local terminal sessions. Others might use tabbed for grouping multiple terminals in one X window. Here are some of the keybindings for switching buffers:





screen          tmux            tabbed          action
Ctrl-a <num>    Ctrl-b <num>    Ctrl-<num>      switch to window <num>
Ctrl-a n        Ctrl-b n        Ctrl-Shift-l    switch to next window
Ctrl-a p        Ctrl-b p        Ctrl-Shift-h    switch to previous window




Emacs and Vim are both window managers. Vim tends to use the control key, while Emacs tends to use both control and alt. It is worth noting that they are both pretty shitty window managers for X, but less shitty for text. TODO explain...

2.2 Conflicting and numerous keybindings

Switching tabs is fundamentally related to managing windows. There appears to be an unofficial attempt to standardize notebook (tab) keybindings, with Alt-<num> being the de facto standard for switching to tab <num>, but this conflicts with most tiling window managers, which tend to use that same pattern for switching tags. Firefox and all tabbed GNOME applications use Alt-<num>, as do Xmonad, Awesome, DWM, i3, and other tiling window managers. This is an example of conflicting keybindings.

Changing window focus is an action that tends to be mapped to completely different keybindings across different programs. This means that I will always be able to switch windows no matter where I am, but I also have to memorize multiple keybindings for switching windows based on the context. This is a problem of numerous keybindings, and there are so many of them that I can hardly keep track.
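To make the two problems concrete, here is a minimal sketch in Python that loads a few of the bindings from the tables in 2.1 and reports both kinds of trouble. The function names are made up for this illustration, and the "dwm" entry is an assumption: the text only says that tiling window managers use the Alt-<num> pattern for switching tags.

```python
from collections import defaultdict

# Bindings copied from the tables in section 2.1. The "dwm" entry is an
# assumption for illustration only.
BINDINGS = {
    "firefox": {
        "Alt-<num>": "switch to tab <num>",
        "Ctrl-Tab": "switch to next tab",
        "Ctrl-Shift-Tab": "switch to previous tab",
    },
    "gnome-terminal": {
        "Alt-<num>": "switch to tab <num>",
        "Shift-<right>": "switch to next tab",
        "Shift-<left>": "switch to previous tab",
    },
    "tmux": {
        "Ctrl-b <num>": "switch to window <num>",
        "Ctrl-b n": "switch to next window",
        "Ctrl-b p": "switch to previous window",
    },
    "dwm": {
        "Alt-<num>": "switch to tag <num>",
    },
}


def conflicting_keys(bindings):
    """Conflicting keybindings: keystrokes claimed by more than one program."""
    owners = defaultdict(list)
    for program, keymap in bindings.items():
        for key in keymap:
            owners[key].append(program)
    return {key: progs for key, progs in owners.items() if len(progs) > 1}


def keys_per_action(bindings, word):
    """Numerous keybindings: every keystroke whose action mentions `word`."""
    return sorted({key
                   for keymap in bindings.values()
                   for key, action in keymap.items()
                   if word in action})


print(conflicting_keys(BINDINGS))
# {'Alt-<num>': ['firefox', 'gnome-terminal', 'dwm']}
print(keys_per_action(BINDINGS, "next"))
# ['Ctrl-Tab', 'Ctrl-b n', 'Shift-<right>']
```

One keystroke is claimed by three programs, and the single idea of "go to the next thing" already needs three different keystrokes across only three programs.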

A non-technical observer might ask why all of this has to be so inconsistent, and an informed computer scientist might offer some history and a technical excuse along the lines of, “we would have to change many things to fix it, but it doesn’t bother me so you should just get used to it”. One of my goals is to convince the reader that fixing the windowing system is a worthwhile endeavor.

TODO: describe uniform/consistent interfaces. Aqua (Ctrl-w and Ctrl-q), Emacs OS (email, irc, calendar, todo list, and text editor)

3 Discussion

3.1 Related projects

3.1.1 wmii

Wmii is a tiling window manager (from suckless) that is designed to mimic many aspects of the acme text window system. It provides a way to group windows together and indicates groups by stacking window titles.

TODO: figure for stacked window titles.

Wmii has apparently been abandoned by suckless, in favor of the much simpler DWM and helper programs (dmenu and tabbed). Modularity and simplicity are very admirable, but I think that grouping windows is necessarily part of the window manager. Window groups in wmii have inspired the creation of the i3 window manager, which takes this idea to the extreme and offers multiple methods for grouping windows in arbitrarily deep hierarchies.

3.1.2 tabbed

TODO

3.1.3 i3

TODO

3.1.4 StumpWM / DSWM

TODO

keybindings, “emacs integration”

3.1.5 Xmonad

TODO

keystroke stringing with mkKeymap.

3.2 Communication between windows, is it necessary?

In programs like Firefox, tabs are somewhat independent of each other. To exemplify this, tabs can sometimes be detached from the parent window, spawning a new window containing just the one tab you detached.


+-----------------+       +-----------------+  +-----------------+  
| A | B | C |     |       | B | C |         |  | A |             |  
|-----------------|       |-----------------|  |-----------------|  
|                 |       |                 |  |                 |  
|                 |  ==>  |                 |  |                 |  
|        A        |       |        B        |  |        A        |  
|                 |       |                 |  |                 |  
|                 |       |                 |  |                 |  
+-----------------+       +-----------------+  +-----------------+


Figure 1: Detaching tab A from the master window.


Conceptually, this feature suggests a top level grouping of independent windows rather than a master window with a couple of lowly panels. This is an important distinction because it will help us determine which programs are good candidates for our window manager.

However, not all tabs are so independent; communication between tabs does sometimes occur. Spawning a new gnome-terminal from an existing gnome-terminal will cause the second terminal to automatically cd to the working directory of the first terminal. From any Firefox tab, pressing Ctrl-Shift-T will reopen a previously closed tab. There are commands in Emacs which do operations on all opened buffers, such as the multi-occur command, which basically greps multiple open buffers. Also, any time a new keybinding or Elisp function/symbol is defined, it takes effect across all buffers. Perhaps the most obvious form of communication between Emacs-managed windows is the fact that they share the same set of buffers (see 3.3 for discussion on shared buffers).

3.3 Shared buffers


+---------------------+                 +---------------------+  
|          |          |                 |          |          |  
|          |    B     |                 |          |    B     |  
|          |          |   C-x <right>   |          |          |  
|    A     |----------|  ============>  |    B     |----------|  
|          |          |                 |          |          |  
|          |    C     |                 |          |    C     |  
|          |          |                 |          |          |  
+---------------------+                 +---------------------+


Figure 2: Switching buffers inside of a window (Emacs).


Note that buffer B appears in two windows at the same time! This is not normal behavior for X window managers, but common among text editors.
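The behavior in Figure 2 falls out naturally if windows hold references to buffers rather than owning them. The following is a minimal sketch in Python under that assumption; the window names ("left", "top-right", "bottom-right") and the helper functions are hypothetical and only mirror the figure, not Emacs internals.

```python
# Buffers exist independently of any window.
buffers = {"A": "text of A", "B": "text of B", "C": "text of C"}

# Each window stores the *name* of the buffer it shows, not a copy of it.
# This is the left-hand layout of Figure 2.
windows = {"left": "A", "top-right": "B", "bottom-right": "C"}


def show(window, buffer_name):
    """Point a window at an existing buffer."""
    if buffer_name not in buffers:
        raise KeyError(buffer_name)
    windows[window] = buffer_name


def windows_showing(buffer_name):
    """All windows currently displaying the given buffer."""
    return sorted(w for w, b in windows.items() if b == buffer_name)


show("left", "B")                # the C-x <right> step in Figure 2
print(windows_showing("B"))      # ['left', 'top-right']

buffers["B"] += " (edited)"      # one edit, visible through both windows
print([buffers[windows[w]] for w in windows_showing("B")])
```

An X window manager has no equivalent of `show`: each client window is mapped exactly once, so the same content cannot appear in two places without help from the application.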

3.4 Formalization of buffers

For the purpose of writing this article, I will define some terms related to buffers and windows. The table below outlines what I think are the three fundamental components of window management.



TB    terminal buffer
CB    configuration buffer
NB    notebook buffer


Examples of each are listed:

Terminal buffers (TB): Emacs, Firefox, and gnome-terminal are terminal buffers w.r.t. the X11 window manager. Emacs buffers are terminal buffers w.r.t. Emacs. Web sites are terminal buffers w.r.t. Firefox.

Configuration buffers (CB): Vertical split, horizontal split, etc.


+-----------------+  +-----------------+  +-----------------+  
|                 |  |        |        |  |        |        |  
|                 |  |        |        |  |        |   B    |  
|                 |  |        |        |  |        |        |  
|        A        |  |   A    |   B    |  |   A    |--------|  
|                 |  |        |        |  |        |        |  
|                 |  |        |        |  |        |   C    |  
|                 |  |        |        |  |        |        |  
+-----------------+  +-----------------+  +-----------------+


Figure 3: Examples of configuration buffers (CB).


Notebook buffers (NB): Tabs, basically.


+-----------------+  +-----------------+  +-----------------+  
| A | B | C |     |  | A | B | C |     |  | A | B | C |     |  
|-----------------|  |-----------------|  |-----------------|  
|                 |  |                 |  |                 |  
|                 |  |                 |  |                 |  
|        A        |  |        B        |  |        C        |  
|                 |  |                 |  |                 |  
|                 |  |                 |  |                 |  
+-----------------+  +-----------------+  +-----------------+


Figure 4: Examples of notebook buffers (NB).


In many window managers, multiple types of buffers are supported, often nested and with a fixed hierarchical structure. Here are some observations:




Figure 5: Applying our definitions to some window managers. (“buffer structures”)


The Firefox window manager cannot do vertical/horizontal split, so its corresponding buffer structure does not contain a configuration buffer (CB) node. Emacs, however, supports vertical split at the very top level, so it has a CB node at the root.

Now, imagine what happens when you are running Firefox and Emacs side-by-side inside of DWM:




Figure 6: Example of Buffer structure expansion.


In this example scenario, since Firefox/Emacs act as TB w.r.t. DWM, the TB of DWM is replaced with the roots of the Firefox/Emacs buffer structures.
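The expansion can also be written down as a small tree operation. The sketch below is in Python and rests on assumptions: the `Node` class, the helper names, and the exact shapes given to Firefox, Emacs, and DWM are hypothetical. They are chosen only to agree with the text (no CB node in Firefox, a CB node at the root of Emacs) and are not copied from the figures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # "TB", "CB", or "NB"
    label: str = ""
    children: List["Node"] = field(default_factory=list)


def contains(node, kind):
    """True if any node in the structure has the given kind."""
    return node.kind == kind or any(contains(c, kind) for c in node.children)


def expand(node, structures):
    """Replace each TB whose label names a program with that program's root."""
    if node.kind == "TB" and node.label in structures:
        return structures[node.label]
    return Node(node.kind, node.label,
                [expand(c, structures) for c in node.children])


def outline(node, depth=0):
    """Indented, one-line-per-node rendering of a structure."""
    lines = ["  " * depth + f"{node.kind} {node.label}".strip()]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical shapes, consistent with the text but not taken from Figure 5.
firefox = Node("NB", "tabs", [Node("TB", "web site")])
emacs = Node("CB", "splits", [Node("TB", "emacs buffer")])
dwm = Node("CB", "tiling", [Node("TB", "firefox"), Node("TB", "emacs")])

expanded = expand(dwm, {"firefox": firefox, "emacs": emacs})
print("\n".join(outline(expanded)))
# CB tiling
#   NB tabs
#     TB web site
#   CB splits
#     TB emacs buffer
```

Seen from DWM alone the structure is two levels deep; after expansion the user is really navigating three levels, each with its own keybindings.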

4 Design Philosophy

4.1 Abstract

This window manager should not be monolithic. It should have at least two parts: behavioral and driver code. People should be able to hack on the behavioral code without understanding the XCB API. Theoretically, such a window manager could support both X and Wayland, and even Windows. Such an abstraction layer would necessarily only support a subset of

A very simple window manager such as DWM can afford to be monolithic because it has a small code base, but I plan on adding more functionality than what DWM supports. This isn’t to say that I think DWM is too simple—in fact, I use DWM and truly believe that it is “complete”—but I plan on shifting around the responsibilities of managing windows, so I will be creating a context in which DWM will be insufficient.
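As a rough illustration of the split, the sketch below (Python, for illustration only) puts everything display-specific behind a tiny interface and writes the layout logic against that interface alone. The `Driver` interface, its two methods, and `FakeDriver` are all assumptions made up for this example; a real driver would wrap XCB or a Wayland compositor library and would need far more than two calls.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical boundary: display-server specifics live behind this."""

    @abstractmethod
    def list_windows(self):
        """Return the identifiers of the managed windows."""

    @abstractmethod
    def move_resize(self, window, x, y, width, height):
        """Place a window at the given geometry."""


class FakeDriver(Driver):
    """Stand-in driver that records calls instead of talking to a display."""

    def __init__(self, windows):
        self.windows = list(windows)
        self.calls = []

    def list_windows(self):
        return list(self.windows)

    def move_resize(self, window, x, y, width, height):
        self.calls.append((window, x, y, width, height))


def tile_columns(driver, screen_width, screen_height):
    """Behavioral code: equal-width columns, using only the Driver interface."""
    windows = driver.list_windows()
    if not windows:
        return
    width = screen_width // len(windows)
    for i, window in enumerate(windows):
        driver.move_resize(window, i * width, 0, width, screen_height)


driver = FakeDriver(["st", "surf"])
tile_columns(driver, 1280, 720)
print(driver.calls)
# [('st', 0, 0, 640, 720), ('surf', 640, 0, 640, 720)]
```

Because `tile_columns` never touches the display directly, it can be hacked on and tested without a running X server, which is the point of keeping the two parts apart.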

4.2 Hackable

I plan on writing the behavioral part in Guile Scheme. As in Emacs, most actions will be named commands, and various keybindings and mouse gestures will be mapped to those named commands. Like StumpWM, there will be a command input line which can be used to evaluate Scheme.
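The named-command idea can be sketched in a few lines. The example below uses Python purely for illustration (the plan above is Guile Scheme), and every name in it is hypothetical: the `focus-next` command, the `Alt-j` binding, and the `swipe-left` gesture are placeholders. The `run` function also only looks up a command by name, which is a simplification of a command line that evaluates arbitrary Scheme.

```python
COMMANDS = {}   # command name -> function
KEYMAP = {}     # keystroke or gesture -> command name


def command(name):
    """Decorator that registers a function under a command name."""
    def register(func):
        COMMANDS[name] = func
        return func
    return register


def bind(trigger, name):
    """Map a keystroke or mouse gesture to a named command."""
    KEYMAP[trigger] = name


def run(name, state):
    """What a command input line might do with a typed command name."""
    return COMMANDS[name](state)


def handle(trigger, state):
    """Dispatch a keystroke or gesture through the keymap."""
    return run(KEYMAP[trigger], state)


@command("focus-next")
def focus_next(state):
    state["focus"] = (state["focus"] + 1) % len(state["windows"])
    return state["windows"][state["focus"]]


bind("Alt-j", "focus-next")         # hypothetical keybinding
bind("swipe-left", "focus-next")    # hypothetical gesture

state = {"windows": ["st", "surf", "emacs"], "focus": 0}
print(handle("Alt-j", state))        # surf
print(handle("swipe-left", state))   # emacs
print(run("focus-next", state))      # st
```

Keyboard, mouse, and command line all end up at the same named command, so rebinding an input never requires touching the behavior itself.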

5 Implementation

5.1 Modes

Just like Emacs, there will be major and minor modes. First, let me describe the i3 window manager: i3 is a very flexible and generic window manager that allows users to construct arbitrarily complex workspaces by nesting i3 windows inside of each other. Internally, this is stored as a tree structure (there are some screencasts that show the tree in real time [?]), but I think this is needlessly confusing to anybody who doesn’t have infinite brain RAM.

A major mode can be applied to a subtree, which will require the nodes in that subtree to behave a certain way. For example, web-browser-mode might default to top-level tabs (instead of vertical-split or stack) and might attempt to catch any newly opened web browser windows. text-mode might be my reimplementation of Emacs! Minor modes would actually be global modes like bar-mode and mouse-mode (see 5.2). Compared to i3, this method is more complex to implement, but I suspect it will be less complex for the user.
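One possible reading of "a major mode applied to a subtree" is that a node inherits the mode of its nearest ancestor that has one. The sketch below shows that lookup in Python. It is only an assumption about how modes might resolve: the `Node` class, the tree layout, and the fallback name `fundamental-mode` (borrowed from Emacs) are not part of the design described here.

```python
class Node:
    """A node in the window tree, optionally carrying a major mode."""

    def __init__(self, name, major_mode=None, children=()):
        self.name = name
        self.major_mode = major_mode
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def effective_mode(node, default="fundamental-mode"):
    """Nearest major mode set on the node itself or on one of its ancestors."""
    while node is not None:
        if node.major_mode is not None:
            return node.major_mode
        node = node.parent
    return default


tab_a = Node("tab A")
tab_b = Node("tab B")
browser = Node("browser", major_mode="web-browser-mode", children=[tab_a, tab_b])
editor = Node("editor", major_mode="text-mode", children=[Node("notes")])
terminal = Node("terminal")
root = Node("root", children=[browser, editor, terminal])

print(effective_mode(tab_b))      # web-browser-mode
print(effective_mode(terminal))   # fundamental-mode
```

The user then reasons about a handful of named subtrees ("the browser", "the editor") instead of the full tree that i3 exposes.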

5.2 Mouse

Normally, no window decorations (besides thin borders) or panels are drawn. bar-mode should toggle the visibility of the bar. By default, there are two ways to expose the bar: 1) invoke bar-mode from the window manager command input, and 2) bump the mouse on the top edge of the screen to temporarily enable bar-mode, then disable it when the mouse returns.

Bump the upper left corner of the screen to enable mouse-mode. In mouse-mode, the following things are true:



EOF


last updated: 2014-06-07
CC BY Troy Sankey