All Window Managers Suck
My take on today’s desktop computer interfaces
I hate inefficient computer interfaces. I run programs on my computer so I can use them, not manage their layout. Managing windows is mindless overhead, so the computer should help take care of that overhead. Fortunately, there is some pretty neat software that does exactly that, but I think these programs conflict with each other and create a confusing, unintuitive, hostile environment even for enthusiasts!
I’m always confused when I have to switch contexts. Computers, in particular, inflict multiple different paradigms onto me, all the time, at the same time. I believe that a significant amount of context switching can be avoided by defining whose responsibility it is to manage windows. The job of window management is truly overloaded: there should only be one program that does it well [?].
Firefox and Emacs are among the worst offenders, but they came into existence due to either a lack of features in the (X11) window manager, or a lack of a window manager altogether. Emacs, by itself, is sufficient for almost all day-to-day tasks: fetching and sending email, browsing the web, messaging friends, and editing text. However, living entirely inside Emacs rules out the majority of applications: those that are not built on top of Emacs and rely on widget toolkits (GTK, Qt, etc.).
To create a desktop environment. I want to create a window manager, and define a set of graphical tools that go well together. Many of these tools might need to communicate with the window manager so I will need to build some of these tools; thanks to the open source community (especially the suckless community), I can start with high quality, modular, graphical software (st, surf, etc.) and piece them together. I also plan on taking some bloated graphical software (Emacs, Vim, etc.) and stripping them down via plugins and/or patches.
To support touch and mouse. See 5.2.
To make it abstract. The behavioral code should be separate from the driver code, just like in the Linux kernel. This makes the code simpler and easier to hack. See 4.1.
More precisely, there are too many window managers contending for your attention. In this section, I will describe a couple of window managers.
Firefox is a window manager with support for both workspaces and tabs. It is a special window manager, in that it can only display one window at a time.
| Keybinding | Action |
|---|---|
| Alt-<num> | switch to tab <num> |
| Ctrl-Tab | switch to next tab |
| Ctrl-Shift-Tab | switch to previous tab |
| Ctrl-` | switch to next workspace |
| Ctrl-Shift-` | switch to previous workspace |
Most terminal emulators are also window managers. These are the default tab-related keybindings for gnome-terminal:
| Keybinding | Action |
|---|---|
| Alt-<num> | switch to tab <num> |
| Shift-<right> | switch to next tab |
| Shift-<left> | switch to previous tab |
Some urxvt and xterm users keep Screen/Tmux running in their local terminal sessions. Others might use tabbed for grouping multiple terminals in one X window. Here are some of the keybindings for switching buffers:
| Screen | Tmux | tabbed | Action |
|---|---|---|---|
| Ctrl-a <num> | Ctrl-b <num> | Ctrl-<num> | switch to window <num> |
| Ctrl-a n | Ctrl-b n | Ctrl-Shift-l | switch to next window |
| Ctrl-a p | Ctrl-b p | Ctrl-Shift-h | switch to previous window |
Emacs and Vim are both window managers. Vim tends to use the control key, while Emacs tends to use both control and alt. It is worth noting that they are both pretty shitty window managers for X, but less shitty for text. TODO explain...
Switching tabs is fundamentally related to managing windows. There appears to be an unofficial attempt to standardize notebook keybindings, with Alt-<num> being the de facto standard for switching to tab <num>, but this conflicts with most tiling window managers, which tend to use that same pattern for switching tags. Firefox and all tabbed GNOME applications use Alt-<num>, as do Xmonad, Awesome, DWM, i3, and other tiling window managers. This is an example of conflicting keybindings.
Changing window focus is an action that tends to be mapped to completely different keybindings across different programs. This means that I will always be able to switch windows no matter where I am, but I also have to memorize multiple keybindings for switching windows based on the context. This is a problem of numerous keybindings, and there are so many of them that I can hardly keep track.
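The two problems above (one key claimed by several programs, and one action spread over several keys) can be made concrete with a small sketch. The bindings are copied from the tables in this section; the action labels `goto-n` and `next` are my own normalization, and `dwm` simply stands in for the tiling window managers that use Alt-<num> for tags.

```python
from collections import defaultdict

# Bindings taken from the tables above. The action names are hypothetical
# labels used only to compare "the same action" across programs.
BINDINGS = {
    "firefox": {"Alt-<num>": "goto-n", "Ctrl-Tab": "next"},
    "gnome-terminal": {"Alt-<num>": "goto-n", "Shift-<right>": "next"},
    "screen": {"Ctrl-a <num>": "goto-n", "Ctrl-a n": "next"},
    "tmux": {"Ctrl-b <num>": "goto-n", "Ctrl-b n": "next"},
    "tabbed": {"Ctrl-<num>": "goto-n", "Ctrl-Shift-l": "next"},
    "dwm": {"Alt-<num>": "goto-n"},
}


def conflicting_keys(bindings):
    """Return keys that more than one program wants for itself."""
    owners = defaultdict(list)
    for program, keymap in bindings.items():
        for key in keymap:
            owners[key].append(program)
    return {key: progs for key, progs in owners.items() if len(progs) > 1}


def keys_per_action(bindings):
    """Return every distinct key used for the same abstract action."""
    keys = defaultdict(set)
    for keymap in bindings.values():
        for key, action in keymap.items():
            keys[action].add(key)
    return dict(keys)


if __name__ == "__main__":
    print("conflicting:", conflicting_keys(BINDINGS))
    for action, keys in keys_per_action(BINDINGS).items():
        print(action, "->", sorted(keys))
```

Even with this handful of programs, a single key is contested by three of them, and "go to the next thing" needs a different chord almost everywhere.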
A non-technical observer might ask why all of this has to be so inconsistent, and an informed computer scientist might offer some history and a technical excuse along the lines of, “we would have to change many things to fix it, but it doesn’t bother me so you should just get used to it”. One of my goals is to convince the reader that fixing the windowing system is a worthwhile endeavor.
TODO: describe uniform/consistent interfaces. Aqua (Ctrl-w and Ctrl-q), Emacs OS (email, irc, calendar, todo list, and text editor)
Wmii is a tiling window manager (from suckless) that is designed to mimic many aspects of the acme text window system. It provides a way to group windows together and indicates groups by stacking window titles.
TODO: figure for stacked window titles.
Wmii has apparently been abandoned by suckless, in favor of the much simpler DWM and helper programs (dmenu and tabbed). Modularity and simplicity are very admirable, but I think that grouping windows is necessarily part of the window manager. Window groups in wmii have inspired the creation of the i3 window manager, which takes this idea to the extreme and offers multiple methods for grouping windows in arbitrarily deep hierarchies.
keybindings, “emacs integration”
keystroke stringing with mkKeymap.
In programs like Firefox, tabs are somewhat independent of each other. To exemplify this, tabs can sometimes be detached from the parent window, spawning a new window containing just the one tab you detached.
Conceptually, this feature suggests a top level grouping of independent windows rather than a master window with a couple of lowly panels. This is an important distinction because it will help us determine which programs are good candidates for our window manager.
However, not all tabs are so independent; communication between tabs does sometimes occur. Spawning a new gnome-terminal from an existing gnome-terminal will cause the second terminal to automatically cd to the working directory of the first terminal. From any Firefox tab, pressing Ctrl-Shift-T will reopen a previously closed tab. There are commands in Emacs which operate on all open buffers, such as the multi-occur command, which basically greps multiple open buffers. Also, any time a new keybinding or Elisp function/symbol is defined, it takes effect across all buffers. Perhaps the most obvious form of communication between Emacs-managed windows is the fact that they share the same set of buffers (see 3.3 for discussion on shared buffers).
Note that buffer B appears in two windows at the same time! This is not normal behavior for X window managers, but common among text editors.
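To illustrate what it means for one buffer to appear in two windows, here is a minimal sketch. The `Buffer` and `Window` classes and the window names are hypothetical; the only point is that a window holds a reference to a buffer rather than a copy of it.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    text: str = ""


@dataclass
class Window:
    name: str
    buffer: Buffer  # a reference to a shared buffer, not a private copy


def windows_showing(buffer, windows):
    """Return the windows that currently display exactly this buffer."""
    return [w for w in windows if w.buffer is buffer]


a = Buffer("A")
b = Buffer("B")
left = Window("left", b)
right = Window("right", b)
bottom = Window("bottom", a)

# Editing the buffer once changes what both windows display.
b.text = "edited once"

if __name__ == "__main__":
    for w in windows_showing(b, [left, right, bottom]):
        print(w.name, "shows", w.buffer.name, "->", w.buffer.text)
```

An X window manager has no equivalent of this: each client window is shown in exactly one place, so the sharing has to live inside the application.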
For the purpose of writing this article, I will define some terms related to buffers and windows. The table below outlines what I think are the three fundamental components of window management.
Examples of each are listed:
Terminal buffers (TB): Emacs, Firefox, and gnome-terminal are terminal buffers w.r.t. the X11 window manager. Emacs buffers are terminal buffers w.r.t. Emacs. Web sites are terminal buffers w.r.t. Firefox.
Configuration buffers (CB): Vertical split, horizontal split, etc.
Notebook buffers (NB): Tabs, basically.
In many window managers, multiple types of buffers are supported, often nested and with a fixed hierarchical structure. Here are some observations:
The Firefox window manager cannot do vertical/horizontal split, so its corresponding buffer structure does not contain a configuration buffer (CB) node. Emacs, however, supports vertical split at the very top level, so it has a CB node at the root.
Now, imagine what happens when you are running Firefox and Emacs side-by-side inside of DWM:
In this example scenario, since Firefox/Emacs act as TB w.r.t. DWM, the TB of DWM is replaced with the roots of the Firefox/Emacs buffer structures.
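The substitution described above can be sketched as a small tree operation. The exact shapes of the three trees are my assumptions for illustration (Firefox as workspaces containing tabs, Emacs as a split at the root, DWM as tags containing a tiled layout), and `Node`, `graft`, and `outline` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str  # "TB" (terminal), "CB" (configuration), or "NB" (notebook)
    label: str
    children: List["Node"] = field(default_factory=list)


def graft(root, label, subtree):
    """Return a copy of root with the TB node named `label` replaced by subtree."""
    if root.kind == "TB" and root.label == label:
        return subtree
    return Node(root.kind, root.label,
                [graft(child, label, subtree) for child in root.children])


def outline(node, depth=0):
    """Render the tree as indented lines, one node per line."""
    lines = ["  " * depth + f"{node.kind} {node.label}"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Assumed structures: Firefox has no CB node, Emacs has a CB at its root.
firefox = Node("NB", "workspaces", [Node("NB", "tabs", [Node("TB", "web site")])])
emacs = Node("CB", "split", [Node("TB", "buffer A"), Node("TB", "buffer B")])
dwm = Node("NB", "tags", [Node("CB", "tile", [Node("TB", "Firefox"),
                                              Node("TB", "Emacs")])])

combined = graft(graft(dwm, "Firefox", firefox), "Emacs", emacs)

if __name__ == "__main__":
    print("\n".join(outline(combined)))
```

The combined tree has notebook nodes at three different depths, each owned by a different program with its own keybindings, which is exactly the mess this article complains about.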
This window manager should not be monolithic. It should have at least two parts: behavioral and driver code. People should be able to hack on the behavioral code without understanding the XCB API. Theoretically, such a window manager could support both X and Wayland, and even Windows. Such an abstraction layer would necessarily only support a subset of
A very simple window manager such as DWM can afford to be monolithic because it has a small code base, but I plan on adding more functionality than what DWM supports. This isn’t to say that I think DWM is too simple—in fact, I use DWM and truly believe that it is “complete”—but I plan on shifting around the responsibilities of managing windows, so I will be creating a context in which DWM will be insufficient.
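As a rough sketch of the behavioral/driver split, the behavioral code below only ever talks to a small driver interface. Everything here is hypothetical: the `Driver` methods are not taken from XCB or any real API, and `RecordingDriver` is a stand-in that records calls instead of talking to a display server.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical driver interface; a real one would wrap XCB, Wayland, etc."""

    @abstractmethod
    def screen_size(self):
        """Return (width, height) of the screen in pixels."""

    @abstractmethod
    def move_resize(self, window_id, x, y, width, height):
        """Place a window at (x, y) with the given size."""


class RecordingDriver(Driver):
    """Fake driver that records requests, useful for hacking on behavior."""

    def __init__(self, width, height):
        self._size = (width, height)
        self.calls = []

    def screen_size(self):
        return self._size

    def move_resize(self, window_id, x, y, width, height):
        self.calls.append((window_id, x, y, width, height))


def tile_columns(driver, window_ids):
    """Behavioral code: lay windows out in equal-width columns."""
    if not window_ids:
        return
    width, height = driver.screen_size()
    column = width // len(window_ids)
    for index, window_id in enumerate(window_ids):
        driver.move_resize(window_id, index * column, 0, column, height)


if __name__ == "__main__":
    driver = RecordingDriver(1920, 1080)
    tile_columns(driver, [1, 2])
    print(driver.calls)
```

The layout function never imports anything display-specific, so swapping the fake driver for a real one would not require touching it.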
I plan on writing the behavioral part in Guile Scheme. As in Emacs, most actions will be named commands, and various keybindings and mouse gestures will be mapped to those named commands. Like StumpWM, there will be a command input line which can be used to evaluate Scheme.
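The named-command idea can be sketched in a few lines. This is an illustration in Python rather than the planned Guile Scheme, and the command names, key names, and gesture name are all made up. Running a command by name stands in for the command input line; it does not evaluate arbitrary code.

```python
COMMANDS = {}  # command name -> function taking the window manager state
KEYMAP = {}    # key chord or gesture name -> command name


def command(name):
    """Decorator that registers a function as a named command."""
    def register(func):
        COMMANDS[name] = func
        return func
    return register


def bind(key, name):
    """Map a key chord or gesture to an existing named command."""
    if name not in COMMANDS:
        raise KeyError(f"unknown command: {name}")
    KEYMAP[key] = name


def run_command(name, state):
    """What a command input line would do with a typed command name."""
    COMMANDS[name](state)


def handle_key(key, state):
    """Dispatch an input event; return False if nothing is bound to it."""
    name = KEYMAP.get(key)
    if name is None:
        return False
    run_command(name, state)
    return True


@command("focus-next")
def focus_next(state):
    state["focused"] = (state["focused"] + 1) % len(state["windows"])


@command("focus-previous")
def focus_previous(state):
    state["focused"] = (state["focused"] - 1) % len(state["windows"])


# Hypothetical bindings: two keys and one touch gesture.
bind("Mod-j", "focus-next")
bind("Mod-k", "focus-previous")
bind("swipe-left", "focus-next")

if __name__ == "__main__":
    state = {"windows": ["st", "surf", "emacs"], "focused": 0}
    handle_key("Mod-j", state)
    run_command("focus-previous", state)
    print(state["windows"][state["focused"]])
```

Because keys and gestures both resolve to the same command names, rebinding never touches the command implementations.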
Just like Emacs, there will be major and minor modes. First, let me describe the i3 window manager: i3 is a very flexible and generic window manager that allows users to construct arbitrarily complex workspaces by nesting i3 windows inside of each other. Internally, this is stored as a tree structure (there are some screencasts that show the tree in real time [
Normally, no window decorations (besides thin borders) or panels are drawn.
Bump the upper left corner of the screen to enable