The X files: Window managers, desktop environments and widget libraries
Written by and © Piotr Gawrysiak in April 2005.
I must admit that the main inspiration for this short article was Marcin Wichary’s
excellent Graphical User Interface Gallery. It is a great resource for anyone
interested in usability and GUI design. There are not many things in this world that are
perfect, and – alas – Marcin’s GUIdebook is no exception. It gives a
very good overview of interfaces closely integrated with operating systems, such as
Mac OS, Amiga’s Workbench or (if integration is not an overstatement here) various
flavours of Windows. When it comes to Unix and Linux GUIs, however, it was – at least
at the time of writing this article – a disaster. It covered only GNOME, and treated
this desktop environment as if it were the system’s only possible interface.
At first I wanted to send Marcin a bitter letter bashing him for his ignorance of
Linux desktop environments, X window managers and toolkits, but then I realized that
I really do not have any right to do so. My knowledge about free OS GUIs is still
extremely limited and lots of my colleagues who use Linux at their workplace similarly
have no idea how their computer desktops really work.
Granted, all the internals of the X Window environment that we are using are described
in detail in documentation, including various howtos and tutorials, such as
The introduction to X11 User Interfaces or
X Window System Architecture Overview HOWTO.
However, as even a short browse through the archived OS News
comments shows, not many people bother to read them. I decided therefore to distil
the documentation’s contents and briefly try to explain what these strange
beasts called window managers, desktop environments and toolkits are – and why
comparing, for example, Qt to GNOME to Blackbox (as is sometimes seen on discussion
fora), does not really make sense. Perhaps you will find it useful, or
at least entertaining...
A bit of history
If you are sitting in front of a Windows box you do not really have to bother with the
history of your graphical environment. Every bit of this GUI was created by Microsoft,
and either inspired by, or stolen from, the Macintosh interface – depending on your
IT religion preference (if you are a Mac OS user the story is even simpler, at least
as long as no one mentions Xerox Star or starts talking about Jef Raskin –
at which point you really should consult GUIdebook). However, if you happen to be
staring at the GUI of some Unix-ish system, you should be aware that even simple
questions, such as “who created the first version of my GUI?”, cannot be
answered that easily.
Unix – and consequently Linux – is primarily an operating system without a
user interface, although this is not the place to substantiate this statement in detail.
I will only remark that there is much truth in the almost proverbial saying
“Linux is just a kernel.” Like other systems that Unix begot, Linux can even work
as a purely embedded system that does not communicate with humans at all. Even the
simplest interface available for Linux – the command line – is provided not by one
standard system application, but by a variety of shells with different functionality
and behaviour (although bash – the Bourne Again Shell – is undoubtedly the most
common). The shells, one has to add, are normal applications and not some special
system processes or tasks.
There were several attempts to provide Unix systems with a WIMP-type GUI in the
eighties. These included experiments with using PostScript as a rendering engine for
on-screen graphics (as seen in the NeXTSTEP operating system) or even interfacing Unix and
Mac OS (as in A/UX). The most successful, however, was a windowing system called X,
which originated at MIT in 1984 in connection with Project Athena (it is currently
maintained by an independent organization, X.org). The system is named just X,
or more properly the X Window System, even though many people prefer to call it
X Windows – unnerving many Microsoft haters, for obvious reasons.
X Window system
Naturally, every computer system that wants to have a GUI needs a way to display
pixels on the screen and receive input from the users in order to be able to interact
with them. These basic functions are fulfilled by the X Window System.
The system itself has been designed as a so-called client-server architecture. This
basically means that its functionality has been split into two parts that communicate using
network protocols and thus can reside on separate machines – although nowadays, in
most cases, they simply run as different processes on a single computer.
The server part is responsible for doing all the drawing on the end user’s screen and
thus, in a networked, distributed setup, has to run on the user’s workstation.
Obviously, if the machine in question is a PC, it has to be
able to communicate with graphics hardware and consequently must be equipped with
appropriate drivers.
The other part is the client – or rather many clients, because a client is
simply an application. Note that in our example the applications (X server clients)
are not running on the end user workstation but on a server machine. This naming
may be (and usually is) quite confusing, but in fact it makes perfect sense – as
far as graphics operations are concerned, the user screen is most important, and
is being managed by a process that performs drawing operations at clients’ requests.
The networked nature of the X Window System is a double-edged sword. Obviously it introduces
additional processing overhead on top of what is required to actually perform
graphics operations – therefore an X Window System-based GUI might be slower
than a monolithic system. Additionally, mostly due to the separation of client
applications from the graphics hardware, many features taken for granted in traditional GUI
systems are only now being addressed (such as intelligent management of screen
redraws – properly handled by the relatively recent XDamage extension). On the
other hand it makes the entire system very flexible, and especially suitable for
computationally intensive tasks. It is very easy to have several X applications running
on many different machines and still displaying their windows on one monitor in
a seamless way – quite an arcane (albeit possible) task for Windows systems.
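A small sketch may make this network transparency more concrete. By convention, an X client learns which server to talk to from a display name of the form host:display.screen, usually taken from the DISPLAY environment variable (a detail the text above does not go into), and the server for display n conventionally listens on TCP port 6000 + n. The Python below only parses such names. The helper names are invented for illustration, and IPv6 addresses and local socket forms are deliberately left out.

```python
import re
from typing import NamedTuple


class DisplayName(NamedTuple):
    host: str     # empty string means "the local machine"
    display: int  # which X server on that host
    screen: int   # which screen managed by that server


# Simplified grammar: host ":" display [ "." screen ]
_DISPLAY_RE = re.compile(r"^(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?$")


def parse_display(value: str) -> DisplayName:
    """Split a display name such as 'workstation:0.0' into its parts."""
    match = _DISPLAY_RE.match(value)
    if match is None:
        raise ValueError(f"not a display name this sketch understands: {value!r}")
    # The screen part is optional and defaults to screen 0.
    screen = int(match["screen"] or 0)
    return DisplayName(match["host"], int(match["display"]), screen)


def tcp_port(name: DisplayName) -> int:
    """Conventional TCP port of the X server for the given display."""
    return 6000 + name.display
```

An application started on a remote machine with a display name such as workstation:0 would therefore try to reach the X server on that workstation, which is exactly the "applications elsewhere, windows here" arrangement described above.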
This flexibility extends even further. It is important to remember that X generally
does not do much more than display pixels, lines and bitmaps on the
screen, and route information from keyboard and mouse to applications. It is
therefore only a foundation on top of which a variety of user interfaces are built.
It is an old Unix tradition, dating back to the system’s earliest days, to perform complex
tasks with a set of simple utilities cooperating together – a philosophy (yes!) probably
best explained in Eric Raymond’s book The Art of Unix Programming.
All X-based user interfaces, being faithful citizens of the Unix community, are
therefore composed of a variety of modules cooperating loosely together. Several
versions of these modules – such as widget toolkits, window managers and
desktop environments – are available, and they all come from different vendors,
organizations or even individual programmers.
Lastly, the X server is not particularly dependent on any specific property of Unix
(or Linux, FreeBSD etc.). It is used primarily in these systems, mostly due to the fact
that it was adopted as a standard a long time ago and also because – frankly –
there is no real alternative. There is, however, nothing to prevent the creation of an X
server that is an MS Windows application. In fact such products are available –
the most popular probably being the open source Cygwin/X, accompanied by several commercial
solutions. It is therefore possible to run X applications on remote Linux machines,
and display their windows right on one’s Windows desktop...
Toolkits, widgets, libraries
In theory, an application could communicate with the X server directly, issuing
appropriate commands and sending data directly over a network protocol (such as
TCP/IP). While possible, this would be impractical, as the X protocol is very low-level;
the application should rather rely on a library (Xlib) that provides routines
for communicating with the server. But even with the help
of Xlib, writing applications would be a very laborious task. As I already mentioned,
the X server provides just the most basic functions, such as simple drawing, text output,
and input handling. There are no provisions for displaying scroll bars, combo boxes,
buttons – all the objects commonly called widgets in GUI designers’
lingo must be created (in fact – painted) by the application itself.
Even small applications would therefore contain quite a lot of code just to
provide basic interface functionality. The excellent X11 User Interfaces tutorial
includes a good example: to display a window with some text in it, a
programmer must type in... over 200 lines of C code.
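To give a feeling for just how low-level the wire protocol is, here is a hedged sketch of the very first message an X11 client sends when it connects: a 12-byte header stating its byte order, the protocol version and the lengths of two authorization strings, followed by those strings padded to multiples of four bytes. This is not how applications are normally written (Xlib does it for them), the function names are made up for this illustration, and nothing beyond this opening message is shown.

```python
import struct

X_PROTOCOL_MAJOR = 11  # the "11" in X11
X_PROTOCOL_MINOR = 0


def _pad4(data: bytes) -> bytes:
    """Pad with zero bytes up to the next multiple of four."""
    return data + b"\x00" * (-len(data) % 4)


def connection_setup_request(auth_name: bytes = b"", auth_data: bytes = b"") -> bytes:
    """Build the opening message of an X11 connection (little-endian client)."""
    header = struct.pack(
        "<BxHHHH2x",       # byte order, pad, four 16-bit fields, two pad bytes
        0x6C,              # ASCII 'l': least significant byte first
        X_PROTOCOL_MAJOR,
        X_PROTOCOL_MINOR,
        len(auth_name),    # length of the authorization protocol name
        len(auth_data),    # length of the authorization data
    )
    return header + _pad4(auth_name) + _pad4(auth_data)
```

Every later step (creating a window, selecting fonts, drawing text, reacting to events) is expressed as further binary requests of this kind, which goes some way towards explaining the line count quoted above.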
Such a situation is of course unacceptable for developers, and thus libraries containing
functions for drawing and managing widgets were created almost immediately alongside the
first versions of the X Window System. Most often such libraries are called toolkit
libraries or widget toolkits. I will not describe them in detail – for
examples, visit Wikipedia.
Some properties of toolkit libraries are, however, worth mentioning. First –
their functionality differs greatly. The simplest, such as the original Xt toolkit with
the Athena widget set, do not offer much beyond functions for creating and
managing standard widgets. Modern, advanced toolkits are, however, entire programming
environments, with their own event models, sound, animation, font handling capabilities
or even database access interfaces. The richest toolkits available today are probably Qt
from Trolltech (upon which the KDE desktop environment is based, but which is also used
by a variety of other applications, for example Scribus) and GTK (created for the GIMP
image manipulation program and used primarily by the GNOME desktop environment).
The second property of toolkits, one of tremendous importance for GUI design, is
the appearance of the widgets drawn (or created) by them. I mentioned above that
X does not enforce any standard for the appearance of buttons, menus and windows, and
provides only drawing capabilities. Because of that, every toolkit creates its own
standard, so two X applications created with different toolkits can have a
completely dissimilar look and feel, and in extreme cases might even be unable to
interoperate. It is worth mentioning that, until recently, even rudimentary inter-application
actions, such as copying and pasting non-text contents, were practically impossible in the X world.
The consistency of one’s computer desktop depends therefore on applications. As
long as all launched applications are using the same toolkit, their appearance will
be coherent. It is, however, very difficult to handpick a set of programs that fulfils more
than basic computing needs and has been written entirely with one particular toolkit.
Needless to say, such a situation is unpleasant, so some elementary solutions have
been devised. The most common one is mimicry. Modern toolkits – such as Qt and GTK
mentioned above – are themable. In other words, the appearance of their
widgets can be controlled by the end user, who selects an appropriate “skin” containing
bitmaps depicting button faces etc. (in more complex cases, even the code that actually
draws the widgets on screen). It is therefore possible to use similar-looking
skins in two different toolkits, thus achieving at least visual consistency between
applications.
Window managers
The X server, together with toolkit libraries, allows applications to draw the contents of
their windows on the screen, but it makes no provision for doing anything with
these windows. Actions such as dragging, resizing or minimizing windows, or displaying
window borders and icons, are indeed outside the application developer’s scope of
interest – as are launching applications or drawing the desktop background. In
an X-based system this functionality is provided by a special application, called
a window manager.
From a purely technical point of view, a window manager is not a very complex
application – or rather it does not have to be, if it provides only basic
functions. Therefore, more window managers have been written than you would probably
expect, sometimes differing only in details. Those details are nevertheless important;
remember that it is the window manager that draws window borders and icons, and if these
do not match the style of your widgets, guess what happens to the consistency of your
desktop appearance.
At this point it should probably also be evident to you that the most advanced window
managers are – you guessed it – themable...
The role of a window manager, simple as it may be, can be very important for the
usability of the entire system. The ease of switching between windows and virtual
desktops or launching new applications (and there are almost as many methods as there
are window managers) depends entirely on this component.
This brings us to another consistency problem. As there is no standard for the application
launcher, the applications themselves, or their installers, cannot automatically add
their own entries to it. Now, such a standard has actually been created as part
of the freedesktop.org
effort – but there is no way to force every window manager author to adopt it.
And to give credit where credit is due – it was definitely not Microsoft’s
idea to combine all shortcuts to applications and settings in a single start
menu – many window managers used this approach way back in the days of Windows
3.0 or even before.
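As an illustration of what such a launcher standard can look like, the sketch below reads a small file in the style of the freedesktop.org Desktop Entry format, which is assumed here to be the kind of standard meant above. The sample entry, the application name and the function name are all invented for this example, and only a few keys are handled.

```python
import configparser

# A made-up launcher entry; a real one would be shipped by the application.
SAMPLE_ENTRY = """\
[Desktop Entry]
Type=Application
Name=Example Editor
Name[de]=Beispiel-Editor
Exec=example-editor %U
Icon=example-editor
Categories=Utility;TextEditor;
"""


def parse_desktop_entry(text: str) -> dict:
    """Extract the fields a launcher menu would typically need."""
    parser = configparser.ConfigParser(
        interpolation=None,       # '%U' in Exec is a field code, not interpolation
        delimiters=("=",),
        comment_prefixes=("#",),
    )
    parser.optionxform = str      # keep key names case-sensitive
    parser.read_string(text)
    entry = parser["Desktop Entry"]
    categories = entry.get("Categories", fallback="")
    return {
        "name": entry["Name"],
        "exec": entry["Exec"],
        # Categories is a ';'-separated list with a trailing separator.
        "categories": [c for c in categories.split(";") if c],
    }
```

A window manager or desktop that understands such files can build its menu by scanning a directory of them, so an installer only has to drop one file in place; a window manager that ignores the convention simply never sees the application.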
Matt Chapman’s Xwinman website provides a quite comprehensive list of both mainstream and
obscure window managers, together with the mandatory screenshots.
Desktop environments
The three components described above are sufficient to create a rudimentary, but
useful GUI that would allow you to work even with very recent X applications –
such as Firefox, Thunderbird, GIMP or XMMS. However, if you’re used to Windows
or Mac OS, you would quickly notice how limited this environment is. Even if you
are using a very rich window manager, you only get a handful of desktop utilities
(the most common ones would probably be a dock of some kind, and a set of small
applets displaying the current time or system load). There is no standard file
manager, no clipboard viewer, no system configuration tools. If you have ever
programmed in a Windows environment, for example using Win32 or MFC, you will be
further amazed. There is no standard printer support, no inter-window communication,
no standard file open and save dialogs, etc.
Of course, most functionality mentioned above can be easily added. For example, there
are several file managers that work well in X. The main problem is – again –
consistency. While the applications can be useful, their interfaces are designed in
different ways and there is no integration between them and other parts of the
system (apart, of course, from the mandatory stuff such as the capability of querying
and manipulating a file system in the case of a file manager).
What is therefore needed is an additional layer of applications (for the end user)
and services (for the programmer) that glues together individual applications and creates
a seamless user experience – similar to that offered by modern Windows or
Mac OS desktops. In short, this layer adds an additional application (or set of
applications) on top of the X server and window manager. That application manages
the desktop, provides file browsers, system settings, clipboard integration etc., and
exposes an API for programmers willing to create applications that look –
and behave – like integral parts of the system.
Historically, the first, very primitive desktop environment for X was probably
CDE (Common Desktop Environment), created and promoted towards the end of the
proprietary Unix era. A closed source product, it could be encountered on legacy servers
running proprietary Unix systems such as AIX or HP-UX. While widely used in its time,
it never became very popular with application developers and is now a practically extinct
species.
The first attempt by the open source community to create a desktop environment was the
KDE project, started by Matthias
Ettrich back in 1996. Now at version 3.4, KDE has grown into an extremely complex
project, offering functionality and a level of integration between applications
that sometimes surpass those of Microsoft Windows.
KDE is based on the Qt widget toolkit from Trolltech.
While technically superior (in 1996) to other toolkits on the market, Qt had
one significant disadvantage – its licence was not compatible with the GPL. This
upset quite a lot of people, including Richard Stallman himself, and – to cut
a long story short – led to the creation of a rival desktop environment,
GNOME. Since then the Qt licence
has been changed, and KDE is now a good citizen of the open source community, although to
some people it is still not “pure” enough...
It is hard to tell if the creation of GNOME was actually beneficial for the quality
of X-based interfaces. On one hand, it provided KDE with much-needed
competition (even though both these environments are still mostly playing catch-up
with Windows and Mac OS, feature- and integration-wise). On the other hand,
it has created a duality in the application space (usually one creates either a
native KDE application or a GNOME one, not both), essentially dividing the developer
community into two camps, not always cooperating on very friendly terms. Of course
both KDE and GNOME applications are just X clients using some special libraries, so they
can run simultaneously on one computer, but this creates the very consistency problems
that desktop environments were created to solve in the first place.
KDE and GNOME are of course not the only desktop environments available. Some
notable examples include XFCE (a desktop environment designed with performance in
mind), GNUstep (a spiritual successor of the famous NeXTSTEP operating system) and XPde
(an environment mimicking the Windows XP interface). However, a desktop environment
is only useful as long as there is a large base of applications that use its
services; even the best integration capabilities are of no use if you have nothing to
integrate, and, for example, you could probably count XFCE-native applications on
your fingers! The situation is even worse with XPde – there are simply
no third-party programs that integrate with it. This shows that the line separating
systems with a desktop environment from those running only a window manager is very blurred.
So much, in fact, that I personally dislike calling XFCE a desktop environment at all...
Final remarks
One big problem makes system integration quite difficult for desktop environment
programmers: practically all these environments are portable (for example, you can
run KDE on Linux, on FreeBSD, or even on Windows). This means that all operating
system-specific solutions – such as controlling the hardware directly –
are practically unavailable to a desktop environment. However, in order for
the environment to create a seamless user experience, complete control over
the entire system is necessary. Efforts are underway that aim to solve this
puzzle (for example, the DBUS and HAL projects), but they are still far from complete.
Moreover, it is possible to modify GNOME or KDE desktops – or, for that matter,
any other open source desktop environment – by substituting icons, changing skins
or reordering menus and application lists. It is quite common for distribution
packagers to create their own, branded versions of desktop environments in this
way. A good example would be the Red Hat (and later Fedora) default environment –
the visual themes of both KDE and GNOME desktops that ship with it were altered
to such an extent (with a theme called Bluecurve) that a new interface was effectively
created. Additionally, due to the flexibility of the entire system, it is
possible to combine various different components of the X stack (if I might use
such a name) and effectively cook your own, unique GUI. For example, while both
GNOME and KDE are equipped with their own window managers, it is perfectly possible to
replace them with any other window manager (and thus have window borders in a KDE
desktop that do not really belong there, in a graphical sense).
All this means that it is extremely difficult to define a standard Linux
desktop, even if one restricts oneself to a single distribution.
And it should be obvious now that GIMP or Firefox are not really GNOME applications.
They look like native parts of the desktop (using mimicry skinning techniques) and use the
same toolkit, but that’s about it.
Summing up
If you are old enough to remember the last days of DOS glory, you might notice that
the situation in the X Window world is not dissimilar. It’s a mess, but with some
islands of sanity – namely KDE and GNOME.
We all know very well who nailed the DOS coffin shut. In the case of the X Window System,
it is hard to predict whether a single GUI standard for all applications will be
created. Yet there is hope, thanks to the efforts of people assembled under the
freedesktop.org banner. If this initiative proves to be successful (and it has already
given us some goodies, such as partial integration between GNOME and KDE
applications) we may get what Microsoft was never able to provide –
unity in diversity, at least as far as the graphical user interface of
our machines is concerned...
Finally, is all that I wrote about important for a user sitting in front of
a computer screen and trying to do some word processing or web browsing? It
shouldn’t be, and developers must therefore remember to keep all this
complexity to themselves. What most of us need is a seamless user experience,
and not excuses like “Linux is just a kernel.” From the end user’s
point of view there is really no excuse for a lack of graphical system configuration
tools integrated with the desktop environment, or for the poor quality of a desktop
environment. For example, if KDE crashes (and this system is not very stable,
at least by Windows standards [1]), most users perceive it
as a Linux crash – in spite of the fact that the kernel is still
running flawlessly underneath.
Piotr Gawrysiak
[1] Yes, I have seen the KDE crash handler many more times than Windows crash
notifications, and I use both systems equally often.