An example of the quote Post Format

Unix has wired into it an assumption that the programmer knows best. It doesn’t stop you or request confirmation when you do dangerous things with your own data, like issuing rm -rf *. On the other hand, Unix is rather careful about not letting you step on other people’s data. In fact, Unix encourages you to have multiple accounts, each with its own attached and possibly differing privileges, to help you protect yourself from misbehaving programs.[22] System programs often have their own pseudo-user accounts to confer access to special system files without requiring unlimited (or superuser) access.

Unix has at least three levels of internal boundaries that guard against malicious users or buggy programs. One is memory management; Unix uses its hardware’s memory management unit (MMU) to ensure that separate processes are prevented from intruding on the others’ memory-address spaces. A second is the presence of true privilege groups for multiple users — an ordinary (nonroot) user’s processes cannot alter or read another user’s files without permission. A third is the confinement of security-critical functions to the smallest possible pieces of trusted code. Under Unix, even the shell (the system command interpreter) is not a privileged program.

The strength of an operating system’s internal boundaries is not merely an abstract issue of design: It has important practical consequences for the security of the system.

To design the perfect anti-Unix, discard or bypass memory management so that a runaway process can crash, subvert, or corrupt any running program. Have weak or nonexistent privilege groups, so users can readily alter each others’ files and the system’s critical data (e.g., a macro virus, having seized control of your word processor, can format your hard drive). And trust large volumes of code, like the entire shell and GUI, so that any bug or successful attack on that code becomes a threat to the entire system.

File Attributes and Record Structures

Unix files have neither record structure nor attributes. In some operating systems, files have an associated record structure; the operating system (or its service libraries) knows about files with a fixed record length, or about text line termination and whether CR/LF is to be read as a single logical character.

In other operating systems, files and directories can have name/attribute pairs associated with them — out-of-band data used (for example) to associate a document file with an application that understands it. (The classic Unix way to handle these associations is to have applications recognize ‘magic numbers’, or other type data within the file itself.)

OS-level record structures are generally an optimization hack, and do little more than complicate APIs and programmers’ lives. They encourage the use of opaque record-oriented file formats that generic tools like text editors cannot read properly.

File attributes can be useful, but (as we will see in Chapter 20) can raise some awkward semantic issues in a world of byte-stream-oriented tools and pipes. When file attributes are supported at the operating-system level, they predispose programmers to use opaque formats and lean on the file attributes to tie them to the specific applications that interpret them.

To design the perfect anti-Unix, have a cumbersome set of record structures that make it a hit-or-miss proposition whether any given tool will be able to even read a file as the writer intended it. Add file attributes and have the system depend on them heavily, so that the semantics of a file will not be determinable by looking at the data within it.

An example of the link Post Format

Unix has a couple of unifying ideas or metaphors that shape its APIs and the development style that proceeds from them. The most important of these are probably the “everything is a file” model and the pipe metaphor[20] built on top of it. In general, development style under any given operating system is strongly conditioned by the unifying ideas baked into the system by its designers — they percolate upwards into applications programming from the models provided by system tools and APIs.

Accordingly, the most basic question to ask in contrasting Unix with another operating system is: Does it have unifying ideas that shape its development, and if so how do they differ from Unix’s?

To design the perfect anti-Unix, have no unifying idea at all, just an incoherent pile of ad-hoc features.

Multitasking Capability

One of the most basic ways operating systems can differ is in the extent to which they can support multiple concurrent processes. At the lowest end (such as DOS or CP/M) the operating system is basically a sequential program loader with no capacity to multitask at all. Operating systems of this kind are no longer competitive on general-purpose computers.

At the next level up, an operating system may have cooperative multitasking. Such systems can support multiple processes, but a process has to voluntarily give up its hold on the processor before the next one can run (thus, simple programming errors can readily freeze the machine). This style of operating system was a transient adaptation to hardware that was powerful enough for concurrency but lacked either a periodic clock interrupt[21] or a memory-management unit or both; it, too, is obsolete and no longer competitive.

Unix has preemptive multitasking, in which timeslices are allocated by a scheduler which routinely interrupts or pre-empts the running process in order to hand control to the next one. Almost all modern operating systems support preemption.

Note that “multitasking” is not the same as “multiuser”. An operating system can be multitasking but single-user, in which case the facility is used to support a single console and multiple background processes. True multiuser support requires multiple user privilege domains, a feature we’ll cover in the discussion of internal boundaries a bit further on.

To design the perfect anti-Unix, don’t support multitasking at all — or, support multitasking but cripple it by surrounding process management with a lot of restrictions, limitations, and special cases that mean it’s quite difficult to get any actual use out of multitasking.

blog-2

The Importance of Being Textual

In the early minicomputer days of Unix, this was still a fairly radical idea (machines were a great deal slower and more expensive then). Nowadays, with every development shop and most users (apart from the few modeling nuclear explosions or doing 3D movie animation) awash in cheap machine cycles, it may seem too obvious to need saying.

Somehow, though, practice doesn’t seem to have quite caught up with reality. If we took this maxim really seriously throughout software development, most applications would be written in higher-level languages like Perl, Tcl, Python, Java, Lisp and even shell — languages that ease the programmer’s burden by doing their own memory management (see [Ravenbrook]).

And indeed this is happening within the Unix world, though outside it most applications shops still seem stuck with the old-school Unix strategy of coding in C (or C++). Later in this book we’ll discuss this strategy and its tradeoffs in detail.

One other obvious way to conserve programmer time is to teach machines how to do more of the low-level work of programming. This leads to…

Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.

Human beings are notoriously bad at sweating the details. Accordingly, any kind of hand-hacking of programs is a rich source of delays and errors. The simpler and more abstracted your program specification can be, the more likely it is that the human designer will have gotten it right. Generated code (at every level) is almost always cheaper and more reliable than hand-hacked.

We all know this is true (it’s why we have compilers and interpreters, after all) but we often don’t think about the implications. High-level-language code that’s repetitive and mind-numbing for humans to write is just as productive a target for a code generator as machine code. It pays to use code generators when they can raise the level of abstraction — that is, when the specification language for the generator is simpler than the generated code, and the code doesn’t have to be hand-hacked afterwards.

In the Unix tradition, code generators are heavily used to automate error-prone detail work. Parser/lexer generators are the classic examples; makefile generators and GUI interface builders are newer ones.

(We cover these techniques in Chapter 9.)