Improving V6 Unix


Intro

The moderately early PDP-11 versions of Unix such as V6 Unix packed an incredible amount of power into an extremely small amount of space - for V6, a mere 20KB of code (not including device drivers) for the permanently resident kernel - a bang/buck ratio that will almost certainly never be exceeded.

It's relatively easy to bring up a V6 PDP-11 Unix under the Ersatz-11 PDP-11 emulator, as covered in the Bringing up V6 Unix on the Ersatz-11 PDP-11 Emulator page. This is an addendum/ancillary page, for those who wish to go further.

It covers a range of topics, including how to get the Standard I/O Library, and the 'tar' command (which, alas, needs system mods to really be supported 'right'), means to keep file write timestamps when copying/moving them, how to have the system default to more reasonable input editing characters, etc.


Other Unix stuff

Here are a bunch of other things that are useful for a PDP-11 V6 Unix.

Tim Shoppa's material

There's another set of RL02 V6 disk pack images lying around, from Tim Shoppa. Most of it's junk (old biology lab data), but the root pack here has a ton of treasures on it.

Most importantly, it has a newer C compiler, one with support for longs, unsigneds, and a bunch of other stuff. (You could hack unsigneds in the 'vanilla' V6 C compiler, by using a char *, but that didn't always work.) In addition to /bin/ncc, you need the entire /xlib directory - don't rename it, or move the files to /lib; just copy it over as is.

Warning: for return of long data items from procedures to work (long returns use R1 as well as R0), you need a new csv.s, since the one from vanilla V6 bashes R1 on a procedure return. (Hey, R1 isn't used on return in that version, so bashing it is fine.) It goes in the C library, /lib/libc.a, once you've assembled it.
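To see why a return sequence that bashes R1 breaks long returns, here is a small model in modern C. It is only a sketch: the struct and function names are made up for illustration, and it assumes the usual PDP-11 C convention of the high-order word in R0 and the low-order word in R1.

```c
#include <stdint.h>

/* Simulated PDP-11 return registers (illustrative only). */
struct regs {
    uint16_t r0;   /* high-order word of a long result */
    uint16_t r1;   /* low-order word of a long result */
};

/* Place a 32-bit result in the R0/R1 pair, as a long-returning routine would. */
struct regs return_long(uint32_t value)
{
    struct regs r;
    r.r0 = (uint16_t)(value >> 16);
    r.r1 = (uint16_t)(value & 0xFFFFu);
    return r;
}

/* Model of a return sequence that uses R1 as scratch, destroying its contents. */
struct regs clobbering_epilogue(struct regs r)
{
    r.r1 = 0;   /* stand-in for whatever the old return code leaves there */
    return r;
}

/* What the caller reassembles from the register pair. */
uint32_t caller_sees(struct regs r)
{
    return ((uint32_t)r.r0 << 16) | r.r1;
}
```

With an int return only R0 matters, so the old return sequence is harmless; with a long return the caller gets the right high word and garbage in the low word.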

To get access to the contents of that disk pack image, you need to add either the RL or RP driver to your Unix (see here and here for how to do that).

In addition to the new C compiler, it has a ton of new commands. Alas, all we have are the binaries... :-( Too bad a source disk didn't get saved, along with all those useless packs full of biology stuff!

I haven't fully explored what's there, but one thing that is there is a new cdb, with an extended command set. Note: Some of the commands use some new Unix system calls which aren't in 'vanilla' V6, so they may blow out when you try and use them.

There's also complete kernel code, including code to bring it up (mostly) to V7 compatibility, something else I haven't completely explored.



Standard I/O Library

One of the University of New South Wales tapes, here, has a copy of the source for the Standard I/O Library (which vanilla V6 does not include).

I haven't actually used it much, since there's a copy of the compiled library on the Shoppa disk, in /lib/libS.a. One hitch is that that library is in the so-called 'new archive' format, which the vanilla V6 tools don't grok. That disk has the new archive tool ('nar'), and you can either unpack the library with that, and then repack it with the old archive tool, or use a copy of the library which I did that to already, here.

(I think I got the files in there in the right order... but maybe not! So look out for unresolved references. If you see some, the hack fix is to specify that library twice in the command line [xxx -lS -lS], and that should kludge it for the moment.)

Another hitch is that some of the calls require capabilities the Unix V6 kernel doesn't have (and can't be simulated). The lseek() system call is one example; one can simulate the call to it in the C library routine fseek(), by using block and byte seek() calls (and I did, see the code here), but the problem is that the lseek() system call also returns the file offset pointer (which ftell(), among others, uses), and there's no way to get that info in V6 (that I knew of).
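The block-plus-byte trick can be sketched as follows. This is not the fseek() code linked above, just an illustration of the arithmetic; split_offset() and struct seek_pair are hypothetical names, and the seek() modes mentioned in the note after the code are as I recall them from the V6 manual.

```c
#include <assert.h>

#define BLOCK_SIZE 512L   /* V6 file system block size in bytes */

/* The two arguments for the pair of seek() calls. */
struct seek_pair {
    unsigned blocks;   /* whole 512-byte blocks, for a block-unit seek */
    unsigned bytes;    /* remaining bytes (0..511), for a relative byte seek */
};

/* Split an absolute byte offset into a block count and a byte remainder. */
struct seek_pair split_offset(long offset)
{
    struct seek_pair p;

    assert(offset >= 0);
    p.blocks = (unsigned)(offset / BLOCK_SIZE);
    p.bytes  = (unsigned)(offset % BLOCK_SIZE);
    return p;
}
```

On V6 the result would be used roughly as seek(fd, p.blocks, 3) followed by seek(fd, p.bytes, 1): an absolute seek in block units, then a relative seek in bytes. Both values fit in a 16-bit word because a V6 file can't be larger than 2^24 bytes. Note that nothing here helps with the other half of the problem, reading the offset back.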

So fully supporting the Standard I/O Library requires a system mod (below).


System mods for Stdio and others; additional library things


lseek() and smdate()

Adding lseek() is pretty easy. I have the code in a new file sys5.c, available here; it also needs an entry in sysent.c, an updated copy of which is here.

Note: That copy of sysent.c also has the entry for the smdate() system call un-commented-out. That's because 'tar' (below) kinda-sorta requires the ability to change file modified dates to work 'properly' (as in, how it usually works on other systems), and I needed smdate() for that (see the 'tar' entry for the details). The fixes to keep file write dates (below) also use smdate().

So you'll also have to un-comment-out the code for that call, in sys4.c; or if you don't feel like wrestling with it in 'ed' (or extracting it and editing it on your host machine), you can just download a fixed copy of it here.

Compile them all (don't forget the "-O" flag!), and add them to lib1, viz.:

ar r ../lib1 sys*.o
and then build a new system (see here for how to do that).

While doing 'tar', at one point I thought I needed the utime() system call, so I prepared a copy of it for V6 (available here if you're curious, including a V7 version of the iupdat() internal system routine), but since it turned out I didn't need it, I never bothered adding it.



C Library additions

The standard C library doesn't have an entry for lseek() (the routine C user code calls to do an lseek() system call), of course; there is one in the Standard I/O Library, or if you want to add one to the normal C library, the source is here.

I have yet to add the access() system call, but for the moment you can fake it with this library routine. (This only works for code you're compiling, of course; when trying to use an existing binary off the Shoppa pack, this won't help you.)
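For the curious, a library-level fake of access() boils down to a stat() of the file followed by a check of the mode bits. The sketch below shows only that check, in modern C; it is not the routine linked above, and mode_permits() and the WANT_* names are hypothetical.

```c
#define WANT_READ  04
#define WANT_WRITE 02
#define WANT_EXEC  01

/*
 * Decide whether a caller with the given user and group IDs may access a
 * file with the given mode and ownership.  Returns 0 if allowed, -1 if not,
 * matching the return convention of access().
 */
int mode_permits(int mode, int file_uid, int file_gid,
                 int uid, int gid, int want)
{
    int bits;

    if (uid == 0)
        return 0;                    /* simplification: super-user always passes */
    if (uid == file_uid)
        bits = (mode >> 6) & 07;     /* owner permission bits */
    else if (gid == file_gid)
        bits = (mode >> 3) & 07;     /* group permission bits */
    else
        bits = mode & 07;            /* everyone else */
    return (bits & want) == want ? 0 : -1;
}
```

A real access() checks against the real (not effective) user and group IDs, so a faithful fake would pass those in.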



Missing routines

While you're at it, you might want to add alloca() (which for some reason isn't in V6) and mktemp() to the C library.

Keeping file modification time unchanged

V6, in its simplicity, updates the modification timestamp on a file anytime you touch it in any way - which includes when moving it, doing a 'chmod', etc. I found this non-optimal, since I like to know when a file's contents were last played with. So here are a bunch of commands which have been updated to keep that timestamp when doing something which doesn't modify the actual contents. The source is here, and the binaries are here, if you're too lazy to compile them. Note that they do this by using the smdate() system call, so i) you will have to update the system to include it back in (see above for how to do this); and ii) they only work for files you own (unless you're logged in as 'root'). Since 'cpall' just runs 'cp' for all the files, it doesn't need to be tweaked; the same holds for 'mvall'.
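The pattern these commands follow is: note the modification time, do the operation, put the time back. Here is a minimal sketch of that pattern in modern POSIX C, with utime() standing in for V6's smdate(); chmod_keep_mtime() is a hypothetical name, and this is not the code from the updated commands.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <utime.h>

/* Change a file's mode, then restore the modification time it had before. */
int chmod_keep_mtime(const char *path, mode_t mode)
{
    struct stat before;
    struct utimbuf times;

    assert(path != NULL);
    if (stat(path, &before) != 0) {      /* remember the old timestamps */
        perror(path);
        return -1;
    }
    if (chmod(path, mode) != 0) {        /* the operation itself */
        perror(path);
        return -1;
    }
    times.actime  = before.st_atime;
    times.modtime = before.st_mtime;
    return utime(path, &times);          /* put the old timestamps back */
}
```

On a modern system chmod() doesn't disturb the modification time in the first place, so the restore step is redundant there; on V6 it is the whole point. As with smdate(), the restore only succeeds on files you own (or if you are root).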

Note that the original 'mv' has to be installed as SUID root (for reasons I don't recall off the top of my head - 'Use the Source, Luke'); this one has to be, too. To the best of my knowledge it is safe to use in a multi-user system; as in, when adding the changes, I tried to make sure they would work for normal users, etc, but... they haven't been carefully audited to make sure there aren't any security issues.


Better default input editing and interrupt characters

The default input editing and interrupt characters are, well, primeval. '@' for line delete? How 60s. (You can tell these guys were old Multics hackers; they must have had the Multics line editing characters burned into their brains.)

You can change the editing characters with 'stty'. You can't, however, change the interrupt characters. That includes the use of DELETE for 'interrupt process', which is, ah, non-ideal by modern standards. So since we have to recompile things to change them, we might as well change the input editing characters too.

The interrupt characters are specified in sys/tty.h; change that, and then re-compile tty.c, which is where they are used. Remember to install it in the device library (../lib2) before you build a new Unix.

cc -c -O tty.c
ar r ../lib2 tty.o
rm tty.o
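The edit to sys/tty.h mentioned above is just a matter of changing a couple of #defines. The fragment below is a sketch: the names CQUIT and CINTR and the default values in the comments are as I recall them from the V6 header, and ^C is merely one plausible choice for the interrupt character, not necessarily the one I use.

```c
/* sys/tty.h (fragment): characters that generate signals */
#define CQUIT   034     /* quit: FS, control-backslash (left at the V6 default) */
#define CINTR   003     /* interrupt: control-C (the V6 default is 0177, DELETE) */
```

Whatever you pick for CINTR must not collide with a character you want to use for input editing.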

The input editing characters are used in kl.c and dz.c, but I'm not sure how long their setting of them lasts; I think those settings are over-ridden pretty quickly by getty, etc. To change the input editing characters, you need to change getty.c and login.c. The ones I use are here and here; getty goes in /etc, and login in /bin.

Note that I have added two entries to the terminal type table in getty; entry '3' is for pseudo-ttys (console, etc) which send DELETE from the BACKSPACE character on the standard Windoze keyboard, and '4' is for those (TELNET, etc) which send a BACKSPACE character from that key.

So in my /etc/ttys, tty8 (which is the Ersatz-11 console, a pseudo-VT100), is "183" (the '1' turns the line on); and the DZ lines (used for TELNET logins) are "1[a-h]4".
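Each /etc/ttys entry is thus three characters: the on/off flag, the character naming the line, and the getty terminal type. A small sketch of how such an entry breaks down (modern C; parse_ttys() and struct ttys_entry are hypothetical names, not anything in init or getty):

```c
#include <assert.h>
#include <stddef.h>

struct ttys_entry {
    int  enabled;   /* nonzero if the line is turned on */
    char line;      /* tty name suffix, e.g. '8' for tty8 */
    char type;      /* index into getty's terminal type table */
};

/* Parse a three-character entry; returns 0 on success, -1 if malformed. */
int parse_ttys(const char *s, struct ttys_entry *out)
{
    assert(s != NULL && out != NULL);
    if (s[0] != '0' && s[0] != '1')
        return -1;                       /* first character must be the flag */
    if (s[1] == '\0' || s[2] == '\0')
        return -1;                       /* entry is too short */
    out->enabled = (s[0] == '1');
    out->line = s[1];
    out->type = s[2];
    return 0;
}
```

So "183" comes out as line '8', enabled, type '3', and a DZ line such as "1a4" as line 'a', enabled, type '4'.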


Advanced new (to V6) Unix tools

A number of useful tools which aren't in V6, but aren't trivial to add, are covered here. (Simple ones are covered here and here.)


tar

The V7 'tar' is not too hard to get running under V6; the biggest problem is that it needs at least one system call which isn't in 'vanilla' V6 (at least, if you want it to operate the way it normally does); that is covered above.

The code does use fseek(), which in the 'vanilla' Standard I/O Library uses the lseek() call (which also isn't in 'vanilla' V6 Unix), but you could use the alternative fseek() (here, above) instead of adding the lseek() system call.

Once you have them in, getting tar itself to work is pretty easy; mostly dealing with the fact that some system calls return different data in V6. The most problematic one is stat(), which returns the file size in a paired byte and shortword. chown() also takes a different number of arguments in V7; I hacked up a V7-compatible chown (available here - add it to the C library once compiled) to deal with that.
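Putting the two pieces of the V6 file size back together looks something like the following. This is a sketch rather than the code in the V6ified tar; v6_file_size() is a hypothetical name, and the parameters stand for the high byte and low word that the V6 stat() hands back.

```c
#include <stdint.h>

/* Combine V6's split file size (high byte, low word) into one 24-bit value. */
long v6_file_size(char size0, uint16_t size1)
{
    /* Mask after widening, so a sign-extended char cannot corrupt the result. */
    return (((long)size0 & 0377L) << 16) | (long)size1;
}
```

The masking matters on the PDP-11, where char is signed and the low word is declared as an int or a pointer: without it, any file of 32KB or more risks coming out with a negative or wildly wrong size.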

The other issue with 'tar' is that it uses the utime() system call - but it doesn't set the access time, just the modified time. So although I prepared a copy of the utime() system call for V6 (available here if you're curious, including a V7 version of the iupdat() internal system routine), I didn't need it: I just changed the code in tar to use the mdate() system call (the user form of smdate()).

The source to the V6ified 'tar' is here. You will also need the header files stdio.h, types.h, nstat.h, dir.h, and signal.h (available through the links).



strings

'strings' uses the ftell() call in the Standard I/O Library, so you have to add the lseek() system call before it will work.

Having done that, V6 compatible source is here.


More useful new Unix tools

A few more new, interesting (well, to me :-) Unix tools.

si

Think of this ('system internals') as 'ps' on radioactive steroids. It shows you (depending on which options you specify) pretty much all the data inside the kernel: the mount table, the inode table, the file table, the text table, the disk buffer cache - you name it. There is a 'man' page for it here.

It uses 'ncheck' for inode number -> file name mappings; it keeps the mappings in a file ("filenms") in the root directory of each disk pack, and recomputes them automagically whenever it looks like they are out of date. (The logic here is still not entirely complete.)

The source is available here, but as currently written it requires some minor hacks to the kernel (to get the system uptime, and also the current values in param.h - I got tired of having to recompile 'si' whenever I changed a parameter). The latter also includes a tweak to allow the running system's size to be found, to make sure that the symbol table in /unix applies to the running system.

So, you will also need:

The first two are very slightly tweaked to retain useful info ('diff' is your friend). The last goes in conf and has to be listed explicitly in the 'ld' command to build a new Unix - since it doesn't contain any unresolved externals, it won't load from a library. I.e.:
ld -x l.o m40.o c.o param.o ../lib1 ../lib2
mv a.out /nunix

This command is now too large to be compiled with the 'vanilla' V6 C compiler (gets symbol table overflows), so you have to either i) break it into two pieces (which I was too lazy to do), ii) compile it with the new C compiler (above), or iii) increase the size of the symbol table in the V6 compiler (which is not as hard as it sounds). To do the latter, edit c0h.c to change 'hshsiz' from 200 to (say) 400. Then re-compile and install. Or you can just download it here.

Also it needs some things I moved out to a private library (I called it libL.a, for 'Local') while I was still trying to make it fit the old compiler (before I just gave up :-):

I think that's all of them! Anyway, it's rather neat: give it a whirl. Something like:
si mfitV 5
in one TELNET window while you're working in another can be most interesting. I was particularly amazed to find out (via the 'b' flag) that even with a large buffer cache, the cache is almost always entirely filled with blocks from the root device. I'm guessing this is because /bin/ is there; it would be interesting to move that to another device, and see what happens.


cmdate

This copies the 'last-modified date' from one file to another; useful if you're moving something around, and don't want to lose that information; the source is here.

Note: First, it only works on systems that have had the smdate() system call added. Second, it wasn't written for use on a real time-sharing system; i.e. you have to be super-user to use it. And if you 'set-UID' it, anyone will be able to change the write dates on anyone else's files. Yes, it would be easy to code so that it checks to see if the file owner is the same as the real UID, but... until someone really needs it, I have more interesting things to do! ;-)
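For whoever does need it, the owner check would look roughly like this. It is a sketch only, with hypothetical function names; it assumes (as I recall from the V6 manual) that getuid() returns a word with the real user ID in the low byte and the effective user ID in the high byte.

```c
/* Pull the real user ID out of the word returned by V6 getuid(). */
int real_uid_of(int getuid_word)
{
    return getuid_word & 0377;
}

/* Return 1 if this user may change the file's date, 0 otherwise. */
int may_set_date(int file_owner_uid, int real_uid)
{
    if (real_uid == 0)
        return 1;                        /* super-user may touch anything */
    return file_owner_uid == real_uid;   /* others only their own files */
}
```

A set-UID cmdate would stat() each file it is about to modify, and refuse to go on if may_set_date() says no for the owner found there.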




© Copyright 2014, 2018, 2019-2020 by J. Noel Chiappa


Last updated: 18/September/2020