--- /dev/null
+Nick Downing downing.nick@gmail.com
+2.11BSD source tree modified for cross compilation from x86-64 linux
+
+Initial commit was an unmodified 2.11BSD source tree taken from a boot tape.
+
+Unpacked from http://www.tuhs.org/Archive/PDP-11/Distributions/ucb/2.11BSD:
+7263558ebfb676b4b9ddacc77da464c5 file7.tar.gz
+77397e6d554361c127592b1fea2d776f file8.tar.gz
+
+The root of this repository is "/usr/src" on a running 2.11BSD system.
+The boot tape file7.tar.gz provides "/usr/src/include" and "/usr/src/sys".
+The boot tape file8.tar.gz provides everything else in "/usr/src".
+
+To compile this repository, simply run "./n.sh" (this stands for "Nick"). It
+will clean the source tree, compile and install a cross toolchain into "cross",
+clean again, then compile and install a 2.11BSD filesystem tree into "stage".
+
+The cross toolchain installed into "cross" consists of the following files:
+ cross/bin/ar
+ cross/bin/as
+ cross/bin/cc
+ cross/bin/ld
+ cross/bin/nm
+ cross/bin/size
+ cross/lib/c0
+ cross/lib/c1
+ cross/lib/c2
+ cross/lib/cpp
+ cross/usr/bin/lorder
+ cross/usr/bin/mkdep
+ cross/usr/bin/ranlib
+ cross/usr/lib/libvmf.a
+ cross/usr/lib/libvmf_p.a
+ cross/usr/man/cat1/ar.0
+ cross/usr/man/cat1/ld.0
+ cross/usr/man/cat1/ranlib.0
+ cross/usr/man/cat3/vmf.0
+ cross/usr/man/cat5/ar.0
+ cross/usr/man/cat5/ranlib.0
+ cross/usr/ucb/strcompact
+ cross/usr/ucb/symcompact
+ cross/usr/ucb/symdump
+ cross/usr/ucb/symorder
+
+The cross toolchain is created by modifying the appropriate sources in this
+tree to compile under gcc for x86-64 linux. I created a compatibility header
+file called "krcompat.h", copied to various source directories, which contains
+definitions for prototypes and varargs functions. So we should have a common
+source for the toolchain which can compile under x86-64 linux and 2.11BSD.
+
+I have fixed or suppressed all compiler warnings. A great many of these are due
+to the K&R to ANSI conversion; I fixed them by introducing a standard way of
+declaring function headers, etc. Declarations in the original source code like
+ somefunc(a, p)
+ char *p;
+ {
+ register b;
+ ...
+ }
+become
+ int somefunc(a, p) int a; char *p; {
+ register int b;
+ ...
+ }
+and functions which do not return a value have "int" changed (by me) to "void".
+Note that "void" is apparently not K&R; it is "extended K&R". However, I have
+just used "void", and if this causes any problems later, I'll change to "VOID",
+which I can then suppress by means of a compatibility define in "krcompat.h".
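+
+As an illustration of the kind of compatibility define meant here, the sketch
+below shows one possible shape for such a header. It is NOT the actual content
+of "krcompat.h": the names PARAMS, KRCOMPAT_SKETCH_H and reset_count are
+assumptions made up for this example, and the #ifdef around each definition is
+just one way to keep the sketch acceptable to both kinds of compiler.
+
```c
/* Hypothetical excerpt in the spirit of "krcompat.h"; all names are assumptions. */
#ifndef KRCOMPAT_SKETCH_H
#define KRCOMPAT_SKETCH_H

#ifdef __STDC__
#define PARAMS(args) args	/* ANSI compiler: keep the prototype */
#define VOID void
#else
#define PARAMS(args) ()		/* K&R compiler: drop the parameter list */
#define VOID int		/* compilers without "void" fall back to int */
#endif

#endif

/* Declarations written once, usable by both kinds of compiler. */
int somefunc PARAMS((int a, char *p));
VOID reset_count PARAMS((void));

static int count;

#ifdef __STDC__
int somefunc(int a, char *p)
#else
int somefunc(a, p) int a; char *p;
#endif
{
	count += a;
	return p[0] + a;
}

#ifdef __STDC__
VOID reset_count(void)
#else
VOID reset_count()
#endif
{
	count = 0;
}
```
+With something like this in place, switching every "void" to "VOID" would be a
+mechanical change, and only the header decides what it expands to.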
+
+In the course of this, I have changed all function headers to one line, and I
+have changed parameters like "char *p, *q;" to "char *p; char *q;". This occurs
+because I use "cproto" and a VERY ROUGH script (not included in the repository)
+to convert them automatically. It is also because I prefer the source that way.
+
+The varargs conversion is pretty simple. The K&R code is first converted to use
+<varargs.h> rather than any ad-hoc convention it might have used before. Then,
+if __STDC__ is defined, it uses <stdarg.h> instead of <varargs.h>, and defines
+the function using an ANSI header rather than K&R. Example in "lib/ccom/c01.c":
+ #ifdef __STDC__
+ void werror(char *s, ...)
+ #else
+ void werror(s, va_alist) char *s; va_dcl
+ #endif
+ {
+ va_list ap;
+
+ if (Wflag)
+ return;
+ if (filename[0])
+ fprintf(stderr, "%s:", filename);
+ fprintf(stderr, "%d: warning: ", line);
+ va_start(ap, s);
+ vfprintf(stderr, s, ap);
+ va_end(ap);
+ fprintf(stderr, "\n");
+ }
+Note: With <varargs.h> it should be the one-argument form "va_start(ap)", so the
+call above is strictly correct only for <stdarg.h>, but I hope it is OK.
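+
+For comparison, here is a minimal sketch of the same pattern with va_start()
+also selected by __STDC__. The helper format_into() is hypothetical and does
+not exist in the tree. It assumes vsprintf() is available on both systems and
+ignores its return value, since that differs between old and new C libraries.
+
```c
#include <stdio.h>
#ifdef __STDC__
#include <stdarg.h>
#else
#include <varargs.h>
#endif

/* format_into() is a hypothetical helper, not a function from the tree. */
#ifdef __STDC__
void format_into(char *buf, char *fmt, ...)
#else
void format_into(buf, fmt, va_alist) char *buf; char *fmt; va_dcl
#endif
{
	va_list ap;

#ifdef __STDC__
	va_start(ap, fmt);	/* <stdarg.h>: name the last fixed parameter */
#else
	va_start(ap);		/* <varargs.h>: takes the list only */
#endif
	vsprintf(buf, fmt, ap);	/* caller must supply a large enough buffer */
	va_end(ap);
}
```
+The cost is one more #ifdef per varargs function, which may or may not be worth
+it compared with relying on the two-argument form everywhere.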
+
+In many cases the toolchain source code relied on running under 2.11BSD, e.g.
+the linker and symbol-related utilities directly manipulate struct exec and
+struct nlist etc., and expect the on-disk format to match the in-memory layout.
+To fix this I defined macros like OFF_T, INT, UNSIGNED_INT and whatever else
+was appropriate, which are the original types (off_t, int, unsigned int) on
+2.11BSD but compatibility types (int32_t, int16_t, uint16_t) on x86-64 linux.
+Then I intercepted disk read/writes to occur through a temporary buffer, e.g.:
+ #ifdef pdp11
+ fwrite(&stroff, sizeof (OFF_T), 1, fpin);
+ #else
+ temp[0] = (stroff >> 16) & 0xff;
+ temp[1] = (stroff >> 24) & 0xff;
+ temp[2] = stroff & 0xff;
+ temp[3] = (stroff >> 8) & 0xff;
+ fwrite(temp, sizeof (OFF_T), 1, fpin);
+ #endif
+This handles byte order issues, in particular the PDP-11 convention of storing
+a long with the high word first (but storing each word low byte first). Since
+x86-64 is little-endian like the PDP-11, unconverted 16-bit accesses still work
+by accident, so there may be a few places which are not fully converted. In
+particular, the conversion of "ld", "ar" and "nm" is a bit rough, and I plan to
+go back and make it similar to "symcompact" and friends. I used a temporary
+buffer instead of an in-place conversion like htons() and friends, because I'm
+concerned about C's aliasing rules and gcc's optimizer.
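+
+The byte shuffling above can be wrapped in a pair of helpers. This is only a
+sketch: put_pdp11_long() and get_pdp11_long() are hypothetical names, not
+functions from the repository, and they use <stdint.h> types on the assumption
+that they are compiled on the host only.
+
```c
#include <stdint.h>

/* Hypothetical helpers; the names are not from the repository. */

/* Store v in PDP-11 long order: high word first, each word low byte first. */
void put_pdp11_long(unsigned char *buf, uint32_t v)
{
	buf[0] = (v >> 16) & 0xff;	/* high word, low byte */
	buf[1] = (v >> 24) & 0xff;	/* high word, high byte */
	buf[2] = v & 0xff;		/* low word, low byte */
	buf[3] = (v >> 8) & 0xff;	/* low word, high byte */
}

/* Inverse of put_pdp11_long(): rebuild the value from the 4-byte buffer. */
uint32_t get_pdp11_long(const unsigned char *buf)
{
	return ((uint32_t)buf[1] << 24) | ((uint32_t)buf[0] << 16) |
	    ((uint32_t)buf[3] << 8) | (uint32_t)buf[2];
}
```
+For example, storing 0x12345678 gives the bytes 0x34 0x12 0x78 0x56. Because
+only unsigned char accesses touch the buffer, no aliasing question arises.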
+
+Another issue was the compiler second pass "/lib/c1" when it generates floating
+point constants or performs constant folding. The host system uses IEEE-754,
+whereas the PDP-11 uses its own conventions. Since I want the cross toolchain
+to generate EXACTLY the same binaries as the traditional PDP-11 hosted tools,
+I had to take the floating-point emulation code from "simh" and put it in "c1".
+
+A further issue was the definition of struct nlist (and others) like this:
+ struct nlist {
+ union {
+ char *n_name; /* In memory address of symbol name */
+ OFF_T n_strx; /* String table offset (file) */
+ } n_un;
+ u_char n_type; /* Type of symbol - see below */
+ char n_ovly; /* Overlay number */
+ U_INT n_value; /* Symbol value */
+ };
+Unfortunately the n_name pointer breaks things on a 64-bit system because it is
+larger than OFF_T and causes sizeof(struct nlist) to be wrong. So I have made
+all the client programs that refer to this structure use n_strx exclusively;
+n_name is only defined when compiling for PDP-11. This was an easy change since
+there is always an associated string table so we just offset into it as needed.
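+
+A minimal sketch of that lookup follows. The struct nlist_sketch type and the
+sym_name() function are hypothetical names for illustration, the field widths
+are assumptions for a host build, and the offset is assumed to count from the
+start of whatever buffer is passed in as the string table.
+
```c
#include <stddef.h>
#include <stdint.h>

/* Host-side sketch of the symbol record; field widths are assumptions. */
struct nlist_sketch {
	uint32_t n_strx;	/* string table offset */
	uint8_t n_type;		/* type of symbol */
	int8_t n_ovly;		/* overlay number */
	uint16_t n_value;	/* symbol value */
};

/*
 * Return the name for a symbol, or NULL if the offset is out of range
 * or the name is not NUL-terminated inside the table.
 */
const char *sym_name(const char *strtab, size_t strsize,
    const struct nlist_sketch *sp)
{
	size_t i;

	if (sp->n_strx >= strsize)
		return NULL;
	for (i = sp->n_strx; i < strsize; i++)
		if (strtab[i] == '\0')
			return strtab + sp->n_strx;
	return NULL;
}
```
+The bounds check is an addition for safety when reading possibly damaged
+object files; the original tools may simply trust the offset.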
+
+Since the above conversions tend to increase code bloat, and the PDP-11 tools
+are often running on the limit of memory, they do not apply when "pdp11" is
+defined, although in theory there should be no need to make this distinction.
+
+Since the host system is very similar to the target system, we can use tools
+like "/bin/sort", "/bin/sed" and the GNU "make" tool provided by the host,
+although it would also be possible to build cross versions of these tools. It
+is easier to use the native tools and work around occasional incompatibilities;
+for example, I changed "sort -t/ +1" in a Makefile to simply "sort -t/" since
+the comparison of the first field was not going to change the outcome anyway.
+
+One problem with the host system being similar to the target system is that
+when the cross tools include something like <a.out.h>, the host wants to
+provide its own version. This is rather delicate to work around. For the sake
+of minimal change I created a subdirectory called "include" alongside any
+affected sources; for example, "bin/ld/ld.c" includes <a.out.h> so it gets a
+directory "bin/ld/include". In this directory there is a collection of links:
+ a.out.h -> ../../../include/a.out.h
+ ar.h -> ../../../include/ar.h
+ nlist.h -> ../../../include/nlist.h
+ ranlib.h -> ../../../include/ranlib.h
+ vmf.h -> ../../../include/vmf.h
+There is also a further directory "bin/ld/include/sys" containing this link:
+ exec.h -> ../../../../sys/h/exec.h
+This is a bit fragile since we cannot say whether the host might be trying to
+use its own <sys/exec.h> deep inside some other include file like <stdlib.h>,
+so it's not really the best solution to the problem. It would be better to have
+a define saying where to include the files from, and perhaps even alternative
+names like "struct nlist_211bsd" instead of "struct nlist", but this is rather
+bloated, and I got around the problem by doing the above hacks for the moment.
+
+To build the cross toolchain from a common source which can also build the
+PDP-11 hosted toolchain, my strategy was to change the Makefiles as little as
+possible. The most significant change is to modify "cc" to "${CC}" and so on.
+In many cases it was already like this, but I had to introduce aliases for all
+of the cross build tools, so "mkdep" becomes "${MKDEP}" and so on. The "make"
+tool included in 2.11BSD only provides "CC" and "AS" by default, so each
+Makefile is supposed to have lines like "MKDEP=/usr/bin/mkdep", so as not to
+break the PDP-11 hosted build system. BUT, I suspect most of these are missing,
+since I have not tested the PDP-11 hosted build system yet. I will fix it later.
+
+So the "make" commands to build the cross toolchain look something like this:
+ make CC="cc -Iinclude -Wall -Wno-char-subscripts -Wno-deprecated-declarations -Wno-format -Wno-maybe-uninitialized -Wno-parentheses -Wno-unused-result" CROSSDIR="/home/nick/src/211bsd.git/cross" STAGEDIR="/home/nick/src/211bsd.git/stage" SEPFLAG=
+ make DESTDIR="/home/nick/src/211bsd.git/stage" install
+The above example is taken from building "cross/bin/cc", which needs to know
+both CROSSDIR and STAGEDIR and so is a good example of how I handle directories.
+The compiled "bin/cc" expects to find things in various places which are hard
+coded into the source, so I changed the Makefile to pass the CROSSDIR and the
+STAGEDIR into the compilation using defines, giving a compilation command like:
+ cc -Iinclude -Wall -Wno-char-subscripts -Wno-deprecated-declarations -Wno-format -Wno-maybe-uninitialized -Wno-parentheses -Wno-unused-result -DCROSSDIR="\"/home/nick/src/211bsd.git/cross\"" -DSTAGEDIR="\"/home/nick/src/211bsd.git/stage\"" -c -o cc.o cc.c
+Inside the source file "bin/cc/cc.c", I've adjusted the hard coded paths like:
+ char *cpp = CROSSDIR "/lib/cpp";
+ char *ccom = CROSSDIR "/lib/c0";
+ char *ccom1 = CROSSDIR "/lib/c1";
+ char *c2 = CROSSDIR "/lib/c2";
+ char *as = CROSSDIR "/bin/as";
+ char *ld = CROSSDIR "/bin/ld";
+ char *crt0 = STAGEDIR "/lib/crt0.o";
+This means before we can link, we'll have to build the C library into STAGEDIR.
+
+Since the "make" commands are hard to remember and inconvenient to type, and
+the top-level "n.sh" command is overkill (it builds everything, and it cleans
+instead of just rebuilding what has changed), I have put a smaller script
+"n.sh" in each directory I've visited, which gives the correct commands and
+installs the result: a tool gets built as a cross tool and installed into
+CROSSDIR, and anything else gets built for the target and installed into
+STAGEDIR. So it's easy to add debugging statements, compile with -g, etc.
+
+Most of the development work so far has gone into building and debugging the
+cross toolchain. We can't yet build all of the target, but we can build this:
+ stage/lib/crt0.o
+ stage/lib/libc.a
+ stage/lib/mcrt0.o
+ stage/include (copied from the source tree using "make install")
+ stage/unix
+ stage/usr/lib/libc_p.a
+ stage/usr/lib/libkern.a
+I have done some ad-hoc tests, and these files work correctly when copied to an
+existing 2.11BSD system under "simh". For instance, I can compile and run the
+"adventure" game using the above C startup code and libraries. I can boot the
+kernel. I have verified that the kernel and all its object files are binary
+identical to what the PDP-11 hosted build produces. I do not yet have a way of
+making "ar" and "ranlib" produce a binary identical copy of a library, so I
+have not definitively verified these are the same, but I see no problem so far.