How to make a personal operating system ?

January 4, 2018

Comment from E-mail

I have emailed you before, and i got so excited that a major player in the computer world awnsered me. So here goes my question. My friend and I want to make our own computer software, We bought an old computer and scraped it, and made a new computer, but we want to make our own operating system, You said you made basic for the Apple 1 and II. What did you us to make it? what type of software, if any would we need to make it with, and is there any hard ware we might need?

Woz

I don’t know if these comments will apply, but here they are.

In high school I fell in love with minicomputers, which were basically small computers equivalent to the microprocessors once they came out. Well, I could never afford a minicomputer but I looked at the programming instructions for machine language and tried to write my own short routines. In college I started trying to figure out how compilers, like Fortran, were written. I knew that the compiler program had to read a line at a time and figure it out and convert it to code that the computer could run. So I started writing a routine in assembly language (machine language) that would analyze a line for correctness. But I never could afford a computer to try it on, nor an assembler to type it into. It was just a personal program that nobody else knew about.

After I’d designed a computer, before Apple was even conceived, I decided to write a BASIC for real. I’d never studied how to do this, but I had self trained myself a bit back in college as I described above. I’d never used BASIC but I knew that this was the popular language for games and that was too important to ignore. The first thing I did was get a BASIC manual at Hewlett Packard, where I was working. I read it and made notes and pretty much learned what commands it had. Of course this was Hewlett Packard BASIC. It differed from the Digital Equipment BASIC, that Bill Gates’ first BASIC was based on, mostly in some string manipulation. This later turned out to be the greatest difference between my BASIC and theirs. I was tired of MID$, LEFT$, RIGHT$ type functions so I preferred the HP BASIC better (A$(5,7) meant the 5th through 7th characters of A$).

I’d never formally educated myself in the area of compilers and interpreters (compilers translate a program to machine code to run rapidly later, interpreters scan the program and figure it out as it’s being run, which results in much slower execution–BASIC is an interpreted language). But I knew how Syntax charts defined the structure and words of a programming language, as you find these in programming manuals. I decided to write down a full Syntax description of my BASIC to begin. I’d never done such a thing, but it wasn’t hard and was modeled after others that I could find.

I next decided that I’d actually put this syntax list into memory as part of my BASIC interpreter. It was stored character by character. I figured that I’d just scan the input line, after the user hit Return, character by character, tracing a path through the syntax table and backing and retrying things. If the line made it through the Syntax table then it was good, otherwise it was in error.

The unexplainable part is how I came up with the way my BASIC would actually do what it was supposed to. As BASIC elements were found in the Syntax Table, I generated tokens (codes) for these elements. For example, a left parenthesis might generate token #87. But in another usage, a left parenthesis might generate token #115. It depended on where it was encountered in my Syntax table in memory, the one I was traversing character by character and matching the input line. In an inefficient effort to make my BASIC very tiny and save every possible byte (even the minimal amount of memory for a computer language was very expensive in 1975) I actually counted how many BASIC symbols the ‘matched’ one was from the start of the syntax table, and used that count as it’s token value.

After this step, I generated a line that didn’t have to go through the Syntax evaluator again. The Syntax evaluator could be very tiny and run slowly, as it only ran once per line, which took only a fraction of a second for a typical line. When the program ran, it was already half in shape for speed.

Now comes a less explainable part. I had read and heard some things about compilers but I still don’t know to this day if what I did was good or bad. As a line executed in a running program, it consisted of numbers (precompiled into constants I think) and variable names and grammar elements like a plus sign token or a left parenthesis token. When, during execution, the BASIC encountered a ‘noun’ (number or variable) it was pushed onto a noun stack, ready for retrieval. This was like our HP calculators where I worked.

When the BASIC encountered a ‘verb’ (a token that called for an operation) it would be evaluated in comparison to a verb stack. This was the way of reading a human-written expression from left to right, but doing the operations in a different order (2+3*4 does the multiplication first in most computer languages, even though the plus sign appears first). For each token I assigned 2 priorities. One was the priority to push preceding tokens off the stack for execution. For example, 3 + 7 * 5 would push 3 on the noun stack, + on the verb stack, then 7 on the noun stack (where it’s ready to be the first element removed from this first-in last-out stack). When the * is encountered, it had a higher execution priority than + so it didn’t pull the 7 and 3 off and add them yet. Instead it pushed the * onto the verb stack and then the 5 onto the noun stack. The end of line was a token with priority to push everything off.

So at this time the * is the ‘topmost’ token on the verb stack. It comes off and runs a prewritten multiply routine that pulls two items off the noun stack, adds them, and pushes the result back on that noun stack.

Any token that causes others to be executed immediately off the verb stack would keep looking at token priorities until it’s own priority was such that it would merely be pushed onto the verb stack and await later execution.

Parentheses bring another factor into play. A left parenthesis is always pushed onto the top of a verb stack, hiding the execution priority of the preceding operator token until a right parenthesis, with extremely high execution priority, causes all tokens to be executed until the left parenthesis is encountered. At that time the right parenthesis has found it’s mate and stops forcing ops (tokens) to execute. This is a sort of exception to the concept of a single priority. In addition, the left parenthesis forces no ops off the verb stack, acting as though it has extremely high priority. But no ops force it off, until the right parenthesis, as though it had an extremely low execution priority. So I actually had two priorities for each token, a ‘push’ tendency and a ‘pull’ tendency. A verb (token) would only push other verbs off the verb stack and execute them if it’s push priority was greater than their pull priority.

I have no idea where these sorts of ideas came from. They just came to me as I needed an elegant solution.

A table held bites with 2 one of zixteen priorities for each token that might be in the interpreted BASIC program. Another table held an address pointer for each token, that pointed to the routine to run when that token was forced to execute. So for each of the dozens of tokens, I had to only write a short routine. This kept the program less complicated and easier to add new commands to.

I couldn’t afford an assembler. I wrote the entire program on paper, assigning memory addresses for each program instruction. When I shortened a routine, it was too much trouble to re-write (by hand) a few K-Bytes of code just to shrink the space. So there were many cases of short empty spaces in my BASIC. When a routine needed to expand, I’d usually jump to a patch area where it’s latter part was. None of this would have happened if I could have afforded an assembler, which would have packed things properly.