Overview of How the `dc.sed` Script Works

Greg Ubben, 5 March 2001

[dc.sed is an arbitrary precision and radix RPN calculator written in the sed stream-editor "language". It was posted to the "seders" mailing list in February 1997. I assume knowledge of the Unix dc command and regular expressions here.]

As you probably know, sed only has one long text buffer (also called the pattern space), plus one alternate buffer (the hold space). This buffer is usually limited to several thousand characters on traditional Unix versions of sed. Between operations, the numbers are stored in this buffer in normal decimal format. The buffer is partitioned into various parts using | and ~ as delimiters so that any amount of numbers can be kept on the stack and also in various dc register stacks. The actually format of the buffer is:

    <stack>|P|K0|I10|O10|ra<stack>|rx<stack>|rZ<stack>|?<input-stack>

where <stack> is a sequence of 0 or more numbers or strings, each followed by a tilde (~). (The stack items cannot contain | or ~.) The first stack is the working stack, |P is any partial output line (which we need to buffer since sed can't print a partial line), |K is the current scale, |I the current input base, |O the current output base, |rx is register x, and |? is the input stack. For example, if pushing 365 and -3.14 on the stack and storing 66411 in register b, you would have these before and after buffer contents:

    |P|K0|I10|O10|? 365 _3.14 66411sb
    -3.14~365~|P|K0|I10|O10|rb66411~|?

To summarize how the calculations are done, I guess you could say the core routines temporarily convert the working digits to "analog" format (e.g., 6=aaaaaa) where regular expressions can work with them easier. And the other routines are implemented in high-level dc code that is "boot-strapped" from sed. sed's hold space is used to store the rest of the buffer during low-level operations so only the part you're currently working with is present.

There are two core routines: addition/subtraction and multiplication. Almost everything else is built on top of these. You could say the add/sub uses a "number line" to do the math. After much frustration (one of the reasons I stopped on it for over a year (the other reason being once I saw the whole thing could be done, I lost interest in it a bit)), I was able to come up with an algorithm that had enough similarity in the addition and subtraction that I could use the same code for both, but just starting with a different "lookup table". I needed this because I wanted to stay within the 199-command limit that traditional Unix sed implementations have. (I kept track of how close I was to this command limit all during development, and ended up with none to spare, though I've since identified several places I could crunch out room for a few more if I had to.)

As I recall, the add/sub first starts out by normalizing the two numbers a bit, reducing the number of cases. E.g., -5 - -8 and 8 + -5 are both the same as 8 - 5. And it handles dc's comparison (<,=,>) functions at this point, since a comparison needs to be done here anyway to determine which order to subtract the numbers in (since my combined add/sub algorithm requires the smaller absolute value to be subtracted from the larger one). To compare, it starts at the decimal point (or right end if no decimal) of each number and works to the left until it reaches the left end of one of the numbers. If there's any left over (one number is longer), that's the larger one.

Otherwise it uses an interesting technique (originally used in my sort.sed) to find the first digit that differs in the two numbers and compare which is greater. The technique is basically to append a "lookup table" to the buffer. For example, if "38;0123456789" can be matched by the (extended) regular expression /^(.)(.).*\1.*\2/, that means the first two digits (3 and 8) are in ascending order, as it found them in the lookup table in that order. Except this regular expression is combined with the part that skips the leading digits the two numbers have in common (which comes out pretty slick using the powerful back- referencing and backtracking features of regexes).

This same lookup-table technique is used in quite a few places in the script. It really cuts down on the number of sed commands since you just need two s/// commands -- one to append the lookup table, and a complex one to do the lookup and replace (removing the table at the same time). For example, if the sed buffer contained a single digit (say 7) and I wanted to convert it to analog, I'd first append a lookup table to it like this: s/$/9876543210aaaaaaaaa/ Then the regular expression just needs to find the digit in the lookup table (using a back-reference), skip 9 characters further, and the rest of the a's after that will be the result. Here is the substitute command to do this (in extended regular expression syntax to avoid all the ugly backslashes): s/(.).*\1.{9}// This leaves just aaaaaaa in the buffer if the first digit was a 7. The conversion back works pretty much the same way:

                # start with:    aaaaaaa
    s/$/9876543210/     # append table:  aaaaaaa9876543210
    s/.{9}(.).*/\1/     # final result:  7

Anyway, getting back to the add/sub, it then starts back at the decimal point and iterates to the right through the factions of the two numbers until it finds the number with the shorter fraction. This is because you have to start adding/subtracting on the right side at the same decimal position. (An explicit iteration loop is needed because a regular expression can't say "take the length of this string of characters here and find the same number of characters at a point later on in the string" (unless they are all the same character).) The result gets initialized with any extra fraction to the right. For example, in 3.14159 + 36.75, the result gets initialized with the 159. Finally, the main add/sub loop starts working, adding or subtracting one digit at a time leftwards (depending on which lookup table was appended earlier).

It uses what you could call a "dynamic double" lookup trick to add/sub the two digits. Basically, if you're adding the digits 5 and 8, it looks both up (at the same time) in the double lookup table "9876543210;9876543210" and gets the remaining digits for each following that position. So you'd get 43210 and 76543210 for 5 and 8. Then these are concatenated with another fixed table to get 43210765432109876543210. If you then start at the left side of that and count 10 characters in, you get the one's digit sum (3). And if there are more than 10 digits remaining after that, it means you have a carry. A carry is just used to make the above table one character longer on the next iteration (since a longer left side results in a higher digit being looked up).

Fortunately I was able to find an algorithm and format for the tables that allowed the same code to work the same way for subtraction -- but just using "9876543210;0123456789" for the first table instead. It was one of several eureka moments, as I found the borrow was able to work the exact same way as the carry did for addition. The add/sub was probably the hardest, most frustrating part of the whole script since there were so many ways to go. And it's also the ugliest part of the code (if you could call any of it pretty :).

The multiplication routine starts with a loop that creates a product template (e.g., position of decimal point) based on the current scale and the scale (number of fraction digits) in each factor. I found a technique in Knuth for multiplying from right to left getting a product digit at a time, instead of keeping all the intermediate products as you do on paper. It's just multiplying the individual digits in a different order allowing you to progress a column at a time leftwards instead of a row at a time -- nothing earth-shattering. Then to multiply two digits, say 3*5, you just generate an analog string of length 3 (the first factor) using the lookup-table trick and repeat it 10 times to get this intermediate lookup table:

    9aaa8aaa7aaa6aaa5aaa4aaa3aaa2aaa1aaa0

Then you find the other factor (5) in this table and take all the a's past that point. There will be 15 a's left (3*5). These digit products are kept in a reduced analog format where, for example, the number 427 would be represented as ccccbbaaaaaaa. This allows each digit product to be efficiently added to the next. You just concatenate the a's from the next, if there are more than 10 you turn the first 10 into a b, and if there are more than 10 b's you turn the first 10 b's into a c. When done adding all the digit products for the current column, you turn the a's (one's place) back into a digit using a lookup table and put it in the product. And the remaining b's and c's (tens and hundreds) are carried over to be included in the next column as a's and b's (ones and tens). When there are no a's or b's remaining carried over, your product is complete.

For division, Knuth was a real life-saver. He covered something called Newton's 2nd-order reciprocal method where you can calculate the reciprocal of any number (n) by taking an initial guess (g) and then repeatedly doing g = 2g - ngg on it until it stabilizes. (And the multiplication and subtraction routines needed for this are already available.) Then you just multiply by this reciprocal when done. The sed loop in the divide routine is just used to get an initial guess that's in the right ball park. Basically, if you have an n>1 that has 30 digits to the left of the decimal point, the initial guess is .000...0001 with 30 zeros to the right of the decimal point. And vice versa. At this point control is passed to dc commands pushed on the input stack to compute the quotient using Newton's method.

Likewise, I used Newton's square root algorithm too. I first use sed to generate an initial guess (g) by simply deleting half the digits in the number n (or half the leading zeros if n < 1). Then boot-strap to dc code which repeatedly does g = (g + n/g) / 2 until g stops decreasing. Amazingly, the very first test of this routine (8k2v) was right on the mark even though the square root and divide routines were still buggy. A testament to the reliable convergence of Newton's algorithm I think. Ken Pizzini (GNU sed POC) later found a precision glitch in certain cases when the scale is 0 (e.g., 16v, 224v) that I corrected in the Perl version and in the 1.1 version.

I could have saved a few sed commands by implementing the exponential function totally in dc using a binary algorithm. But I already had a version coded in sed that calculates the power using an interesting decimal lookup-table method (where what you're looking up is actually the dc code needed to raise a number to a given power from 0 to 9), and didn't want to throw all that work out. In this algorithm, x^736 is computed as ((x^7) ^ 10 * x^3) ^ 10 * x^6, which averages 7.1 multiplies per digit. One of the hardest parts of this function was figuring out the algorithm to calculate the precision of the result correctly (something GNU dc didn't get right (at the time)). That's what a lot of the dc code here is doing. As in most places, not being a mathematician, I figured this out empirically by just writing a bunch of examples on paper and seeing what precision was needed in the various cases.

When I started out on the dc.sed adventure, I never planned to include the radix conversions, thinking that part was just too complex and I'd never be able to fit it in the 199-command sed limit even if it were possible. But completing the above fundamental routines freed up my mind to rethink this. I found that the same techniques of boot- strapping most of the work up to higher-level dc code could be used effectively here too. The routine to convert non-decimal input to decimal required some initial lookup-table sed code to generate the dc commands used to do the actual conversion. But it turned out the opposite conversion to a non-decimal output base could be done *entirely* in dc -- albeit a 6-line-long dc incantation. (Commented dc source for this is available here.) The first test of this code was 100o _1234567.1234567p (at 1:43am Feb 2, 1997 :), and surprisingly worked flawlessly.

A few words on the other routines, for completeness... The string input routine is mainly matching [arbitrarily-nested [brackets]] to find the end of the string, which may involve reading multiple lines of input. I added array routines implemented in boot-strapped dc, but later thought they might actually be better implemented in sed instead. One of my favorite sections is the :count routine which is used for the three dc functions that need to count things. I optimized quite a bit of code out of here and am pretty sure it can't be reduced even one character further. It uses the same basic analog counting method as is used in the middle of the multiply routine. The register routines were one of the first things added to dc.sed, and are really what got me interested when I envisioned how they could be implemented in sed.

The :break routine (Q command) involves a decrement counter to remove n items from the input stack, using yet another lookup table algorithm. (The :trunc and multiplication precision routines also involve loops controlled by decrement counters that use the same algorithm. :trunc, incidentally, is a special dc function that is kind of the opposite of the boot-strapping-dc technique -- I used it in several places to "call back" to sed to do a function that couldn't be done in dc.)

dc.sed was all designed and coded on scratch paper. Although to a computer scientist it may seem weird doing math using only string manipulations, I think it's best compared to the problem of designing the mechanical calculators (some of which were sophisticated enough to include square root functions). Though I didn't have to worry about inertia. And besides the enjoyment of the hack challenge, I can think of one reusable part: the boot-strapping-dc technique would actually be practical to simplify a C implementation of dc.

Overview of How the dc.sed Script Works

Greg Ubben, 5 March 2001

Overview of How the `dc.sed` Script Works