LLX > Neil Parker > Apple II > Decimal Multiplication & Division

Multiplying and Dividing in Decimal Mode on the 6502

Introduction

Many introductions to 6502 programming describe how to do BCD (binary-coded decimal) addition and subtraction by turning on the decimal flag, and may mention the original 6502's decimal mode bug (the N, V, and Z flags aren't reliable after a decimal operation). But that's usually all they have to say on the subject...multiplication and division in decimal mode are never described, even if binary multiplication and division are.

After a bit of experimenting, I now understand the reason for this oversight: Decimal operations other than simple addition and subtraction are a pain in the butt on the 6502, requiring substantially more code than their binary equivalents. The routines end up being just too long for the average introductory chapter on machine language arithmetic.

But adventurous souls that we are, we'll forge ahead undaunted by such concerns. Decimal multiplication and division on the 6502 are not as difficult as some might think, despite the previous paragraph's pessimism.

Our Toolkit

We'll need a couple of tools to get the job done. First, we'll need to be able to multiply a decimal number by 2. We can't use a simple ASL or ROL for this, since the 6502's shift and rotate instructions always operate in binary mode regardless of the D flag, but we can do it easily by adding the number to itself:

        SED
        LDA NUMBER
        ADC NUMBER
        STA NUMBER
        CLD

Dividing a decimal number by 2 is a little harder, but we can do it with a shift and some post-processing on the result: If any decimal digit of the result is 8 or greater, subtract 3 from the digit. The BIT N08 line could be considered a trick--we need to test the $08 bit of the accumulator, and luckily we have a PHP instruction nearby, whose opcode happens to be $08. (On the 65C02 and 65816, one can say BIT #8 instead, eliminating the need for this trick, but for maximum portability this article uses only instructions that work on all 6502s.)

        ROR
N08     PHP
        BPL HALF2
        SEC
        SBC #$30
HALF2   BIT N08
        BEQ HALF3
        SEC
        SBC #3
HALF3   PLP

Note that this is done with the D flag off.

These two tools, together with decimal addition and subtraction, can be used to build routines for multiplying or dividing by any decimal number. Already, though, we can begin to see why the decimal routines are so long...both of the above tools do jobs that can be done with a single shift instruction in binary mode.

Multiplication

A reasonably obvious algorithm for decimal multiplication would be to implement the usual "grade school" method for multiplying with pencil and paper. On the plus side, this is reasonably straightforward for numbers of almost any size, but on the minus side, manipulating the individual digits of a BCD number is somewhat inconvenient, and we still need a subroutine to do decimal multiplication of single digits.

So instead, the routine below implements the so-called "Russian Peasant" algorithm, which goes like this:

  1. Write the two numbers to be multiplied side by side.
  2. Divide the left-hand number by two, dropping fractions, and write the answer underneath it. Then divide the new number by two again (dropping fractions), and write the result underneath it again. Repeat until you get down to 1.
  3. Double the right-hand number, and write the result underneath it, next to the first halved number in the left-hand column. Then double this new number again, and write the result under the previous number, and repeat until there are as many doubled numbers in the right hand column as there are halved numbers in the left-hand column.
  4. Find the numbers in the right-hand (doubling) column that are across from odd numbers in the left-hand (halving) column. Add up these numbers. The total is the product of the original two numbers.

For example, suppose we want to multiply 35 by 43:

          35     43 *
          17     86 *
           8    172
           4    344
           2    688
           1   1376 *

The starred numbers on the right are the ones opposite odd numbers on the left, so we add 43+86+1376, and get 1505, which is the correct answer.

This actually turns out to be identical to the grade school algorithm in base 2, but by doing it in terms of doublings and halvings instead of digits, we have a method that works without change in any base, and no auxiliary subroutine is needed to multiply single decimal digits.

By now it should be fairly obvious where we're going with this, so without further ado, let's talk about some actual code.

The routine below multiplies two one-byte (two-digit) decimal decimal numbers in memory locations ARG1 and ARG2, storing the result in memory locations RESULTL (low byte) and RESULTH (high byte). An additional memory location TEMP is also used temporarily as the high byte of ARG2. For maximum efficiency all these labels should be declared on page zero, but this is not required.

First we zero out the result and temporary space, and start the loop. We need only seven passes instead of the usual eight, because the largest two-digit decimal number is 99, which can be halved down to 0 in only seven halvings.

        LDA #0
        STA RESULTL
        STA RESULTH
        STA TEMP
        LDX #7

We see whether the first argument was odd by halving it. If a 1 bit falls off the end into the carry flag, the number was odd. This is actually a truncated version of our halving routine from above--we only need to check whether the low digit needs adjustment, because the high digit can never be greater than 4 after the LSR.

LOOP    LDA ARG1
        LSR
N08     PHP
        BIT N08
        BEQ PULLP
        SEC
        SBC #3
PULLP   PLP
        STA ARG1
        SED
        BCC DOUBLE

If the number was odd, we add the (shifted) second argument to the result. This block of code is skipped if the number was even.

        CLC
        LDA ARG2
        ADC RESULTL
        STA RESULTL
        LDA TEMP
        ADC RESULTH
        STA RESULTH

Now we double the second argument. Remember that decimal mode is on here, regardless of which path we took getting here.

DOUBLE  CLC
        LDA ARG2
        ADC ARG2
        STA ARG2
        LDA TEMP
        ADC TEMP
        STA TEMP
        CLD

We're at the end of the loop. Go back if there's more, else we're done.

        DEX
        BNE LOOP
        RTS

Division

Again, we avoid the usual grade school method for long division, and rely on a Russian-Peasant-style algorithm. Fortunately the Russian Peasant multiplication algorithm is easily reversible:

  1. Start by setting the quotient to 0, and the remainder to the dividend.
  2. If the divisor is 0, exit with an error. If the divisor is greater than the dividend, exit with no error.
  3. Repeatedly double the divisor until it's greater than the dividend. Let X be the number of doublings required.
  4. Repeat X times:
    1. Halve the divisor.
    2. Double the quotient.
    3. If the divisor is less than or equal to the remainder, subtract the divisor from the remainder, and add 1 to the quotient.

Again, if you do this in base 2, it's the same as the grade school method. If you look very carefully, you can see that it works by going backwards up the doubling column in the Russian Peasant algorithm, figuring out which entries in the halving column need to be odd by figuring out which doublings were added into the dividend.

The routine below will divide the two-byte (four-digit) number in NUML, NUMH by DENOML, DENOMH, leaving the quotient in QUOTL, QUOTH and the remainder in REML, REMH. All these labels should be declared in page zero for efficiency, but this not necessary.

First we initialize the quotient and remainder:

        LDA #0
        STA QUOTL
        STA QUOTH
        LDA NUML
        STA REML
        LDA NUMH
        STA REMH

Next we test for division by 0, and for the denominator greater than the numerator:

        LDA DENOML
        ORA DENOMH
        BNE TESTND
RET     RTS
TESTND  LDA NUML
        CMP DENOML
        LDA NUMH
        SBC DENOMH
        BCC RET

Now count how many doublings are required to make the denominator greater than the numerator. If the denominator overflows, assume it's greater than the numerator:

        LDX #0
COUNTD  INX
        SED
        CLC
        LDA DENOML
        ADC DENOML
        STA DENOML
        LDA DEMONH
        ADC DENOMH
        STA DENOMH
        CLD
        BCS LOOP
        LDA NUML
        CMP DENOML
        LDA NUMH
        SBC DENOMH
        BCS COUNTD

Halve the denominator. Watch out--the carry flag at the beginning of this code is important!

LOOP    LDA DENOMH
        JSR HALF
        STA DENOMH
        LDA DENOML
        JSR HALF
        STA DENOML

Double the quotient. Note that we can skip clearing the carry before the adds because DENOML, DENOMH is guaranteed to be an even number before the above halving step.

        SED
        LDA QUOTL
        ADC QUOTL
        STA QUOTL
        LDA QUOTH
        ADC QUOTH
        STA QUOTH

Try subtracting the denominator from the remainder:

        SEC
        LDA REML
        SBC DENOML
        TAY
        LDA REMH
        SBC DENOMH
        CLD
        BCC NEXTX

If the subtraction succeeded, replace the remainder with the result:

        STA REMH
        STY REML

...and add 1 to the quotient (since the low bit of the quotient is guaranteed to be 0 at this point, we don't need to go into decimal mode, or worry about a carry into the high byte):

        INC QUOTL

Count down the number of doublings, and go back if there were more, otherwise we're done. Note that the extra CLC instruction is needed to ensure that the carry flag is right at the label LOOP.

NEXTX   CLC
        DEX
        BNE LOOP
        RTS

Finally, we need our halving tool as a subroutine:

HALF    ROR
N08     PHP
        BPL HALF2
        SEC
        SBC #$30
HALF2   BIT N08
        BEQ PULLP
        SEC
        SBC #3
PULLP   PLP
        RTS

Conclusion

The above routines are certainly not the only way to do decimal multiplication and division. Multiplication can be implememted in a much smaller and easier-to-understand routine than the above simply by adding the multiplicand to itself repeatedly, as many times as the multiplier indicates. Likewise, a much simpler division routine would be to count how many times the denominator can be subtracted from the numerator. But multiplying that way is extremely inefficient if the multiplier is large, and dividing like that is slow if the denominator is much smaller than the numerator, and the problem becomes much worse if you need to work with numbers larger than one or two bytes. With the routines presented above, at least the worst case isn't much worse than the best case, and the methods scale much more nicely to wider operands.

Of course neither of the above routines will win any awards for speed. For maximum speed the best approach is probably to avoid algorithmic approaches like the above and rely on table lookups. For example, a very efficient multiplication routine can be written using the identity 4*A*B = (A+B)^2 - (A-B)^2, for which you only need to sacrifice about 1K of memory for a table of squares, and the code for it in decimal mode isn't much more complicated than the code in binary mode. But this article is already too long, so I leave implementing it as an exercise for the reader.

LLX > Neil Parker > Apple II > Decimal Multiplication & Division

Original: August 31, 2005
Modified: June 26, 2016--fixed a couple of typos