============
Introduction
============
In telecommunications applications, it is often desirable to have some
means of determining whether or not a stream of data has been transmitted
correctly.  One popular method for verifying transmissions is the
checksum - the sending system performs some mathematical computation on the
original message, producing a number, which is then transmitted along with
the message itself.  The receiving system then performs the same
computation on the received message, and compares its computed number with
the number that it received along with the message.  If the two numbers
don't match, then the message has been garbled somehow.  If the numbers do
match, the message is not guaranteed to be ungarbled, because it is
possible for a corrupted message to have the same checksum as an
uncorrupted message, but at least the likelihood of a transmission error
slipping though undetected is less when a checksum is used.

The trick, then, is to choose a checksum algorithm carefully, so that the
kinds of errors that usually occur in data transmissions (reversed bits,
dropped bits, added bits, etc.) will produce a significant change in the
checksum.  The cyclic redundancy checks (CRCs) are a popular class of
checksums which have this property.  Many different CRCs are possible, but
only a few of them seem to be in common use.  Two of the most common
CRCs are described below - one produces a 16-bit checksum, and the other
produces a 32-bit checksum.

A CRC is essentially binary long division -- the message to be checksummed
is divided by the CRC "polynomial," and the remainder is the CRC checksum.
The difference between the CRC "division" and ordinary division is that
when a CRC is computed, exclusive-OR's are used instead of subtractions.

Computing the long division for the entire data stream a bit at a time is
slow and inefficient.  Fortunately, it it not necessary to do it that way;
a considerable amount of time can be saved by pre-computing the CRCs of
the 256 possible 8-bit bytes. This allows the checksum to be computed a
byte at a time instead of a single bit at a time.

The price we pay for saving all that time is memory.  A block of memory must
be set aside somewhere for 256 pre-computed remainders.

==========
16-Bit CRC
==========
The following source code is a 6502 implementation of the traditional
16-bit CRC found in such applications as Xmodem, BinSCII, and Shrinkit.
Strictly speaking, I believe this implementation is "wrong" according to
the CRC standard (it uses the wrong bit order), but this "wrong" form seems
to have a long history of usage in a wide variety of software.

These routines, though original, were heavily inspired by a similar set of
routines by Andy Nicholas.  Source code for the Andy Nicholas routines can
be found in the File Type Note for file type $E0, aux type $8002 (Shrinkit
archives).

First, the initialization routine.  This should be called once before
calculating any CRCs.  It doesn't need to be called again unless the CRC
data tables get overwritten.

* INITCRC -- initialize CRC data table
* by Neil Parker
*
W       EQU 6               ;A scratchpad variable--can be anywhere in memory
*
INITCRC LDY #0              ;Y-reg contains byte to be CRC'd
L1      LDA #0              ;Init 2nd byte of dividend to 0
        STA W
        TYA                 ;Get CRC byte into 1st byte of dividend
        LDX #8              ;Loop over 8 bits
L2      ASL W               ;Shift dividend & catch hi bit in C
        ROL                 ;(some assemblers may require ROL A here)
        BCC L3              ;If 0, go do next bit...
        EOR #$10            ;...else EOR dividend with polynomial
        PHA
        LDA W
        EOR #$21
        STA W
        PLA
L3      DEX                 ;Done 8 bits yet?
        BNE L2              ;If not, loop...
        STA CRCTAB0,Y       ;...else save remainder in table
        LDA W
        STA CRCTAB1,Y
        INY                 ;Done 256 bytes yet?
        BNE L1              ;If not, loop...
        RTS                 ;...else we're done
*
CRCTAB0 DS  256             ;Data table for CRC remainders
CRCTAB1 DS  256

Once the CRC remainder table is set up, you're ready to calculate CRCs.  The
subroutine below performs the calculation.

* DOCRC -- accumulate a byte into CRC checksum
*
CRC0    EQU 7               ;1st byte of CRC checksum
CRC1    EQU 8               ;2nd byte of CRC checksum
*
DOCRC   EOR CRC0
        TAY
        LDA CRCTAB0,Y
        EOR CRC1
        STA CRC0
        LDA CRCTAB1,Y
        STA CRC1
        RTS

The exact steps for using these routines are as follows:
	Call INITCRC.
	Initialize the memory locations CRC0 and CRC1.  They should
        usually be initialized to 0, but not always - for example, one
        of the CRCs used in Shrinkit requires that they be initialized
        to $FF.
	For each byte of the message:
		Load the byte into the accumulator.
		Call DOCRC (which will scramble the A and Y registers).
	Get the checksum from CRC0 and CRC1.

Usually, CRC1 is transmitted first, followed by CRC0.

IIGS programmers will probably not be suprised to to learn that the
initialization routine is smaller and more efficient in native-mode 65816
code.

        longa   on
        longi   on
; Call with 16-bit M and X
InitCRC ldy     #$1FE		;Y-reg contains index into remainder table
L1      tya			;Convert index into byte to be CRC'd
        lsr     A		;Compensate for two-byte table entries
        xba			;CRC byte -> hi byte, 0 -> low byte
        ldx     #8		;Loop over 8 bits
L2      asl     A		;Catch hi bit in C flag
        bcc     L3		;If clear, go onto next bit...
        eor     #$1021		;...else EOR with CRC polynomial
L3      dex			;Done 8 bits yet?
        bne     L2		;If not, go do more bits...
        sta     CRCTab,Y	;...else store remainder
        dey
        dey			;Done 256 bytes yet?
        bpl     L1		;If not, go do more...
        rts			;...else return to caller
CRCTab  ds      512

The CRC calculation routine, however, does not benefit from 16-bit code.
This routine leaves the first byte of the checksum in CRC and the second
byte in CRC+1.

CRC     equ     0               ;(on direct page, for max efficiency)
        longa   on
        longi   on
; Call with any size M and X
DoCRC   php
        rep     #$30
        eor     <CRC+1
        and     #$FF
        asl     A
        tay
        sep     #$20
        longa   off
        lda     CRCTab+1,Y
        eor     <CRC
        sta     <CRC+1
        lda     CRCTab,Y
        sta     <CRC
        plp
        rts

These routines are used the same way the 8-bit versions are.  Do not,
however, mix the 8-bit and the 16-bit versions since the remainder
tables are incompatible.

==========
32-Bit CRC
==========
Computing a 32-bit CRC is similar.  The following routines compute the
CRC-32 used by Zmodem.  Unlike the 16-bit CRC, these routines are careful
about bit order, and should produce the same results as a hardware
implementation.

* INITC32 -- initialize CRC-32 data table
* by Neil Parker
*
W1      EQU 6		;Scratch variables--can go anywhere
W2      EQU 7
W3      EQU 8
*
INITC32 LDY #0		;Y-reg contains current byte to be CRC'd
L1      LDA #0		;Initialize remainder to 0
        STA W1
        STA W2
        STA W3
        TYA		;Get byte into A
        LDX #8		;Loop over 8 bits
L2      LSR W3		;Shift dividend and catch low bit in C flag
        ROR W2
        ROR W1
        ROR		;(Some assemblers may require ROR A here)
        BCC L3		;If 0, go on to next byte...
        EOR #$20	;...else EOR with CRC polynomial
        PHA
        LDA W1
        EOR #$83
        STA W1
        LDA W2
        EOR #$B8
        STA W2
        LDA W3
        EOR #$ED
        STA W3
        PLA
L3      DEX		;Done all 8 bits yet?
        BNE L2		;If not, loop...
        STA CRCTAB0,Y	;...else save remainder in table
        LDA W1
        STA CRCTAB1,Y
        LDA W2
        STA CRCTAB2,Y
        LDA W3
        STA CRCTAB3,Y
        INY		;Done all 256 bytes yet?
        BNE L1		;If not, loop...
        RTS		;...else we're done--return to caller
*
CRCTAB0 DS  256
CRCTAB1 DS  256
CRCTAB2 DS  256
CRCTAB3 DS  256

* DOCRC32 -- accumulate a byte into CRC-32 checksum
* by Neil Parker
*
CRC0    EQU 6
CRC1    EQU 7
CRC2    EQU 8
CRC3    EQU 9
*
DOCRC32 EOR CRC0
        TAY
        LDA CRCTAB0,Y
        EOR CRC1
        STA CRC0
        LDA CRCTAB1,Y
        EOR CRC2
        STA CRC1
        LDA CRCTAB2,Y
        EOR CRC3
        STA CRC2
        LDA CRCTAB3,Y
        STA CRC3
        RTS

The steps for using these routines are as follows:
	Call INITC32.
        Initialize CRC0 through CRC3 to $FF.
        For each byte of the message:
		Load the byte into the accumulator.
		Call DOCRC32 (which scrambles A and Y).
	EOR each of the bytes CRC0 through CRC3 with the constant $FF.
	Get the checksum from CRC0 through CRC3.

The checksum bytes are transmitted in the order CRC0, CRC1, CRC2, CRC3.

============
Bibliography
============
Andy Nicholas and Matt Deatherage
_Apple_II_File_Type_Note_ for file type $E0, auxiliary type $8002 (NuFile
Exchange Archival Library)