SDCC vs z88dk: Comparing size and speed of the binaries generated for Amstrad CPC

In this tutorial we will compare the two most popular C compilers for the Amstrad CPC (cross-compiling from a PC, of course). We will compare both the size of the generated binary and its execution speed across several different tests. At the time of writing, the official versions of the two compilers are z88dk v1.9 and SDCC 3.1.0. I tried several recent z88dk builds (nightly builds/snapshots), but I must say they were very unstable (at least for the Amstrad CPC), crashing or generating unexpected results, so in the end I decided to make the comparison with the official releases.

To measure execution time we could use the firmware command KL TIME PLEASE (BD0D), but as we saw in the tutorial Measuring times and optimizing 2D Starfield (C with SDCC), it does not work with z88dk because interrupts are always disabled when our program starts. Instead, we will measure execution speed by recording video of an emulator and then using a video editor to measure the time elapsed, for example, between two texts displayed on screen. Note: for sizes, we will compare the files without the Amsdos header.

For the first test, we create two unsigned 16-bit integer variables and run a for loop of 65535 iterations, displaying a text before and after the loop. The source code is exactly the same for SDCC and z88dk:

////////////////////////////////////////////////////////////////////////
// Test01sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
#include <stdio.h>

main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;
  
  printf("Start\n\r");

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  printf("End %u\n\r", nLoops);
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////
// Test01z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
#include <stdio.h>

main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;
  
  printf("Start\n\r");

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  printf("End %u\n\r", nLoops);
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile the program with both compilers, as we have seen in previous tutorials:

SDCC (Test01sdcc.bat):
sdcc -mz80 --code-loc 0x0138 --data-loc 0 --no-std-crt0 crt0.rel putchar.rel Test01sdcc.c
hex2bin Test01sdcc.ihx
cpcdiskxp -File Test01sdcc.bin -AddAmsdosHeader 100 -AddToNewDsk Test01sdcc.dsk

z88dk (Test01z88dk.bat):
zcc +cpc -m -notemp -o Test01z88dk.bin Test01z88dk.c -create-app -lndos -Ca-v -O2
cpcdiskxp -File Test01z88dk.cpc -AddToNewDsk Test01z88dk.dsk

Both binaries produce the same output on the screen: the text "Start", followed by "End 65535" once the loop finishes.

We record video with the emulator and compare the two executions:

        SDCC          z88dk
Size    3616 bytes    1449 bytes
Speed   580 ms        4120 ms

The SDCC binary is more than twice the size of the z88dk one but runs 7 times faster; put another way, the z88dk binary is less than half the size of the SDCC one but runs 7 times slower. Seeing such different results in such a simple program, we can only say: What the hell is this! :-)

After analyzing the assembler and the memory map generated by each compiler, we conclude that the large size difference in this case comes from the 'printf' of the C library: it appears that z88dk's printf is much more optimized (at least in size) than SDCC's. To compare only the code generated from our own source, we will remove the 'printf' calls and replace them with a simple call to the firmware command TXT_OUTPUT (BB5A):

////////////////////////////////////////////////////////////////////////
// Test02sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;
  
  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////
// Test02z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;
  
  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results:

        SDCC        z88dk
Size    83 bytes    153 bytes
Speed   580 ms      4120 ms

As can be seen, for this simple C program of just 20 lines, z88dk has generated a binary almost twice the size of the one generated by SDCC. In terms of speed, the z88dk binary takes 7 times longer than the SDCC one. In summary, in this particular example SDCC is 'dramatically' better than z88dk.

Looking at the assembler source code generated by both compilers, we clearly see the differences. We will analyze only the block generated by the loop:

for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

SDCC:

  ld  de,#0x0000
  ld  bc,#0xFFFF
00106$:
  inc de
  dec bc
  ld  a,b
  or  a,c
  jr  NZ,00106$

z88dk:

  ld  hl,0  ;const
  pop de
  pop bc
  push  hl
  push  de
  jp  i_5
.i_3
  ld  hl,2  ;const
  add hl,sp
  push  hl
  call  l_gint  ;
  inc hl
  pop de
  call  l_pint
  dec hl
.i_5
  ld  hl,2  ;const
  add hl,sp
  call  l_gint  ;
  push  hl
  ld  hl,65535  ;const
  pop de
  call  l_ult
  jp  nc,i_4
  ld  hl,0  ;const
  add hl,sp
  push  hl
  call  l_gint  ;
  inc hl
  pop de
  call  l_pint
  dec hl
  jp  i_3
.i_4

There is a clear difference in size. The code generated by SDCC is straightforward: it uses a pair of 16-bit registers (de and bc) and works directly on them (inc, dec), with a comparison and a relative jump at the end to complete the iterations. The code generated by z88dk is a bit of a mess and hard to follow. Its biggest problem is that it continuously uses the stack (push, pop) and, for no obvious reason, calls external functions (l_gint, l_pint, l_ult), hence the enormous performance difference.

We will now modify the program to use only 8-bit integers to see whether the results change. The programs are as follows:

////////////////////////////////////////////////////////////////////////
// Test03sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned char nCounter1 = 0;
  unsigned char nCounter2 = 0;
  unsigned char nLoops = 0;
  
  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  for(nCounter1 = 0; nCounter1 <= 254; nCounter1++)
    for(nCounter2 = 0; nCounter2 <= 254; nCounter2++)
      nLoops++;

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////
// Test03z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned char nCounter1 = 0;
  unsigned char nCounter2 = 0;
  unsigned char nLoops = 0;
  
  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  for(nCounter1 = 0; nCounter1 <= 254; nCounter1++)
    for(nCounter2 = 0; nCounter2 <= 254; nCounter2++)
      nLoops++;

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results:

        SDCC        z88dk
Size    91 bytes    228 bytes
Speed   440 ms      2960 ms

The results do not vary much compared to the previous test: the binary generated by SDCC is less than half the size of the one generated by z88dk, and it runs in almost one seventh of the time. SDCC wins this test again, and by far.

We will now modify the program slightly to use a while loop and a 32-bit long variable. The programs are as follows:

////////////////////////////////////////////////////////////////////////
// Test04sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned long nCounter = 0;
  
  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  while(nCounter < 131070L)
    nCounter++;

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////
// Test04z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned long nCounter = 0;
  
  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  while(nCounter < 131070L)
    nCounter++;

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results:

        SDCC         z88dk
Size    105 bytes    251 bytes
Speed   2520 ms      At 27000 ms the CPC resets or the screen is corrupted

It seems that z88dk must have 'broken' support for long (32-bit) variables and is not able to build this program correctly. Another negative point for z88dk.

Let's now try a 'famous' algorithm, the quicksort sorting algorithm (source), and use it to sort 960 16-bit integers. The source code is as follows:

////////////////////////////////////////////////////////////////////////
// Test05sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////

void quicksort(int* data, int N)
{
  int i, j;
  int v, t;
 
  if( N <= 1 )
    return;
 
  // Partition elements
  v = data[0];
  i = 0;
  j = N;
  for(;;)
  {
    while(data[++i] < v && i < N) { }
    while(data[--j] > v) { }
    if( i >= j )
      break;
    t = data[i];
    data[i] = data[j];
    data[j] = t;
  }
  t = data[i-1];
  data[i-1] = data[0];
  data[0] = t;
  quicksort(data, i-1);
  quicksort(data+i, N-i);
}

const int aNumbers[960] = {
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    };

main()
{
  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  quicksort(aNumbers, 960);

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  while(1) {};
}
////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////
// Test05z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////

void quicksort(int* data, int N)
{
  int i, j;
  int v, t;
 
  if( N <= 1 )
    return;
 
  // Partition elements
  v = data[0];
  i = 0;
  j = N;
  for(;;)
  {
    while(data[++i] < v && i < N) { }
    while(data[--j] > v) { }
    if( i >= j )
      break;
    t = data[i];
    data[i] = data[j];
    data[j] = t;
  }
  t = data[i-1];
  data[i-1] = data[0];
  data[0] = t;
  quicksort(data, i-1);
  quicksort(data+i, N-i);
}

static int aNumbers[960] = {
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777,
                    };

main()
{
  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  quicksort(aNumbers, 960);

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm
  
  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results (this time we subtract the 960 integers, 1920 bytes, from the size, so that we measure the size of the algorithm and not the size of the data):

        SDCC         z88dk
Size    590 bytes    617 bytes
Speed   1840 ms      2620 ms

This time the difference in size is smaller, but SDCC still wins. In terms of execution speed, SDCC takes 780 ms less to sort the 960 numbers; put another way, SDCC needs about 30% less time than z88dk.

As a final test, we will compile a slightly longer and more complex program. It draws several lines on the screen using the famous Bresenham algorithm and the pixel-drawing functions from the tutorial Painting pixels: Introduction to video memory (C with SDCC). The program is exactly the same for SDCC and z88dk, differing only in the two lines of assembly code that set mode 0:

////////////////////////////////////////////////////////////////////////
// Test06sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////

void SetMode0PixelColor(unsigned char *pByteAddress, unsigned char nColor, unsigned char nPixel)
{
  unsigned char nByte = *pByteAddress;

  if(nPixel == 0)
  {
    nByte &= 85;

    if(nColor & 1)
      nByte |= 128;

    if(nColor & 2)
      nByte |= 8;

    if(nColor & 4)
      nByte |= 32;

    if(nColor & 8)
      nByte |= 2;
  }
  else
  {
    nByte &= 170;

    if(nColor & 1)
      nByte |= 64;

    if(nColor & 2)
      nByte |= 4;

    if(nColor & 4)
      nByte |= 16;

    if(nColor & 8)
      nByte |= 1;
  }

  *pByteAddress = nByte;
}

void PutPixelMode0(unsigned char nX, unsigned char nY, unsigned char nColor)
{
  unsigned char nPixel = 0;
  unsigned int nAddress = 0xC000 + ((nY / 8) * 80) + ((nY % 8) * 2048) + (nX / 2);
  nPixel = nX % 2;

  SetMode0PixelColor((unsigned char *)nAddress, nColor, nPixel);
}

/**
 * Draws a line between two points p1(p1x,p1y) and p2(p2x,p2y).
 * This function is based on the Bresenham's line algorithm and is highly 
 * optimized to be able to draw lines very quickly. There is no floating point 
 * arithmetic nor multiplications and divisions involved. Only addition, 
 * subtraction and bit shifting are used. 
 *
 * Note that you have to define your own customized setPixel(x,y) function, 
 * which essentially lights a pixel on the screen.
 */
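// Note: the arguments are passed by value, so this function does not
// change the caller's variables. None of the four lines drawn in main()
// needs a swap, so the test is not affected.
void swap(int n1, int n2)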
{
  int nAux = n1;
  n1 = n2;
  n2 = nAux;
}
void lineBresenham(int p1x, int p1y, int p2x, int p2y)
{
    int F, x, y;
    int dy;
    int dx;
    int dy2;
    int dx2;
    int dy2_minus_dx2;
    int dy2_plus_dx2;

    if (p1x > p2x)  // Swap points if p1 is on the right of p2
    {
        swap(p1x, p2x);
        swap(p1y, p2y);
    }

    // Handle trivial cases separately for algorithm speed up.
    // Trivial case 1: m = +/-INF (Vertical line)
    if (p1x == p2x)
    {
        if (p1y > p2y)  // Swap y-coordinates if p1 is above p2
        {
            swap(p1y, p2y);
        }

        x = p1x;
        y = p1y;
        while (y <= p2y)
        {
            PutPixelMode0(x, y, 1);
            y++;
        }
        return;
    }
    // Trivial case 2: m = 0 (Horizontal line)
    else if (p1y == p2y)
    {
        x = p1x;
        y = p1y;

        while (x <= p2x)
        {
            PutPixelMode0(x, y, 1);
            x++;
        }
        return;
    }


    dy            = p2y - p1y;  // y-increment from p1 to p2
    dx            = p2x - p1x;  // x-increment from p1 to p2
    dy2           = (dy << 1);  // dy << 1 == 2*dy
    dx2           = (dx << 1);
    dy2_minus_dx2 = dy2 - dx2;  // precompute constant for speed up
    dy2_plus_dx2  = dy2 + dx2;


    if (dy >= 0)    // m >= 0
    {
        // Case 1: 0 <= m <= 1 (Original case)
        if (dy <= dx)   
        {
            F = dy2 - dx;    // initial F

            x = p1x;
            y = p1y;
            while (x <= p2x)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F += dy2;
                }
                else
                {
                    y++;
                    F += dy2_minus_dx2;
                }
                x++;
            }
        }
        // Case 2: 1 < m < INF (Mirror about y=x line
        // replace all dy by dx and dx by dy)
        else
        {
            F = dx2 - dy;    // initial F

            y = p1y;
            x = p1x;
            while (y <= p2y)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F += dx2;
                }
                else
                {
                    x++;
                    F -= dy2_minus_dx2;
                }
                y++;
            }
        }
    }
    else    // m < 0
    {
        // Case 3: -1 <= m < 0 (Mirror about x-axis, replace all dy by -dy)
        if (dx >= -dy)
        {
            F = -dy2 - dx;    // initial F

            x = p1x;
            y = p1y;
            while (x <= p2x)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F -= dy2;
                }
                else
                {
                    y--;
                    F -= dy2_plus_dx2;
                }
                x++;
            }
        }
        // Case 4: -INF < m < -1 (Mirror about x-axis and mirror 
        // about y=x line, replace all dx by -dy and dy by dx)
        else    
        {
            F = dx2 + dy;    // initial F

            y = p1y;
            x = p1x;
            while (y >= p2y)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F += dx2;
                }
                else
                {
                    x++;
                    F += dy2_plus_dx2;
                }
                y--;
            }
        }
    }
}

main()
{
  //SCR_SET_MODE 0
  __asm
    ld a, #0
    call #0xBC0E
  __endasm;

  lineBresenham(0, 0, 159, 199);
  lineBresenham(0, 199, 159, 0);
  lineBresenham(80, 0, 80, 199);
  lineBresenham(0, 100, 159, 100);

  while(1) {};
}
////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////
// Test06z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////

void SetMode0PixelColor(unsigned char *pByteAddress, unsigned char nColor, unsigned char nPixel)
{
  unsigned char nByte = *pByteAddress;

  if(nPixel == 0)
  {
    nByte &= 85;

    if(nColor & 1)
      nByte |= 128;

    if(nColor & 2)
      nByte |= 8;

    if(nColor & 4)
      nByte |= 32;

    if(nColor & 8)
      nByte |= 2;
  }
  else
  {
    nByte &= 170;

    if(nColor & 1)
      nByte |= 64;

    if(nColor & 2)
      nByte |= 4;

    if(nColor & 4)
      nByte |= 16;

    if(nColor & 8)
      nByte |= 1;
  }

  *pByteAddress = nByte;
}

void PutPixelMode0(unsigned char nX, unsigned char nY, unsigned char nColor)
{
  unsigned char nPixel = 0;
  unsigned int nAddress = 0xC000 + ((nY / 8) * 80) + ((nY % 8) * 2048) + (nX / 2);
  nPixel = nX % 2;

  SetMode0PixelColor((unsigned char *)nAddress, nColor, nPixel);
}

/**
 * Draws a line between two points p1(p1x,p1y) and p2(p2x,p2y).
 * This function is based on the Bresenham's line algorithm and is highly 
 * optimized to be able to draw lines very quickly. There is no floating point 
 * arithmetic nor multiplications and divisions involved. Only addition, 
 * subtraction and bit shifting are used. 
 *
 * Note that you have to define your own customized setPixel(x,y) function, 
 * which essentially lights a pixel on the screen.
 */
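// Note: the arguments are passed by value, so this function does not
// change the caller's variables. None of the four lines drawn in main()
// needs a swap, so the test is not affected.
void swap(int n1, int n2)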
{
  int nAux = n1;
  n1 = n2;
  n2 = nAux;
}
void lineBresenham(int p1x, int p1y, int p2x, int p2y)
{
    int F, x, y;
    int dy;
    int dx;
    int dy2;
    int dx2;
    int dy2_minus_dx2;
    int dy2_plus_dx2;

    if (p1x > p2x)  // Swap points if p1 is on the right of p2
    {
        swap(p1x, p2x);
        swap(p1y, p2y);
    }

    // Handle trivial cases separately for algorithm speed up.
    // Trivial case 1: m = +/-INF (Vertical line)
    if (p1x == p2x)
    {
        if (p1y > p2y)  // Swap y-coordinates if p1 is above p2
        {
            swap(p1y, p2y);
        }

        x = p1x;
        y = p1y;
        while (y <= p2y)
        {
            PutPixelMode0(x, y, 1);
            y++;
        }
        return;
    }
    // Trivial case 2: m = 0 (Horizontal line)
    else if (p1y == p2y)
    {
        x = p1x;
        y = p1y;

        while (x <= p2x)
        {
            PutPixelMode0(x, y, 1);
            x++;
        }
        return;
    }


    dy            = p2y - p1y;  // y-increment from p1 to p2
    dx            = p2x - p1x;  // x-increment from p1 to p2
    dy2           = (dy << 1);  // dy << 1 == 2*dy
    dx2           = (dx << 1);
    dy2_minus_dx2 = dy2 - dx2;  // precompute constant for speed up
    dy2_plus_dx2  = dy2 + dx2;


    if (dy >= 0)    // m >= 0
    {
        // Case 1: 0 <= m <= 1 (Original case)
        if (dy <= dx)   
        {
            F = dy2 - dx;    // initial F

            x = p1x;
            y = p1y;
            while (x <= p2x)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F += dy2;
                }
                else
                {
                    y++;
                    F += dy2_minus_dx2;
                }
                x++;
            }
        }
        // Case 2: 1 < m < INF (Mirror about y=x line
        // replace all dy by dx and dx by dy)
        else
        {
            F = dx2 - dy;    // initial F

            y = p1y;
            x = p1x;
            while (y <= p2y)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F += dx2;
                }
                else
                {
                    x++;
                    F -= dy2_minus_dx2;
                }
                y++;
            }
        }
    }
    else    // m < 0
    {
        // Case 3: -1 <= m < 0 (Mirror about x-axis, replace all dy by -dy)
        if (dx >= -dy)
        {
            F = -dy2 - dx;    // initial F

            x = p1x;
            y = p1y;
            while (x <= p2x)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F -= dy2;
                }
                else
                {
                    y--;
                    F -= dy2_plus_dx2;
                }
                x++;
            }
        }
        // Case 4: -INF < m < -1 (Mirror about x-axis and mirror 
        // about y=x line, replace all dx by -dy and dy by dx)
        else    
        {
            F = dx2 + dy;    // initial F

            y = p1y;
            x = p1x;
            while (y >= p2y)
            {
                PutPixelMode0(x, y, 1);
                if (F <= 0)
                {
                    F += dx2;
                }
                else
                {
                    x++;
                    F += dy2_plus_dx2;
                }
                y--;
            }
        }
    }
}

main()
{
  //SCR_SET_MODE 0
  #asm
    ld a, 0
    call $BC0E
  #endasm

  lineBresenham(0, 0, 159, 199);
  lineBresenham(0, 199, 159, 0);
  lineBresenham(80, 0, 80, 199);
  lineBresenham(0, 100, 159, 100);

  while(1) {};
}
////////////////////////////////////////////////////////////////////////
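
One detail in the listing is worth noting: swap() receives its arguments by value, so the calls inside lineBresenham() leave p1x, p2x, p1y and p2y unchanged. The four lines drawn in main() already run in increasing x (or increasing y for the vertical one), so the test does not depend on it. A minimal sketch of a pointer-based variant could look like this; swap_ptr and order_left_to_right are hypothetical names, not part of the original sources:

```c
/* Pointer-based swap: the caller passes the addresses of its variables,
   so the exchange is visible after the call returns. */
void swap_ptr(int *pn1, int *pn2)
{
  int nAux = *pn1;
  *pn1 = *pn2;
  *pn2 = nAux;
}

/* Hypothetical helper: orders two endpoints so that the first one is the
   leftmost, which is what lineBresenham() assumes before drawing. */
void order_left_to_right(int *p1x, int *p1y, int *p2x, int *p2y)
{
  if (*p1x > *p2x)
  {
    swap_ptr(p1x, p2x);
    swap_ptr(p1y, p2y);
  }
}
```

Using it inside lineBresenham() would mean calling swap_ptr(&p1x, &p2x) instead of swap(p1x, p2x). Note that this changes the generated code, so the sizes and times measured in this tutorial would no longer apply exactly.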

Compiling and running both programs gives the following results:

        SDCC         z88dk
Size    1546 bytes   2091 bytes
Speed   320 ms       1960 ms

SDCC wins again by an incredible margin: the SDCC binary runs about 6 times faster than the z88dk one. As a picture is worth a thousand words, here's a visual comparison:

[Images: SDCC (left) and z88dk (right)]

Since z88dk v1.9 dates from 2009, I decided to compile this last example with a recent beta to see whether things have improved in recent years. Specifically, I used the beta of 05/19/2012. These are the results:

        SDCC v3.1.0   z88dk v1.9   z88dk 2012-05-19
Size    1546 bytes    2091 bytes   2050 bytes
Speed   320 ms        1960 ms      2120 ms

It has not improved at all...

 

UPDATE: SDCC version 3.2.0 was released on 07/09/2012. Let's see whether things have improved or worsened:

        SDCC v3.1.0   z88dk v1.9   z88dk 2012-05-19   SDCC v3.2.0
Size    1546 bytes    2091 bytes   2050 bytes         1497 bytes
Speed   320 ms        1960 ms      2120 ms            260 ms

It has improved even more!

UPDATE: z88dk version 1.10 was released on 11/06/2012. Let's see whether things have improved or worsened:

        SDCC v3.1.0   z88dk v1.9   z88dk 2012-05-19   SDCC v3.2.0   z88dk v1.10
Size    1546 bytes    2091 bytes   2050 bytes         1497 bytes    2178 bytes
Speed   320 ms        1960 ms      2120 ms            260 ms        2100 ms

As we can see, not much has changed in the new version of z88dk compared with v1.9. The results are still painful next to SDCC's.

UPDATE: SDCC version 3.3.0 was released on 05/20/2013. Let's see whether things have improved or worsened:

        SDCC v3.1.0   z88dk v1.9   z88dk 2012-05-19   SDCC v3.2.0   z88dk v1.10   SDCC v3.3.0
Size    1546 bytes    2091 bytes   2050 bytes         1497 bytes    2178 bytes    1427 bytes
Speed   320 ms        1960 ms      2120 ms            260 ms        2100 ms       300 ms
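
As a quick check of the figures in the table above, the ratio between two measured times can be computed with integer arithmetic only. This is just an illustrative sketch; speed_ratio_x100 is a hypothetical helper name, and the times passed to it are the ones from the table:

```c
/* Ratio between two measured times, scaled by 100 and rounded to the
   nearest integer so that no floating point is needed
   (for example, 613 means 6.13 times). */
unsigned int speed_ratio_x100(unsigned int slow_ms, unsigned int fast_ms)
{
  unsigned long nScaled = (unsigned long)slow_ms * 100UL;

  return (unsigned int)((nScaled + fast_ms / 2) / fast_ms);
}
```

For example, speed_ratio_x100(1960, 320) gives 613 (z88dk v1.9 takes about 6.1 times as long as SDCC v3.1.0), and speed_ratio_x100(2100, 300) gives 700 (z88dk v1.10 against SDCC v3.3.0).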

Conclusions:

  • Stability:
    1. z88dk failed at run time when using 32-bit variables, leaving the CPC hung.
    2. The beta versions of z88dk that I tested were very unstable.
    3. Interrupts are always disabled in z88dk when our program starts, so some firmware features, such as KL TIME PLEASE (BD0D), may not work correctly. This bug has already been reported to and acknowledged by the z88dk development team, and it behaves differently in versions 1.8, 1.9 and the current betas.
    4. There have been no problems with the SDCC builds: interrupts are enabled when our program starts, so we can use the system clock.
  • Speed:
    1. SDCC has generated faster code in all cases, up to 6 or 7 times faster than the code generated by z88dk.
  • Size:
    1. SDCC has generated smaller code in almost all tests.
    2. z88dk seems to have standard C library functions that are better optimized for size (at least printf).
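
Since SDCC leaves interrupts enabled, the counter returned by KL TIME PLEASE can be used for timing. That counter advances 300 times per second, so two readings can be converted to milliseconds as sketched below. The firmware call itself is not shown here; ticks_to_ms is a hypothetical helper name, and nStart and nEnd are assumed to be two counter values already read before and after the code being measured:

```c
/* Converts the difference between two KL TIME PLEASE readings
   (units of 1/300 second) to milliseconds: ms = ticks * 1000 / 300. */
unsigned long ticks_to_ms(unsigned long nStart, unsigned long nEnd)
{
  unsigned long nTicks = nEnd - nStart;

  return (nTicks * 10UL) / 3UL;
}
```

With this conversion, a difference of 96 ticks corresponds to 320 ms. The resolution is about 3.3 ms per tick.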

In view of such overwhelming results, I finally stopped using z88dk:

Do not waste your time (or your CPU cycles :-)) with z88dk. Long live SDCC!!

You can download a zip with all the files (source code, .bat files to compile, binaries and DSKs) here: SDCC_vs_z88dk.zip

 

www.CPCMania.com 2012