SDCC vs z88dk: Comparing size and speed of the binaries generated for Amstrad CPC

In this tutorial we will compare the two most popular C compilers for the Amstrad CPC (cross-compiling from a PC, of course). We will compare both the size of the generated binary and its execution speed across several different tests. At the time of writing, the official versions of the two compilers are z88dk v1.9 and SDCC 3.1.0. I tried several recent z88dk nightly builds/snapshots, but I must say they are very unstable (at least for the Amstrad CPC), crashing or generating unexpected results, so in the end I decided to make the comparison with the official releases.

To measure execution time we could use the firmware command KL TIME PLEASE (BD0D), but as we saw in the tutorial Measuring times and optimizing 2D Starfield (C with SDCC), it does not work with z88dk because interrupts are always disabled when our program starts... So, to measure execution speed, we will record video of an emulator running each program and then use a video editor to measure the elapsed time, for example, between two texts displayed on screen.

Note: For sizes we will compare the files without including the Amsdos header.
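The two measurements described above boil down to very simple arithmetic, which can be sketched in C as follows. This is only an illustrative sketch and not part of the original test files: the helper names `size_without_header` and `frames_to_ms` are hypothetical, and the frame rate of the recording is an assumption that depends on your emulator and recording settings. The only fixed fact used is that the AMSDOS header occupies 128 bytes at the start of the file.

```c
#include <stdint.h>

/* Size in bytes of the AMSDOS header that precedes the program data. */
#define AMSDOS_HEADER_SIZE 128u

/* Hypothetical helper: size of the code/data alone, given the size of a
   file that still carries its AMSDOS header. Returns 0 if the file is
   too small to contain anything besides a header. */
uint32_t size_without_header(uint32_t file_size)
{
    return (file_size > AMSDOS_HEADER_SIZE) ? file_size - AMSDOS_HEADER_SIZE : 0u;
}

/* Hypothetical helper: elapsed time in milliseconds between two frames
   of the recorded video. 'fps' is the frame rate of the recording
   (assumption: known and constant for the whole video). Returns 0 on
   invalid input. */
uint32_t frames_to_ms(uint32_t first_frame, uint32_t last_frame, uint32_t fps)
{
    if (fps == 0u || last_frame < first_frame)
        return 0u;
    return ((last_frame - first_frame) * 1000u) / fps;
}
```

For example, with a recording at 50 frames per second, `frames_to_ms(0, 39, 50)` gives 780 ms. Note that at 50 frames per second each frame lasts 20 ms, so a time measured this way cannot be more precise than about one frame.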
In the first test, we create two unsigned 16-bit integer variables and run a for loop of 65535 iterations, displaying a text before and after the loop. The source code is exactly the same for SDCC and z88dk:

////////////////////////////////////////////////////////////////////////
// Test01sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
#include <stdio.h>

main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;

  printf("Start\n\r");

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  printf("End %u\n\r", nLoops);

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////////
// Test01z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
#include <stdio.h>

main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;

  printf("Start\n\r");

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  printf("End %u\n\r", nLoops);

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile the program with both compilers as we have seen in previous tutorials:

SDCC (Test01sdcc.bat):

z88dk (Test01z88dk.bat):

Both binaries produce the same output on the screen:

We record video with the emulator and compare the two executions:
The SDCC binary is more than twice the size of the z88dk one but runs 7 times faster; or, put the other way round, the z88dk binary is less than half the size of the SDCC one but runs 7 times slower. Seeing such different results for such a simple program, we can only say: What the hell is this! :-)

After analyzing the assembler and the memory map generated by each compiler, we conclude that the large size difference in this case comes from the C library's 'printf': it appears that z88dk's implementation is much more optimized (at least in size) than SDCC's. To compare only the code generated from our own source, we will remove the 'printf' calls and replace them with a simple call to the firmware command TXT_OUTPUT (BB5A):

////////////////////////////////////////////////////////////////////////
// Test02sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;

  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////////
// Test02z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned int nCounter = 0;
  unsigned int nLoops = 0;

  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  for(nCounter = 0; nCounter < 65535; nCounter++)
    nLoops++;

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results:
As can be seen, for this simple C program of just 20 lines, z88dk has generated a binary almost twice the size of the one generated by SDCC. In terms of speed, the z88dk binary takes 7 times longer than the SDCC one. In summary, in this particular example SDCC is 'dramatically' better than z88dk.

Looking at the assembler source code generated by both compilers, we clearly see the differences. We will analyze only the block generated for the loop:

for(nCounter = 0; nCounter < 65535; nCounter++)
  nLoops++;

SDCC:

  ld de,#0x0000
  ld bc,#0xFFFF
00106$:
  inc de
  dec bc
  ld a,b
  or a,c
  jr NZ,00106$

z88dk:

  ld hl,0 ;const
  pop de
  pop bc
  push hl
  push de
  jp i_5
.i_3
  ld hl,2 ;const
  add hl,sp
  push hl
  call l_gint ;
  inc hl
  pop de
  call l_pint
  dec hl
.i_5
  ld hl,2 ;const
  add hl,sp
  call l_gint ;
  push hl
  ld hl,65535 ;const
  pop de
  call l_ult
  jp nc,i_4
  ld hl,0 ;const
  add hl,sp
  push hl
  call l_gint ;
  inc hl
  pop de
  call l_pint
  dec hl
  jp i_3
.i_4

There is a clear difference in size. The code generated by SDCC is straightforward: it keeps its values in two 16-bit register pairs (de is incremented on each pass while bc counts down from 0xFFFF), works on them directly (inc, dec), and finishes each pass with a test for zero and a relative jump until all the iterations are done. The code generated by z88dk is a bit of a mess and hard to follow. Its biggest problem is that it continuously uses the stack (push, pop) and, incomprehensibly, calls external functions (l_gint, l_pint, l_ult) every time it touches a variable, hence the enormous performance difference.

We will modify the program to use only 8-bit integers to see if there are changes in the results.
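To put a rough number on the overhead of the 16-bit z88dk loop listed above, the following sketch models it on a PC. It is only a model under stated assumptions: the real l_gint, l_pint and l_ult routines are not shown in this tutorial, so their behaviour here (load a 16-bit value, store a 16-bit value, unsigned less-than) is inferred from how the listing uses them, and `count_helper_calls` is a hypothetical name. The model counts calls only; it does not measure Z80 cycles.

```c
#include <stdint.h>

/* Call counters for the three modelled helpers. */
static unsigned long n_gint, n_pint, n_ult;

/* Assumed behaviour of l_gint: load a 16-bit value through a pointer. */
static uint16_t l_gint(const uint16_t *p) { n_gint++; return *p; }

/* Assumed behaviour of l_pint: store a 16-bit value through a pointer. */
static void l_pint(uint16_t *p, uint16_t v) { n_pint++; *p = v; }

/* Assumed behaviour of l_ult: unsigned "less than" comparison. */
static int l_ult(uint16_t a, uint16_t b) { n_ult++; return a < b; }

/* Runs the counting loop the way the z88dk listing does: every read,
   write and comparison of a local variable goes through a helper call.
   Stores the final nLoops in *loops_out and returns the total number
   of helper calls made. */
unsigned long count_helper_calls(uint16_t limit, uint16_t *loops_out)
{
    uint16_t nCounter = 0;
    uint16_t nLoops = 0;

    n_gint = n_pint = n_ult = 0;

    while (l_ult(l_gint(&nCounter), limit)) {                  /* label i_5 */
        l_pint(&nLoops, (uint16_t)(l_gint(&nLoops) + 1));      /* loop body */
        l_pint(&nCounter, (uint16_t)(l_gint(&nCounter) + 1));  /* label i_3 */
    }

    *loops_out = nLoops;
    return n_gint + n_pint + n_ult;
}
```

Under these assumptions each iteration costs six helper calls (two for the condition, two for the body, two for the increment), plus two more for the final failed condition, so `count_helper_calls(65535, &loops)` returns 393212. The SDCC loop for the same source makes no calls at all and executes five instructions per iteration.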
The programs would be as follows:

////////////////////////////////////////////////////////////////////////
// Test03sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned char nCounter1 = 0;
  unsigned char nCounter2 = 0;
  unsigned char nLoops = 0;

  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  for(nCounter1 = 0; nCounter1 <= 254; nCounter1++)
    for(nCounter2 = 0; nCounter2 <= 254; nCounter2++)
      nLoops++;

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////////
// Test03z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned char nCounter1 = 0;
  unsigned char nCounter2 = 0;
  unsigned char nLoops = 0;

  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  for(nCounter1 = 0; nCounter1 <= 254; nCounter1++)
    for(nCounter2 = 0; nCounter2 <= 254; nCounter2++)
      nLoops++;

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results:
The results do not vary much compared to the previous test: the binary generated by SDCC is less than half the size of the one generated by z88dk, and it runs in almost one seventh of the time. SDCC wins again in this test, and by far.

We will modify the program a little to use a while loop and a 32-bit long variable. The programs would be as follows:

////////////////////////////////////////////////////////////////////////
// Test04sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned long nCounter = 0;

  __asm
    ld a, #83 ;'S'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  while(nCounter < 131070L)
    nCounter++;

  __asm
    ld a, #69 ;'E'
    call #0xBB5A ;TXT_OUTPUT
  __endasm;

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////////
// Test04z88dk.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
main()
{
  unsigned long nCounter = 0;

  #asm
    ld a, 83 ;'S'
    call $BB5A ;TXT_OUTPUT
  #endasm

  while(nCounter < 131070L)
    nCounter++;

  #asm
    ld a, 69 ;'E'
    call $BB5A ;TXT_OUTPUT
  #endasm

  while(1) {};
}
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results:
It seems that z88dk must have 'broken' the support of long (32bit) variables and it is not able to compile / build this program correctly, Another negative point for z88dk. Let's try now with an algorithm 'famous' as the sorting algorithm quicksort (source) and we will make ordering 960 16-bit integers. The source code would be as follows: //////////////////////////////////////////////////////////////////////// // Test05sdcc.c // Mochilote - www.cpcmania.com //////////////////////////////////////////////////////////////////////// void quicksort(int* data, int N) { int i, j; int v, t; if( N <= 1 ) return; // Partition elements v = data[0]; i = 0; j = N; for(;;) { while(data[++i] < v && i < N) { } while(data[--j] > v) { } if( i >= j ) break; t = data[i]; data[i] = data[j]; data[j] = t; } t = data[i-1]; data[i-1] = data[0]; data[0] = t; quicksort(data, i-1); quicksort(data+i, N-i); } const int aNumbers[960] = { 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 
67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 
345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, }; main() { __asm ld a, #83 ;'S' call #0xBB5A ;TXT_OUTPUT __endasm; quicksort(aNumbers, 960); __asm ld a, #69 ;'E' call #0xBB5A ;TXT_OUTPUT __endasm; while(1) {}; } //////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////// // Test05z88dk.c // Mochilote - www.cpcmania.com //////////////////////////////////////////////////////////////////////// void quicksort(int* data, int N) { int i, j; int v, t; if( N <= 1 ) return; // Partition elements v = data[0]; i = 0; j = N; for(;;) { while(data[++i] < v && i < N) { } while(data[--j] > v) { } if( i >= j ) break; t = data[i]; data[i] = data[j]; data[j] = t; } t = data[i-1]; data[i-1] = data[0]; data[0] = t; quicksort(data, i-1); quicksort(data+i, N-i); } static int aNumbers[960] = { 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 
43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 
456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, 12, 67, 123, 456, 789, 98, 64, 43, 32, 1, 4, 9, 345, 444, 777, }; main() { #asm ld a, 83 ;'S' call $BB5A ;TXT_OUTPUT #endasm quicksort(aNumbers, 960); #asm ld a, 69 ;'E' call $BB5A ;TXT_OUTPUT #endasm while(1) {}; } 
////////////////////////////////////////////////////////////////////////

We compile and run both programs and obtain the following results (this time, we subtract the size of the 960 integers, 1920 bytes, from each binary so that we compare only the size of the algorithm and not the size of the data):
This time the difference in size is smaller, but SDCC still wins. In terms of execution speed, SDCC takes 780 ms less to sort the 960 numbers; in other words, SDCC is 30% faster than z88dk.

As a final test, we will compile a slightly more complex and lengthy program. It draws several lines on the screen using the famous Bresenham algorithm and the pixel-drawing functions from the tutorial Painting pixels: Introduction to video memory (C with SDCC). The program is exactly the same for SDCC and z88dk, differing only in the two lines of assembly code that set mode 0:

////////////////////////////////////////////////////////////////////////
// Test06sdcc.c
// Mochilote - www.cpcmania.com
////////////////////////////////////////////////////////////////////////
void SetMode0PixelColor(unsigned char *pByteAddress, unsigned char nColor, unsigned char nPixel)
{
  unsigned char nByte = *pByteAddress;

  if(nPixel == 0)
  {
    nByte &= 85;

    if(nColor & 1)
      nByte |= 128;

    if(nColor & 2)
      nByte |= 8;

    if(nColor & 4)
      nByte |= 32;

    if(nColor & 8)
      nByte |= 2;
  }
  else
  {
    nByte &= 170;

    if(nColor & 1)
      nByte |= 64;

    if(nColor & 2)
      nByte |= 4;

    if(nColor & 4)
      nByte |= 16;

    if(nColor & 8)
      nByte |= 1;
  }

  *pByteAddress = nByte;
}

void PutPixelMode0(unsigned char nX, unsigned char nY, unsigned char nColor)
{
  unsigned char nPixel = 0;
  unsigned int nAddress = 0xC000 + ((nY / 8) * 80) + ((nY % 8) * 2048) + (nX / 2);
  nPixel = nX % 2;

  SetMode0PixelColor((unsigned char *)nAddress, nColor, nPixel);
}

/**
 * Draws a line between two points p1(p1x,p1y) and p2(p2x,p2y).
 * This function is based on the Bresenham's line algorithm and is highly
 * optimized to be able to draw lines very quickly. There is no floating point
 * arithmetic nor multiplications and divisions involved. Only addition,
 * subtraction and bit shifting are used.
 *
 * Note that you have to define your own customized setPixel(x,y) function,
 * which essentially lights a pixel on the screen.
*/ void swap(int n1, int n2) { int nAux = n1; n1 = n2; n2 = nAux; } void lineBresenham(int p1x, int p1y, int p2x, int p2y) { int F, x, y; int dy; int dx; int dy2; int dx2; int dy2_minus_dx2; int dy2_plus_dx2; if (p1x > p2x) // Swap points if p1 is on the right of p2 { swap(p1x, p2x); swap(p1y, p2y); } // Handle trivial cases separately for algorithm speed up. // Trivial case 1: m = +/-INF (Vertical line) if (p1x == p2x) { if (p1y > p2y) // Swap y-coordinates if p1 is above p2 { swap(p1y, p2y); } x = p1x; y = p1y; while (y <= p2y) { PutPixelMode0(x, y, 1); y++; } return; } // Trivial case 2: m = 0 (Horizontal line) else if (p1y == p2y) { x = p1x; y = p1y; while (x <= p2x) { PutPixelMode0(x, y, 1); x++; } return; } dy = p2y - p1y; // y-increment from p1 to p2 dx = p2x - p1x; // x-increment from p1 to p2 dy2 = (dy << 1); // dy << 1 == 2*dy dx2 = (dx << 1); dy2_minus_dx2 = dy2 - dx2; // precompute constant for speed up dy2_plus_dx2 = dy2 + dx2; if (dy >= 0) // m >= 0 { // Case 1: 0 <= m <= 1 (Original case) if (dy <= dx) { F = dy2 - dx; // initial F x = p1x; y = p1y; while (x <= p2x) { PutPixelMode0(x, y, 1); if (F <= 0) { F += dy2; } else { y++; F += dy2_minus_dx2; } x++; } } // Case 2: 1 < m < INF (Mirror about y=x line // replace all dy by dx and dx by dy) else { F = dx2 - dy; // initial F y = p1y; x = p1x; while (y <= p2y) { PutPixelMode0(x, y, 1); if (F <= 0) { F += dx2; } else { x++; F -= dy2_minus_dx2; } y++; } } } else // m < 0 { // Case 3: -1 <= m < 0 (Mirror about x-axis, replace all dy by -dy) if (dx >= -dy) { F = -dy2 - dx; // initial F x = p1x; y = p1y; while (x <= p2x) { PutPixelMode0(x, y, 1); if (F <= 0) { F -= dy2; } else { y--; F -= dy2_plus_dx2; } x++; } } // Case 4: -INF < m < -1 (Mirror about x-axis and mirror // about y=x line, replace all dx by -dy and dy by dx) else { F = dx2 + dy; // initial F y = p1y; x = p1x; while (y >= p2y) { PutPixelMode0(x, y, 1); if (F <= 0) { F += dx2; } else { x++; F += dy2_plus_dx2; } y--; } } } } main() { 
//SCR_SET_MODE 0 __asm ld a, #0 call #0xBC0E __endasm; lineBresenham(0, 0, 159, 199); lineBresenham(0, 199, 159, 0); lineBresenham(80, 0, 80, 199); lineBresenham(0, 100, 159, 100); while(1) {}; } //////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////// // Test06z88dk.c // Mochilote - www.cpcmania.com //////////////////////////////////////////////////////////////////////// void SetMode0PixelColor(unsigned char *pByteAddress, unsigned char nColor, unsigned char nPixel) { unsigned char nByte = *pByteAddress; if(nPixel == 0) { nByte &= 85; if(nColor & 1) nByte |= 128; if(nColor & 2) nByte |= 8; if(nColor & 4) nByte |= 32; if(nColor & 8) nByte |= 2; } else { nByte &= 170; if(nColor & 1) nByte |= 64; if(nColor & 2) nByte |= 4; if(nColor & 4) nByte |= 16; if(nColor & 8) nByte |= 1; } *pByteAddress = nByte; } void PutPixelMode0(unsigned char nX, unsigned char nY, unsigned char nColor) { unsigned char nPixel = 0; unsigned int nAddress = 0xC000 + ((nY / 8) * 80) + ((nY % 8) * 2048) + (nX / 2); nPixel = nX % 2; SetMode0PixelColor((unsigned char *)nAddress, nColor, nPixel); } /** * Draws a line between two points p1(p1x,p1y) and p2(p2x,p2y). * This function is based on the Bresenham's line algorithm and is highly * optimized to be able to draw lines very quickly. There is no floating point * arithmetic nor multiplications and divisions involved. Only addition, * subtraction and bit shifting are used. * * Note that you have to define your own customized setPixel(x,y) function, * which essentially lights a pixel on the screen. */ void swap(int n1, int n2) { int nAux = n1; n1 = n2; n2 = nAux; } void lineBresenham(int p1x, int p1y, int p2x, int p2y) { int F, x, y; int dy; int dx; int dy2; int dx2; int dy2_minus_dx2; int dy2_plus_dx2; if (p1x > p2x) // Swap points if p1 is on the right of p2 { swap(p1x, p2x); swap(p1y, p2y); } // Handle trivial cases separately for algorithm speed up. 
// Trivial case 1: m = +/-INF (Vertical line) if (p1x == p2x) { if (p1y > p2y) // Swap y-coordinates if p1 is above p2 { swap(p1y, p2y); } x = p1x; y = p1y; while (y <= p2y) { PutPixelMode0(x, y, 1); y++; } return; } // Trivial case 2: m = 0 (Horizontal line) else if (p1y == p2y) { x = p1x; y = p1y; while (x <= p2x) { PutPixelMode0(x, y, 1); x++; } return; } dy = p2y - p1y; // y-increment from p1 to p2 dx = p2x - p1x; // x-increment from p1 to p2 dy2 = (dy << 1); // dy << 1 == 2*dy dx2 = (dx << 1); dy2_minus_dx2 = dy2 - dx2; // precompute constant for speed up dy2_plus_dx2 = dy2 + dx2; if (dy >= 0) // m >= 0 { // Case 1: 0 <= m <= 1 (Original case) if (dy <= dx) { F = dy2 - dx; // initial F x = p1x; y = p1y; while (x <= p2x) { PutPixelMode0(x, y, 1); if (F <= 0) { F += dy2; } else { y++; F += dy2_minus_dx2; } x++; } } // Case 2: 1 < m < INF (Mirror about y=x line // replace all dy by dx and dx by dy) else { F = dx2 - dy; // initial F y = p1y; x = p1x; while (y <= p2y) { PutPixelMode0(x, y, 1); if (F <= 0) { F += dx2; } else { x++; F -= dy2_minus_dx2; } y++; } } } else // m < 0 { // Case 3: -1 <= m < 0 (Mirror about x-axis, replace all dy by -dy) if (dx >= -dy) { F = -dy2 - dx; // initial F x = p1x; y = p1y; while (x <= p2x) { PutPixelMode0(x, y, 1); if (F <= 0) { F -= dy2; } else { y--; F -= dy2_plus_dx2; } x++; } } // Case 4: -INF < m < -1 (Mirror about x-axis and mirror // about y=x line, replace all dx by -dy and dy by dx) else { F = dx2 + dy; // initial F y = p1y; x = p1x; while (y >= p2y) { PutPixelMode0(x, y, 1); if (F <= 0) { F += dx2; } else { x++; F += dy2_plus_dx2; } y--; } } } } main() { //SCR_SET_MODE 0 #asm ld a, 0 call $BC0E #endasm lineBresenham(0, 0, 159, 199); lineBresenham(0, 199, 159, 0); lineBresenham(80, 0, 80, 199); lineBresenham(0, 100, 159, 100); while(1) {}; } //////////////////////////////////////////////////////////////////////// Compile, run both programs and obtain the following results:
SDCC wins again by an incredible margin: the SDCC binary runs 6 times faster than the z88dk one. As a picture is worth a thousand words, here's a visual comparison:
As z88dk v1.9 dates from 2009, I decided to compile the last example with a recent beta to see whether it has improved in recent years. Specifically, I used the beta of 05/19/2012. These are the results:
It has not improved at all...
UPDATE: On 07/09/2012, version 3.2.0 of SDCC was released. Let's see if things have improved or worsened:
It has improved even more!

UPDATE: On 11/06/2012, version 1.10 of z88dk was released. Let's see if things have improved or worsened:
As we can see, not much has changed with the new version of z88dk compared to v1.9; the results are still painful compared with SDCC.

UPDATE: On 05/20/2013, version 3.3.0 of SDCC was released. Let's see if things have improved or worsened:
Conclusions:
In view of such overwhelming results, I finally stopped using z88dk. Do not waste your time (or your CPU cycles :-)) with z88dk; long live SDCC!! You can download a zip with all the files (source code, bat files to compile, binaries and dsk's) here: SDCC_vs_z88dk.zip
www.CPCMania.com 2012