The standard XLL+ sample project, AvgOpt, has been revised to support GPU code. The new sample project is named AvgOptCuda.
The project contains both 32-bit and 64-bit builds. Make sure that you use the build that matches your version of Excel: the 32-bit build for 32-bit Excel, and the 64-bit build for 64-bit Excel.
The tables below give sample timings on two hardware combinations. Each row shows the time taken (in milliseconds) for a calculation on the CPU and on the GPU, in either double or single precision, along with the speed-up ratio, which is the CPU time divided by the GPU time.
The best result in the tables below reduced a calculation of about 40 seconds (39,983 ms) to less than one fifth of a second (172 ms). An improvement of this order may well be considered worth the programming effort required.
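The speed-up ratio described above can be sketched as a one-line calculation. The helper below is hypothetical and is not part of the AvgOptCuda sample; it assumes that the ratio is simply the CPU elapsed time divided by the GPU elapsed time. Ratios recomputed from the rounded millisecond figures can differ by one from the published values, presumably because the published ratios were derived from unrounded timings.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of the AvgOptCuda sample).
// Returns the speed-up factor: CPU elapsed time divided by GPU elapsed time.
// Both arguments are elapsed times in milliseconds.
double speedup_ratio(double cpu_ms, double gpu_ms) {
    // A zero or negative GPU time would make the ratio meaningless.
    if (cpu_ms < 0.0 || gpu_ms <= 0.0) {
        throw std::invalid_argument("timings must be positive");
    }
    return cpu_ms / gpu_ms;
}
```

For the 1,000,000-iteration single-precision run on the i7-2600, `speedup_ratio(39983.0, 172.0)` gives a factor of roughly 232.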
100,000 iterations
CPU | GPU | Excel | Precision | CPU (ms) | GPU (ms) | Ratio |
i7-2600 (3.40 GHz) | GeForce GTX 560 Ti (1.25 GB) | 32-bit | double | 3626 | 49 | 74 |
i7-2600 (3.40 GHz) | GeForce GTX 560 Ti (1.25 GB) | 32-bit | single | 3981 | 35 | 113 |
i7 870 (2.93 GHz) | GeForce GTX 460 (1 GB) | 64-bit | double | 3978 | 70 | 57 |
i7 870 (2.93 GHz) | GeForce GTX 460 (1 GB) | 64-bit | single | 3649 | 40 | 91 |
200,000 iterations
CPU | GPU | Excel | Precision | CPU (ms) | GPU (ms) | Ratio |
i7-2600 (3.40 GHz) | GeForce GTX 560 Ti (1.25 GB) | 32-bit | double | 7246 | 77 | 94 |
i7-2600 (3.40 GHz) | GeForce GTX 560 Ti (1.25 GB) | 32-bit | single | 7910 | 51 | 156 |
i7 870 (2.93 GHz) | GeForce GTX 460 (1 GB) | 64-bit | double | 7938 | 121 | 65 |
i7 870 (2.93 GHz) | GeForce GTX 460 (1 GB) | 64-bit | single | 7237 | 62 | 116 |
1,000,000 iterations
CPU | GPU | Excel | Precision | CPU (ms) | GPU (ms) | Ratio |
i7-2600 (3.40 GHz) | GeForce GTX 560 Ti (1.25 GB) | 32-bit | double | 36335 | 311 | 117 |
i7-2600 (3.40 GHz) | GeForce GTX 560 Ti (1.25 GB) | 32-bit | single | 39983 | 172 | 232 |
i7 870 (2.93 GHz) | GeForce GTX 460 (1 GB) | 64-bit | double | 39610 | 551 | 72 |
i7 870 (2.93 GHz) | GeForce GTX 460 (1 GB) | 64-bit | single | 35991 | 252 | 143 |