Upon executing my simple double for-loop, iterating over 320 columns and 240 rows, I had to wait for what seemed, at the time, like an astonishingly long time. In fact, when I first ran the algorithm, I thought that my board had frozen.
But it hadn't. I was aware, on a theoretical level, that NETMF is a wholly interpreted runtime (unlike its bigger cousins). But I was not ready for the performance penalty this entailed. Curious about how long some basic operations take to execute, I devised some timing code. The essence of it looks like this:
    // Simple for loop
    start = Microsoft.SPOT.Hardware.Utility.GetMachineTime();
    for ( x = 0; x < iterations; x++ )
    {
        // empty loop
    }
    data.SimpleForLoop = Microsoft.SPOT.Hardware.Utility.GetMachineTime().Ticks - start.Ticks;
    data.SimpleForLoop /= iterations; // get single iteration value in ticks
    data.SimpleForLoop /= 10; // get single iteration value in microseconds
    accumulator.SimpleForLoop += data.SimpleForLoop;
    data.SimpleForLoop = accumulator.SimpleForLoop / additions; // average over multiple runs
This basic empty loop test gave me a baseline for the overhead in all additional testing. Note that 'iterations' is declared as a 'const int' rather than a variable. As a result, the loop ran about as fast as it would with a hard-coded integer value, whereas a variable bound added approximately 30-40% extra time. The following table lists the results of the tests, in microseconds (rounded down):
Task | Time (µs) |
---|---|
Simple for loop: | 54 |
Assignment of const ( y = 5 ): | 7 |
Assignment of var ( y = z ): | 11 |
Multiply constants ( 3 * 5 ): | 0 |
Multiply with var ( 3 * z ): | 10 |
Compare two constants ( 3 > 5 ): | 11 |
Compare with var ( y > 5 ): | 41 |
Compare 2 vars ( y > z ): | 45 |
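The const-versus-variable difference in the loop bound, mentioned before the table, can be checked with something like the following. This is only a sketch under a few assumptions: the class name, the iteration count of 10000, and the use of a local int as "the variable" are my own choices for illustration, not the original test code.

```csharp
using System;
using Microsoft.SPOT;
using Microsoft.SPOT.Hardware;

public class LoopBoundTest
{
    // Compile-time constant bound (hypothetical count, chosen for illustration).
    private const int ConstIterations = 10000;

    public static void Main()
    {
        int varIterations = 10000; // same count, but held in a local variable
        int x;

        // Time the empty loop with a const bound.
        TimeSpan start = Utility.GetMachineTime();
        for (x = 0; x < ConstIterations; x++) { }
        long constTicks = Utility.GetMachineTime().Ticks - start.Ticks;

        // Time the empty loop with a variable bound.
        start = Utility.GetMachineTime();
        for (x = 0; x < varIterations; x++) { }
        long varTicks = Utility.GetMachineTime().Ticks - start.Ticks;

        // One tick is 100 ns, so ticks / 10 gives microseconds.
        Debug.Print("const bound: " + (constTicks / 10 / ConstIterations).ToString() + " us per iteration");
        Debug.Print("var bound:   " + (varTicks / 10 / varIterations).ToString() + " us per iteration");
    }
}
```

The actual numbers will depend on the board and firmware, so treat the output as a relative comparison rather than a fixed result.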
Now, the interesting thing about these results is that they were measured on a 72 MHz processor. At 72 cycles per microsecond, a single empty for-loop iteration of 54 µs takes approximately 4000 cycles. That is a pretty hefty overhead if you need any sort of time-critical operation, or need to process more than a few hundred data elements.
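The cycle estimate can be reproduced with a few lines of arithmetic, and the same numbers hint at why the 320 x 240 double loop felt so slow. The sketch below is plain desktop C# (not NETMF), and the class and method names are made up for illustration. It counts only the inner-loop overhead from the table, so it is a rough lower bound that ignores whatever work the loop body does.

```csharp
using System;

public static class LoopCost
{
    // Cycles per iteration = microseconds per iteration * cycles per microsecond (MHz).
    public static long CyclesPerIteration(long microseconds, long clockMHz)
    {
        return microseconds * clockMHz;
    }

    // Loop overhead alone for visiting every pixel once, in whole milliseconds.
    public static long FrameOverheadMs(int columns, int rows, long microsecondsPerIteration)
    {
        return (long)columns * rows * microsecondsPerIteration / 1000;
    }

    public static void Main()
    {
        Console.WriteLine(CyclesPerIteration(54, 72));     // prints 3888
        Console.WriteLine(FrameOverheadMs(320, 240, 54));  // prints 4147
    }
}
```

In other words, roughly four seconds go to empty loop iterations alone before a single pixel is touched.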
As it turns out, not all is lost! In their infinite wisdom, GHI created a solution aptly named RLP (which stands for Runtime Loadable Procedures). In a nutshell, you can load a pre-compiled native code procedure into memory at run time and execute it from the NETMF environment. I'll let the coolness of that sink in for a second.
Generating these pre-compiled procedures isn't very straightforward, but if you follow the instructions on GHI's website (there are a few How-To's available), you'll be up and running in no time. In fact, my next post may just demonstrate some neat image processing as performed on a Gadgeteer FEZ Spider board.