Spectacular!

A private (members-only) forum for discussing all issues related to the Beta test of Native mode devices.
Locked
GTBecker
Posts: 616
Joined: 17 January 2006, 19:59 PM
Location: Cape Coral

Spectacular!

Post by GTBecker »

A 32-bit floating-point multiplicative inverse of a 3x3 matrix takes 1830µs on a ZX-24a; it takes 464µs on a ZX-24n, a 3.95x speedup. Fantastic!

FYI, this is 8.16 times the speed of a BX-24, which takes 3789µs, and 7.45 times as fast as a BX-24p (3457µs).

Superb!
Tom
mikep
Posts: 796
Joined: 24 September 2005, 15:54 PM

Re: Spectacular!

Post by mikep »

GTBecker wrote: A 32-bit floating-point multiplicative inverse of a 3x3 matrix takes 1830µs on a ZX-24a; it takes 464µs on a ZX-24n, a 3.95x speedup. Fantastic!

FYI, this is 8.16 times the speed of a BX-24, which takes 3789µs, and 7.45 times as fast as a BX-24p (3457µs).
Yes, this is good but not entirely unexpected. The clock speed of the ZX devices is twice that of the BX so, not surprisingly, the ZX-24a takes about half the time. The 4x speedup on the ZX-24n is good and shows the overhead of the VM and of the multiple calls into the VM. I'm sure GCC optimized some of your code as well.

The next step is to replace your ZX function with custom C or assembler code which should enable you to get this time down to 350µs or less. No more Micromega uFPU :)
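As a rough illustration of what such a custom C routine might look like, here is a minimal sketch of a single-precision 3x3 inverse using the adjugate (cofactor) method. It is not Tom's actual routine; the name mat3_inverse, the row-major flat array layout, and the singularity tolerance are all assumptions made for the example, and no timing is claimed for it.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: invert a row-major 3x3 float matrix.
 * Returns false (leaving out[] untouched) when the determinant is
 * too close to zero. The 1e-12f tolerance is an arbitrary choice. */
bool mat3_inverse(const float m[9], float out[9])
{
    const float a = m[0], b = m[1], c = m[2];
    const float d = m[3], e = m[4], f = m[5];
    const float g = m[6], h = m[7], i = m[8];

    /* Cofactors of the first row, reused for the determinant. */
    const float c00 = e * i - f * h;
    const float c01 = f * g - d * i;
    const float c02 = d * h - e * g;

    const float det = a * c00 + b * c01 + c * c02;
    if (fabsf(det) < 1e-12f) {
        return false;
    }

    /* One division, then nine multiplications:
     * inverse = transpose(cofactor matrix) / det. */
    const float inv_det = 1.0f / det;

    out[0] = c00 * inv_det;
    out[1] = (c * h - b * i) * inv_det;
    out[2] = (b * f - c * e) * inv_det;
    out[3] = c01 * inv_det;
    out[4] = (a * i - c * g) * inv_det;
    out[5] = (c * d - a * f) * inv_det;
    out[6] = c02 * inv_det;
    out[7] = (b * g - a * h) * inv_det;
    out[8] = (a * e - b * d) * inv_det;
    return true;
}
```

Dividing once and multiplying nine times is a common choice on parts without floating-point hardware, since a software float divide usually costs more than a multiply.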
Mike Perks
stevech
Posts: 715
Joined: 22 February 2006, 20:56 PM

Post by stevech »

but let's not forget - speed vs. code density for p-code. Sometimes important.
mikep
Posts: 796
Joined: 24 September 2005, 15:54 PM

Post by mikep »

stevech wrote:but let's not forget - speed vs. code density for p-code. Sometimes important.
Absolutely. For many applications it is completely unnecessary to use native mode, and VM instructions are all you need. Even for Tom's application, it might be that the matrix inverse time of ~1.8ms doesn't affect overall system performance or end-user-perceived performance. Only optimize for execution performance where and when you need to.

The interesting subject for Don is to provide some help so his customers know which ZX device they should buy. I doubt he is going to provide any kind of "upgrade" program and I'm guessing that native mode devices will cost more than their ZVM cousins.
Mike Perks
GTBecker
Posts: 616
Joined: 17 January 2006, 19:59 PM
Location: Cape Coral

Post by GTBecker »

Dupe deleted.
Last edited by GTBecker on 11 February 2008, 12:09 PM, edited 1 time in total.
Tom
GTBecker
Posts: 616
Joined: 17 January 2006, 19:59 PM
Location: Cape Coral

Spectacular!

Post by GTBecker »

> ... it might be that the matrix inverse time of ~1.8ms doesn't affect
> overall system performance...

The system I'm developing is an imaging system, of a sort, and I am
forever working to improve the effective resolution and response time.
So far, faster is always better for me; a 4x speed improvement is
welcome at any memory cost - as long as I haven't yet run out.


Tom
mikep
Posts: 796
Joined: 24 September 2005, 15:54 PM

Re: Spectacular!

Post by mikep »

GTBecker wrote:The system I'm developing is an imaging system, of a sort, and I am forever working to improve the effective resolution and response time. So far, faster is always better for me; a 4x speed improvement is welcome at any memory cost - as long as I haven't yet run out.
Or you could just get a faster processor like an ARM - save yourself a lot of trouble I would think. We are off-topic now so we can continue this discussion elsewhere or in the general forum.
Mike Perks
GTBecker
Posts: 616
Joined: 17 January 2006, 19:59 PM
Location: Cape Coral

Re: Spectacular!

Post by GTBecker »

mikep wrote:[...] just get a faster processor like an ARM - save yourself a lot of trouble...
Quite on topic. Yes, I could have selected another processor for some high-speed tasks but I've, so far, chosen to avoid the time necessarily spent learning another platform environment. It's also been handy to have a common philosophy and language among the three processors now in the system but, I admit, it is also sometimes a compromise.

Again, the apparent 4x speedup, at least for these heavy-math functions, is significant and welcome.
Tom
Don_Kirby
Posts: 341
Joined: 15 October 2006, 3:48 AM
Location: Long Island, New York

Post by Don_Kirby »

On a similar note...

I have been having trouble getting an application working correctly; it would hang at startup. After playing around with task stack sizes (and not getting anywhere), I decided to add some delays to see exactly where the hang was occurring.

After narrowing down the problem to one particular task, and adding a 0.1 second delay in the beginning of the task, the hang disappeared. Now, of course, I'm still looking for the root cause, but that's another topic.

The point is, previously, the task cycled at about 100Hz running on a ZX-40 (VM version). On the '24n, I'm seeing a speed increase of between 2400 and 2500%. At 2500 loops per second, I can certainly see why there might be some timing issue in the code that never showed up before.

Of course these numbers are meaningless in any real sense, as they apply only to this particular application. For reference only, the task in question performs only integer math, some string operations, I2C communications, and a bunch of comparing this variable to that variable.

I suppose the lesson here is that, along with the increase in speed, comes the inevitable increase in careful planning to avoid unwanted interactions.

In my particular case, I need to purposely slow down the offending task, as it is part of the UI, and simply does not need to run as fast as it is (on the new device).
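One way to slow a task like this is to pace it against a tick counter instead of using a fixed delay, so the loop rate stays the same however fast the body runs. The following is a minimal sketch in C rather than ZBasic, and not the code from the application described above; the name ticks_to_wait and the idea of a free-running 32-bit tick counter are assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: how many ticks are left to wait so that one
 * loop iteration lasts at least period_ticks.
 * start_tick is the counter value when the iteration began and
 * now_tick is the current value. Unsigned subtraction keeps the
 * elapsed time correct across a single wrap of the counter. */
uint32_t ticks_to_wait(uint32_t start_tick, uint32_t now_tick,
                       uint32_t period_ticks)
{
    uint32_t elapsed = now_tick - start_tick;
    if (elapsed >= period_ticks) {
        return 0u; /* the iteration already took long enough */
    }
    return period_ticks - elapsed;
}
```

The task would record the tick count at the top of each pass, do its work, then sleep for whatever ticks_to_wait returns. On a slower device the result is simply zero more often, so the same source behaves sensibly on both.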

-Don