I've found more improvement in the FPU code. The 3x3 inversion on the FPU now takes 465 µs; that's about the best I think I can do for now. I'm eager to receive a new FPU version that provides multi-register SPI moves, including byte-reversed transfers, soon.
The ZBasic code has also been simplified by an improvement to the determinant calculation, so it currently requires 1798 µs. You'll find the code up-thread.
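For anyone who doesn't want to dig up-thread, here is a rough sketch of the usual determinant-and-cofactor (adjugate) approach to a 3x3 inverse. It is Python for illustration only, not the ZBasic or FPU code timed above, and the names `inverse3` and `eps` are illustrative assumptions. The first-row cofactors are reused for the determinant, which is one common way to trim the arithmetic.

```python
def inverse3(m, eps=1e-12):
    """Invert a 3x3 matrix (list of three 3-element rows) via the adjugate.

    Illustrative sketch only; `eps` is an assumed singularity threshold.
    """
    (a, b, c), (d, e, f), (g, h, i) = m

    # First-row cofactors; they also give the determinant by row expansion.
    c00 = e * i - f * h
    c01 = f * g - d * i
    c02 = d * h - e * g
    det = a * c00 + b * c01 + c * c02
    if abs(det) < eps:
        raise ZeroDivisionError("matrix is singular or nearly so")

    k = 1.0 / det  # one division, then multiplies only
    # Inverse = adjugate / det, where the adjugate is the transposed cofactor matrix.
    return [
        [c00 * k, (c * h - b * i) * k, (b * f - c * e) * k],
        [c01 * k, (a * i - c * g) * k, (c * d - a * f) * k],
        [c02 * k, (b * g - a * h) * k, (a * e - b * d) * k],
    ]
```

Multiplying the result by the original matrix should give the identity to within float rounding, which is a handy sanity check on any port of this.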
MicroMega's uFPU
I'm finally getting around to applying the µM-FPU to a Kalman filter in an IMU, using a beta of the FPU microcode that provides several new instructions and expanded serial capability.
Raw NMEA data no longer needs to go through the processor. The FPU can directly buffer incoming NMEA and post status when a sentence is ready to be decoded. Decoding takes only one function call and makes the GPS data available in a series of registers. At the moment, I convert 11 fields from three sentence types.
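To give a feel for what "decoding a sentence" involves, here is a minimal host-side sketch in Python. It is not the FPU's decoder; the function names are illustrative assumptions. It relies only on the standard NMEA 0183 framing: the checksum is the XOR of the characters between `$` and `*`, fields are comma-separated, and latitude/longitude arrive as ddmm.mmmm / dddmm.mmmm.

```python
def nmea_checksum(body):
    """XOR of every character between '$' and '*'."""
    cs = 0
    for ch in body:
        cs ^= ord(ch)
    return cs


def parse_nmea(sentence):
    """Return the list of fields if the checksum matches, else None."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    body, _, given = sentence[1:].partition("*")
    try:
        if int(given[:2], 16) != nmea_checksum(body):
            return None
    except ValueError:  # checksum digits missing or not hex
        return None
    return body.split(",")


def ddmm_to_degrees(value, hemisphere):
    """Convert NMEA (d)ddmm.mmmm text to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    result = degrees + minutes / 60.0
    return -result if hemisphere in ("S", "W") else result
```

With a GGA sentence, `fields[2]` and `fields[3]` hold the latitude and its hemisphere, so `ddmm_to_degrees(fields[2], fields[3])` yields a value ready for navigation math.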
Any number of mixed-type registers can now be transferred in one call, in reverse byte order, directly into a series of mixed-type RAM vars in the processor; that takes about 1 ms, essentially independent of the register count. So, just two FPU instructions will get you current GPS data, ready to calculate. Of course, the data can remain on the FPU for navigation math there.
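Why the byte reversal matters can be shown with a small Python sketch. The assumptions here are mine for illustration: each register is 32 bits, the bytes arrive most-significant first, and the receiving processor stores values least-significant first. The layout string and function names are hypothetical, not part of any FPU API.

```python
import struct


def reverse_words(payload):
    """Reverse the bytes inside each 4-byte register, keeping register order."""
    if len(payload) % 4:
        raise ValueError("payload must be a whole number of 32-bit registers")
    return b"".join(payload[i:i + 4][::-1] for i in range(0, len(payload), 4))


def unpack_registers(payload, layout):
    """Decode MSB-first registers; layout has one char per register:
    'f' = 32-bit float, 'l' = 32-bit signed long (hypothetical convention)."""
    if len(payload) != 4 * len(layout):
        raise ValueError("layout does not match payload length")
    return struct.unpack(">" + layout, payload)
```

If the reversal is done during the transfer, the bytes can be dropped straight into consecutive RAM variables with no per-value fix-up afterwards, which is presumably why the cost barely depends on the register count.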
My 3x3 matrix multiplicative inverse FPU code now runs in 1405 µs, including an SPI transfer of nine floats.
FYI.
Tom
Tom,
I know you have done considerable development with the FPU. I see in the Yahoo forum that version 3 of both the FPU and the IDE has been released. When do you think the IDE will contain a code generator for either ZBasic or BasicX? Have you got any influence at MicroMega?
Any enlightenment will be appreciated.
Vic
Vic Fraenckel
KC2GUI
windswaytoo ATSIGN gmail DOT com
> ... Have you got any influence at MicroMega?
No more than any other user, I imagine.
No, neither ZBasic nor BasicX code is generated by the FPU IDE. I don't use the part that way, though, so I have not missed the feature. If it were available, it would offer drag-and-droppable processor source to call SPICmd with appropriate parameters to execute the function you need. It is intended to generate code that you would insert in your processor source, one FPU command at a time.
For processors like these that already have float functions onboard, calling a single external FPU function might make little sense unless that function is not available on the processor, like the matrix operations. With that method, data also needs to be moved to and from the FPU for each function. Still, for multiple float functions - a moderately complex algorithm, perhaps - transferring data and a series of instructions can be significantly faster than doing the same process onboard the processor, I've found.
Instead, I write FPU functions: a series of FPU commands that are stored on the part and executed via a single function call from the processor. When the function completes autonomously, it interrupts the processor, which makes another function call. For this, the IDE generates FPU code that is stored in FPU flash via its own debug serial port. For me, that works fine.
Tom
Are you still using external FPU devices for matrix inversions, or are you satisfied with the speed of the native chips? I'm preparing to implement a Kalman filter for my GBOT90 project to improve the IR sensor readings.
http://www.youtube.com/watch?v=bxx14Xe2iNg