Stack Size

dkinzer
Site Admin
Posts: 3120
Joined: 03 September 2005, 13:53 PM
Location: Portland, OR

Post by dkinzer »

mikep wrote:How? What about an explicit placement of the heap, i.e. defining the size of RAM that can be used?
The heap limit is the demarcation point between the Main() task stack and the heap. "Option HeapSize &Hffff" is treated as a special case: it places the heap limit at the first byte of external RAM, so the end of the Main() TCB falls right at the end of internal RAM. This is only useful, of course, for ZX devices that support external RAM.

Otherwise, "Option HeapSize" is used to specify the size of the heap, in bytes. The end of the task stack for Main() is then located that many bytes from the end of RAM (internal + external, if any).

There are other methods to control the division of RAM between the Main() task stack and the heap. "Option MainTaskStackSize" is used if you want to explicitly specify the size of the Main() task stack and dedicate the remaining RAM to the heap. Lastly, "Option HeapLimit" can be used to specify the address of the heap limit. This is likely to be seldom used but may be useful in special situations. There are also command line options that correspond to these directives.
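The three directives can be read as three ways of fixing the same boundary. Here is a minimal sketch in C of that arithmetic. It assumes free RAM is one contiguous range with the Main() task stack at the low end and the heap at the high end. The type, the function names and the idea of a single "free_start" address are illustrative assumptions, not the ZBasic runtime's own code, and the special &Hffff case is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of free RAM: [free_start, ram_end), with the Main()
   task stack at the low end and the heap at the high end. */
typedef struct {
    uint32_t free_start; /* assumed: first byte available to the Main() task stack */
    uint32_t ram_end;    /* one past the last byte of RAM (internal + external) */
} ram_layout;

/* "Option HeapSize n": the heap is the last n bytes of RAM. */
uint32_t heap_limit_from_heap_size(ram_layout r, uint32_t heap_size)
{
    assert(heap_size <= r.ram_end - r.free_start);
    return r.ram_end - heap_size;
}

/* "Option MainTaskStackSize n": the stack gets n bytes, the heap the rest. */
uint32_t heap_limit_from_stack_size(ram_layout r, uint32_t stack_size)
{
    assert(stack_size <= r.ram_end - r.free_start);
    return r.free_start + stack_size;
}

/* "Option HeapLimit a": the boundary is given directly. */
uint32_t heap_limit_from_address(ram_layout r, uint32_t address)
{
    assert(address >= r.free_start && address <= r.ram_end);
    return address;
}
```

With made-up addresses, free RAM from 0x0200 up to 0x1100 and a 256-byte heap put the limit at 0x1000, while a 768-byte Main() task stack puts it at 0x0500.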

The division of RAM between the Main() task stack and the heap is done in the call to taskStartMain(), which is the last instruction in the generated C main() function. That system function prepares the Main() task (whose name is zf_main() or some variation thereof) for execution.
- Don Kinzer
mikep
Posts: 796
Joined: 24 September 2005, 15:54 PM

Post by mikep »

dkinzer wrote:The heap limit is the demarcation point between the Main() task stack and the heap....
I really should read the provided documentation :oops:
Mike Perks
stevech
Posts: 715
Joined: 22 February 2006, 20:56 PM

Post by stevech »

This topic of stack use in the native compiler warrants a concise explanation and guidelines in the documentation, so as to avoid numerous self-inflicted bugs.

Post by mikep »

mikep wrote:I added a call to TaskHeadroom on every iteration. Here is a table of stack sizes and headroom:
Stack Size | Reported TaskHeadroom     | Minimum Size
-----------+---------------------------+-------------
53         | 0, then 65535, then hang  | N/A
54         | Hang                      | N/A
55         | 0                         | 55
60         | 7, then goes down to 5    | 55
70         | 15                        | 55
80         | 27                        | 53
90         | 35                        | 55
100        | Hang                      | unknown
110        | 55                        | 55
I changed manytask.bas to use just a single task and it reported consistently that it needed 53 bytes for the stack. Here is my guess of what is on the stack:
  • TCB - 12 bytes
  • Context - 35 bytes
  • Parameter - 2 bytes
  • 2 register pushes - 2 bytes (derived from generated object code)
  • Counter local variable - 1 byte
That makes a total of 52 bytes. From this you might conclude that the RTC ISR is not using this stack. However, that doesn't seem right either. So something is wrong here but I just haven't figured out what it is.

(BTW, if you are wondering why I'm pursuing this, it is because I believe there might be a problem somewhere.)
Mike Perks

Post by dkinzer »

mikep wrote:[S]omething is wrong here but I just haven't figured out what it is.
Upon reflection, it occurred to me that the task context (35 bytes) and the ISR stack use are not additive. As long as ISR stack use remains below the 35 bytes needed to store the context, it needn't be considered.

This is so because both the context storage and the ISR stack use sit on top of the normal call/return stack use, but the two are never required at the same time.
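Put as arithmetic, the rule is "take the larger of the two, not the sum". A minimal sketch in C follows. The function names are made up for illustration, and the 35-byte context size is simply the figure quoted in this thread.

```c
#include <stdint.h>

#define TASK_CONTEXT_BYTES 35u /* figure quoted in the thread */

/* Bytes needed on top of the task's normal call/return stack use.
   The saved context and ISR stack use never coexist, so the larger
   of the two is what matters, not their sum. */
uint16_t overlay_bytes(uint16_t max_isr_bytes)
{
    return max_isr_bytes > TASK_CONTEXT_BYTES ? max_isr_bytes
                                              : (uint16_t)TASK_CONTEXT_BYTES;
}

/* Worst-case task stack: base use (TCB, parameters, saved registers,
   locals, nested return addresses) plus the overlay. */
uint16_t worst_case_stack(uint16_t base_bytes, uint16_t max_isr_bytes)
{
    return (uint16_t)(base_bytes + overlay_bytes(max_isr_bytes));
}
```

With the 17 bytes of base use from the breakdown quoted above, any ISR that needs 35 bytes or fewer leaves the total at 52. An ISR that needs 40 bytes would raise it to 57.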
mikep wrote:I believe there might be a problem somewhere.
I agree that the unexpected behavior needs further investigation. Perhaps you would post the code that you're using so that I can try to reproduce the behavior.
- Don Kinzer

Post by mikep »

Now that I know the ISRs and the saved context reuse the same stack space, my calculation of 52 bytes comes in below the size calculated from TaskHeadroom().

The testcase I have been using is your native version of manytask.bas. There is another problem I'm seeing and will post in a new thread once I have investigated it more.
Mike Perks

Post by dkinzer »

mikep wrote:[M]y calculation of 52 bytes comes in below the size calculated from TaskHeadroom().
Invoking System.TaskHeadRoom() uses 7 bytes of stack space which introduces a 7-byte uncertainty in the results of the calculation. That is, the stack space used by the invocation may or may not have been used prior to the call.

After reviewing the code, I can see a way to reduce the uncertainty to 2 bytes in the case where the head room of the current task is being examined, i.e. System.TaskHeadRoom() is invoked with no task stack explicitly specified. The extra stack space is needed in the general case to validate the task stack prior to checking the head room.
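One way to read that uncertainty: the measuring call may itself have set the high-water mark, so the task's peak stack use without the call lies in a range rather than at a single value. A minimal sketch in C, where the type and function names are hypothetical and the call cost is passed in as a parameter:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t low;
    uint16_t high;
} byte_range;

/* Peak stack use excluding the measuring call itself lies between
   (size - reported - call_cost) and (size - reported). */
byte_range peak_use_range(uint16_t stack_size, uint16_t reported_headroom,
                          uint16_t call_cost)
{
    assert(reported_headroom <= stack_size);
    uint16_t high = (uint16_t)(stack_size - reported_headroom);
    uint16_t low = high > call_cost ? (uint16_t)(high - call_cost) : 0;
    byte_range r = { low, high };
    return r;
}
```

For an 80-byte stack with a reported head room of 27 (a row from mikep's table), a 7-byte call cost gives 46 to 53 bytes. A 2-byte cost would narrow that to 51 to 53.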
- Don Kinzer

Post by stevech »

It dawns on me that locals (automatics) are probably allocated not from the AVR's stack, but rather, from storage (heap?).

Would this be true too for automatics declared in an ISR that does pushes to save the state of the CPU for the interrupted code?

Post by dkinzer »

stevech wrote:locals (automatics) are probably allocated not from the AVR's stack, but rather, from storage (heap?).
The avr-gcc compiler tries to keep local variables in registers. If the size of the local data is too large (or if the "address of" operator is used on any local data item), space is allocated from the AVR stack (i.e. the ZBasic task's stack) for the local data.

Which registers the compiler chooses to use depends on the code and whether it calls other functions or not. Since a called function is allowed to modify r0, r18-r27 and r30-r31, those registers either need to be avoided or saved/restored before/after the call. If registers r1-r17, r28 or r29 are used within a function, they must be saved at the beginning of the function and restored at the end.
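The register split can be written down as a small lookup, which also shows where entry pushes come from. This is only a sketch in C that follows the split exactly as stated above. The function names are hypothetical, and it counts one stack byte per saved register because AVR registers are 8 bits wide.

```c
#include <stdbool.h>

/* Split of the AVR's r0..r31 as described in the post above. */

/* A called function may modify these without restoring them. */
bool is_call_clobbered(int reg)
{
    return reg == 0 || (reg >= 18 && reg <= 27) || reg == 30 || reg == 31;
}

/* A function that uses these must save them on entry and restore them on exit. */
bool is_callee_saved(int reg)
{
    return (reg >= 1 && reg <= 17) || reg == 28 || reg == 29;
}

/* Stack bytes spent on entry pushes for the registers a function uses
   (one byte per saved register). */
int prologue_push_bytes(const int *regs_used, int count)
{
    int bytes = 0;
    for (int i = 0; i < count; i++) {
        if (is_callee_saved(regs_used[i])) {
            bytes++;
        }
    }
    return bytes;
}
```

For example, a function that uses r24, r25, r28 and r29 would need two pushes on entry under this split, costing 2 bytes of the task's stack.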

These complexities prevent the ZBasic compiler from making an estimate of stack use for a task. It may be possible to analyze the generated code to determine stack use - a subject that will require further research.
- Don Kinzer

Post by stevech »

Sorry, Don - I am having a slow brain day...
dkinzer wrote:Upon reflection, it occurred to me that the task context (35 bytes) and the ISR stack use are not additive. As long as ISR stack use remains below the 35 bytes needed to store the context, it needn't be considered.
If ZBasic Native relies on C's storage allocator for automatics - this being the same stack pointer (or CPU registers) and RAM that is used for interrupts, call/return and push/pop - then my confusion is as follows. The stack space for any task would have to have space for the worst-case nesting of function calls in that task plus space for all the stack that the ISR with the largest RAM need uses. And the ISR could call functions that have automatics. Just ordinary stuff here.

Post by dkinzer »

stevech wrote:The stack space for any task would have to have space for the worst-case nesting of function calls in that task plus space for all the stack that the ISR with the largest RAM need uses.
That is all true. My point was that the stack space required for the task context, saved while a task is suspended, doesn't add to the extra space required for the set of possible ISRs. Rather, the task switch can be considered to be yet another "ISR" with a maximum stack use of 35 bytes. It is only if some other (user-supplied) ISR requires more than 35 bytes that the effect of those ISRs needs to be considered.
stevech wrote:And the ISR could call functions that have automatics.
In general, yes. The system-provided ISRs don't do so and nearly all of them are written in assembly language.

In the final analysis, while this information is useful for deriving a strategy to reduce stack use, the only metric currently available is the data returned by System.TaskHeadRoom(). The caveat is that the data is only useful if the task has executed through the worst-case scenario prior to the call.
- Don Kinzer

Post by mikep »

dkinzer wrote:Invoking System.TaskHeadRoom() uses 7 bytes of stack space which introduces a 7-byte uncertainty in the results of the calculation. That is, the stack space used by the invocation may or may not have been used prior to the call.
I made two mistakes in my calculation above. The counter local variable is stored in a register. But I forgot the 2 byte return address for the call to taskSleep(). My calculation now adds up to 53 bytes as follows:
  • TCB - 12 bytes
  • Task() function parameter - 2 bytes
  • Task() 2 register pushes - 2 bytes (derived from generated object code)
  • Nested function call - 2 bytes
  • Context - 35 bytes
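The same sum, written out as a small C table so the items can be adjusted for other tasks. The figures are the ones listed above. The struct and function names are just for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *what;
    uint16_t bytes;
} stack_item;

/* Figures from the corrected breakdown in this post. */
static const stack_item task_stack_items[] = {
    { "TCB", 12 },
    { "Task() function parameter", 2 },
    { "Task() register pushes", 2 },
    { "Nested call return address", 2 },
    { "Saved task context", 35 },
};

/* Adds up the byte counts of all items in the table. */
uint16_t total_stack_bytes(const stack_item *items, size_t count)
{
    uint16_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total = (uint16_t)(total + items[i].bytes);
    }
    return total;
}
```

Passing task_stack_items and its element count to total_stack_bytes() gives 53, matching the single-task result from TaskHeadroom().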
The diagram below shows how I think the stack is used for various scenarios. The common stack usage for each scenario is the TCB, task function parameters (2 bytes in this case), saved registers (2 bytes in this case), and local variables (0 bytes in this case):
  • The scenario on the left shows a call to the sleep() function which requires a return address and the saving of the task context.
  • The scenario in the middle shows what happens when an ISR is invoked while in the middle of the function execution.
  • The scenario on the right shows what happens when the task calls another function (like sleep) and then an ISR is invoked.
I haven't quite worked out yet what happens when the task is saving its context and an ISR comes in. Presumably this could increase the stack size still further. Ideally interrupts should be turned off while saving one task context, searching for the next runnable task, and restoring another task context. However this might take too much time.
Attachment: Native Mode Task Stack Usage (native_mode_stack_usage.jpg)
Mike Perks

Post by dkinzer »

mikep wrote:I haven't quite worked out yet what happens when the task is saving its context and an ISR comes in.
This will never occur because interrupts are disabled early in the task switching process.

A task switch can occur in the following scenarios:
  • An RTC timer interrupt has occurred. In this case, interrupts are already disabled. If another task is ready to run the task switch is performed, otherwise control returns to the current task.
  • A hardware interrupt has occurred. Interrupts are also already disabled in this case. If a task is awaiting the interrupt a task switch is performed. Otherwise, control returns to the current task.
  • The current task calls Sleep(), Delay(), Yield() or RunTask(). In this case, interrupts are disabled at the outset of the switching process.
- Don Kinzer

Post by mikep »

dkinzer wrote:This will never occur because interrupts are disabled early in the task switching process
Which is what I thought, as many other embedded multi-tasking OSes work that way. Is the diagram correct too? The diagram might be helpful for advanced users now that we know how it works. It can also be used to provide a clearer explanation of how the stack is used and what TaskHeadroom() might return.

Reducing the stack overhead for System.TaskHeadroom() with no parameters from 7 to 2 bytes is also valuable.
Mike Perks