| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The window underflow trap handler used %i5 which destroyed the %o5 of
the calling context. Bug introduced by
0d3b5d47429effb350448d9e9123a67db722109f.
Go back to the pre 0d3b5d47429effb350448d9e9123a67db722109f behaviour
and use the two unused instructions in the trap vector to optimize a
bit.
Close #2651.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Coding style cleanups.
* Use OS reserved trap 0x89 for IRQ Disable
* Use OS reserved trap 0x8A for IRQ Enable
* Add to SPARC CPU supplement documentation
This will result in faster Disable/Enable code since the
system trap handler does not need to decode which function
the user wants. Besides the IRQ disable/enabled can now
be inline which avoids the caller to take into account that
o0-o7+g1-g4 registers are destroyed by trap handler.
It was also possible to reduce the interrupt trap handler by
five instructions due to this.
|
|
|
|
|
|
|
|
|
|
|
| |
Save five instructions on underflow handling.
By using an optimized trap entry we can move instructions from
the window underflow function into the trap entry vector. By
setting WIM=0 and using RESTORE it is possible to move the
new WIM register content from the trapped window into the
to-be-restored register window. It is then possible to avoid
the WIM write delay.
|
|
|
|
|
| |
I see no need for waiting the 3 instruction delay for wim to be
written in this case, since the STD after does not depend on WIM
|
|
|
|
|
| |
Now that a SPARC fatal handler is defined, we no longer
need the BSP specific reset routine.
|
|
|
|
|
|
|
|
| |
Instead of calling the system call TA instruction directly it
is better paractise to isolate the trap implementation to the
system call functions.
BSP_fatal_exit() is added.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The last optimization missed was incorrect in regards to
PSR write instruction delay must be 3 instructions.
New optimizations:
* align to 32-byte cache line.
* rearrange code into three "blocks" of 4 instructions that
is executed by syscall 2 and 3. This is to optimize for
16/32 byte cache lines.
* use delay-slot instruction in trap table to reduce by one
instruction.
* use the fact that "wr %PSR" implements XOR to reduce by
one instruction.
|
|
|
|
|
|
|
|
| |
Use __bss_start available via %g2 to clear the BSS section. The usage
of _edata resulted in a copy of [_edata, __bss_start) from ROM to RAM
and then a clear to zero of this area.
Clear now only [__bss_start, _end).
|
|
|
|
|
|
| |
Use the register %g4 for the data content since it must be an even
numbered register due to the std/ldd. Use the register %g2 for the BSS
start address, so that it can be later re-used for the BSS zero loop.
|
|
|
|
| |
Use a standard function for startup on secondary processors.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use register g6 for the per-CPU control of the current processor. The
register g6 is reserved for the operating system by the SPARC ABI. On
Linux register g6 is used for a similar purpose with the same method
since 1996.
The register g6 must be initialized during system startup and then must
remain unchanged.
Since the per-CPU control is used in all critical sections of the
operating system, this is a performance optimization for the operating
system core procedures. An additional benefit is that the low-level
context switch and interrupt processing code is now identical on non-SMP
and SMP configurations.
|
| |
|
| |
|
|
|
|
| |
Add _LEON3_Get_current_processor().
|
|
|
|
|
|
|
| |
Avoid usage of the same stack area by multiple secondary processors at
the same time.
Avoid magic delay loops.
|
|
|
|
| |
Add doxygen to the header files in sparc/shared/include directory.
|
|
|