Diffstat (limited to 'bsps/powerpc/shared/altivec/README')
-rw-r--r-- | bsps/powerpc/shared/altivec/README | 184 |
1 files changed, 184 insertions, 0 deletions
diff --git a/bsps/powerpc/shared/altivec/README b/bsps/powerpc/shared/altivec/README
new file mode 100644
index 0000000000..61ebb8dded
--- /dev/null
+++ b/bsps/powerpc/shared/altivec/README
@@ -0,0 +1,184 @@
+RTEMS ALTIVEC SUPPORT
+=====================
+
+1. History
+----------
+
+Altivec support was developed and maintained as a user-extension
+outside of RTEMS. This extension is still available (unbundled)
+from Till Straumann <strauman@slac.stanford.edu>; it is useful
+if an application desires 'lazy switching' of the altivec context.
+
+2. Modes
+--------
+
+Altivec support -- the unbundled extension, that is -- can be used
+in two ways:
+
+a. All tasks are implicitly AltiVec-enabled.
+
+b. Only designated tasks are AltiVec-enabled. 'Lazy-context switching'
+   is implemented to switch the AltiVec context.
+
+Note that the code implemented in this directory supports mode 'a'
+and mode 'a' ONLY. For mode 'b' you need the unbundled extension
+(which is completely independent of this code).
+
+Mode 'a' (All tasks are AltiVec-enabled)
+- - - - - - - - - - - - - - - - - - - - -
+
+The major disadvantage of this mode is that additional overhead is
+involved: tasks that never use the vector unit still save/restore
+the volatile vector registers (20 registers * 16 bytes each) across
+every interrupt and all non-volatile registers (12 registers * 16 bytes each)
+during every context switch.
+
+However, saving/restoring e.g., the volatile registers is quite
+fast -- on my 1GHz 7457 saving or restoring 20 vector registers
+takes only about 1us or even less (if there are cache hits).
+
+The advantage is complete transparency to the user and full ABI
+compatibility (except for ISRs and exception handlers), see below.
+
+Mode 'b' (Only dedicated tasks are AltiVec-enabled)
+- - - - - - - - - - - - - - - - - - - - - - - - - -
+
+The advantage of this mode of operation is that the vector-registers
+are only saved/restored when a different, altivec-enabled task becomes
+ready to run. In particular, if there is only a single altivec-enabled
+task then the altivec-context *never* is switched.
+
+Note that this mode of operation is not supported by the code
+in this directory -- you need the unbundled altivec extension
+mentioned above.
+
+3. Compiler Options
+-------------------
+
+Three compiler options affect AltiVec: -maltivec, -mabi=altivec and
+-mvrsave=yes/no.
+
+-maltivec: This lets the cpp define the symbol __ALTIVEC__ and enables
+           gcc to emit vector instructions. Note that gcc may use the
+           AltiVec engine implicitly, i.e., **without you writing any
+           vectorized code**.
+
+-mabi=altivec: This option has two effects:
+           i) It ensures 16-byte stack alignment required by AltiVec
+              (even in combination with eabi which is RTEMS' default).
+          ii) It allows vector arguments to be passed in vector registers.
+
+-mvrsave=yes/no: Instructs gcc to emit code which sets the VRSAVE register
+           indicating which vector registers are 'currently in use'.
+           Because the altivec support does not use this information *) the
+           option has no direct effect but it is desirable to compile with
+           -mvrsave=no so that no unnecessary code is generated.
+
+           *) The file vec_sup_asm.S conditionally disables usage of
+              the VRSAVE information if the preprocessor symbol
+              'IGNORE_VRSAVE' is defined, which is the default.
+
+              If 'IGNORE_VRSAVE' is undefined then the code *does*
+              use the VRSAVE information but I found that this does
+              not execute noticeably faster.
+
+IMPORTANT NOTES
+===============
+
+AFAIK, RTEMS uses the EABI which requires a stack alignment of only 8 bytes
+which is NOT enough for AltiVec (which requires 16-byte alignment).
+
+There are two ways for obtaining 16-byte alignment:
+
+I)  Compile with -mno-eabi (ordinary SYSV ABI has 16-byte alignment)
+II) Compile with -mabi=altivec (extension to EABI; maintains 16-byte alignment
+    but also allows for passing vector arguments in vector registers)
+
+Note that it is crucial to compile ***absolutely everything*** with the same
+ABI options (or a linker error may occur). In particular, this includes
+
+ - newlib multilib variant
+ - RTEMS proper
+ - application + third-party code
+
+IMO the proper compiler options for Mode 'a' would be
+
+    -maltivec -mabi=altivec -mvrsave=no
+
+Note that the -mcpu=7400 option also enables -maltivec and -mabi=altivec
+but leaves -mvrsave at some 'default' which is probably 'no'.
+Compiling with -mvrsave=yes does not produce incompatible code but
+may have a performance impact (since extra code is produced to maintain
+VRSAVE).
+
+4. Multilib Variants
+--------------------
+
+The default GCC configuration for RTEMS contains a -mcpu=7400 multilib
+variant which is the correct one to choose.
+
+5. BSP 'custom' file.
+---------------------
+
+Now that you have the necessary newlib and libgcc etc. variants
+you also need to build RTEMS accordingly.
+
+In your BSP's make/custom/<bsp>.cfg file make sure the CPU_CFLAGS
+select the desired variant:
+
+for mode 'a':
+
+   CPU_CFLAGS = ... -mcpu=7400
+
+Note that since -maltivec globally defines __ALTIVEC__ RTEMS automatically
+enables code that takes care of switching the AltiVec context as necessary.
+This is transparent to application code.
+
+6. BSP support
+--------------
+
+It is the BSP's responsibility to initialize MSR_VE, VSCR and VRSAVE
+during early boot, ideally before any C-code is executed (because it
+may, theoretically, use vector instructions).
+
+The BSP must
+
+ - set MSR_VE
+ - clear VRSAVE; note that the probing algorithm for detecting
+   whether -mvrsave=yes or 'no' was used relies on the BSP
+   clearing VRSAVE during early start. Since no interrupts or
+   context switches happen before the AltiVec support is initialized
+   clearing VRSAVE is no problem even if it turns out that -mvrsave=no
+   was in effect (eventually a value of all-ones will be stored
+   in VRSAVE in this case).
+ - clear VSCR
+
+7. PSIM note
+------------
+
+PSIM supports the AltiVec instruction set with the exception of
+the 'data stream' instructions for cache prefetching. The RTEMS
+altivec support includes run-time checks to skip these instructions
+when executing on PSIM.
+
+Note that AltiVec support within PSIM must be enabled at 'configure'
+time by passing the 'configure' option
+
+--enable-sim-float=altivec
+
+Note also that PSIM's AltiVec support has many bugs. It is recommended
+to apply the patches filed as an attachment with gdb bug report #2461
+prior to building PSIM.
+
+The CPU type and corresponding multilib must be changed when
+building RTEMS/psim:
+
+   edit make/custom/psim.cfg and change
+
+   CPU_CFLAGS = ... -mcpu=603e
+
+   to
+
+   CPU_CFLAGS = ... -mcpu=7400
+
+This change must be performed *before* configuring RTEMS/psim.
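
Illustration (not part of the patch): the README's mode 'a' overhead figures and its two build requirements, the __ALTIVEC__ symbol and 16-byte alignment, can be restated as a small C sketch. All function and constant names below are hypothetical and do not exist in RTEMS; only the register counts, the 16-byte register size, the alignment requirement and the __ALTIVEC__ symbol are taken from the README text.

```c
#include <stddef.h>
#include <stdint.h>

/* Figures quoted in the README: 20 volatile and 12 non-volatile
 * vector registers, each 16 bytes wide. Names are hypothetical. */
enum {
  VEC_REG_BYTES        = 16,
  VOLATILE_VEC_REGS    = 20,
  NONVOLATILE_VEC_REGS = 12
};

/* Bytes saved/restored across every interrupt in mode 'a'. */
static size_t altivec_irq_save_bytes(void)
{
  return (size_t)VOLATILE_VEC_REGS * VEC_REG_BYTES;
}

/* Bytes saved/restored during every context switch in mode 'a'. */
static size_t altivec_ctx_switch_save_bytes(void)
{
  return (size_t)NONVOLATILE_VEC_REGS * VEC_REG_BYTES;
}

/* Returns 1 when the translation unit was compiled with -maltivec
 * (the preprocessor then defines __ALTIVEC__), otherwise 0. */
static int built_with_altivec(void)
{
#ifdef __ALTIVEC__
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 when the address satisfies the 16-byte alignment that
 * AltiVec requires, otherwise 0. */
static int is_altivec_aligned(const void *p)
{
  return ((uintptr_t)p & (uintptr_t)(VEC_REG_BYTES - 1)) == 0;
}
```

With these figures the per-interrupt save area is 320 bytes and the per-context-switch save area is 192 bytes. A BSP or application could call is_altivec_aligned() on the address of a local variable as a quick sanity check that the chosen ABI options really produce 16-byte stack alignment; this is a suggestion for experimentation, not something the RTEMS code is known to do.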