summaryrefslogtreecommitdiffstats
path: root/bsps/powerpc/shared/altivec/README
diff options
context:
space:
mode:
Diffstat (limited to 'bsps/powerpc/shared/altivec/README')
-rw-r--r--bsps/powerpc/shared/altivec/README184
1 files changed, 184 insertions, 0 deletions
diff --git a/bsps/powerpc/shared/altivec/README b/bsps/powerpc/shared/altivec/README
new file mode 100644
index 0000000000..61ebb8dded
--- /dev/null
+++ b/bsps/powerpc/shared/altivec/README
@@ -0,0 +1,184 @@
+RTEMS ALTIVEC SUPPORT
+=====================
+
+1. History
+----------
+
+Altivec support was developed and maintained as a user-extension
+outside of RTEMS. This extension is still available (unbundled)
+from Till Straumann <strauman@slac.stanford.edu>; it is useful
+if an application desires 'lazy switching' of the altivec context.
+
+2. Modes
+--------
+
+Altivec support -- the unbundled extension, that is -- can be used
+in two ways:
+
+a. All tasks are implicitly AltiVec-enabled.
+
+b. Only designated tasks are AltiVec-enabled. 'Lazy-context switching'
+ is implemented to switch AltiVec the context.
+
+Note that the code implemented in this directory supports mode 'a'
+and mode 'a' ONLY. For mode 'b' you need the unbundled extension
+(which is completely independent of this code).
+
+Mode 'a' (All tasks are AltiVec-enabled)
+- - - - - - - - - - - - - - - - - - - - -
+
+The major disadvantage of this mode is that additional overhead is
+involved: tasks that never use the vector unit still save/restore
+the volatile vector registers (20 registers * 16bytes each) across
+every interrupt and all non-volatile registers (12 registers * 16b each)
+during every context switch.
+
+However, saving/restoring e.g., the volatile registers is quite
+fast -- on my 1GHz 7457 saving or restoring 20 vector registers
+takes only about 1us or even less (if there are cache hits).
+
+The advantage is complete transparency to the user and full ABI
+compatibility (exept for ISRs and exception handlers), see below.
+
+Mode 'b' (Only dedicated tasks are AltiVec-enabled)
+- - - - - - - - - - - - - - - - - - - - - - - - - -
+
+The advantage of this mode of operation is that the vector-registers
+are only saved/restored when a different, altivec-enabled task becomes
+ready to run. In particular, if there is only a single altivec-enabled
+task then the altivec-context *never* is switched.
+
+Note that this mode of operation is not supported by the code
+in this directory -- you need the unbundled altivec extension
+mentioned above.
+
+3. Compiler Options
+-------------------
+
+Three compiler options affect AltiVec: -maltivec, -mabi=altivec and
+-mvrsave=yes/no.
+
+-maltivec: This lets the cpp define the symbol __ALTIVEC__ and enables
+ gcc to emit vector instructions. Note that gcc may use the
+ AltiVec engine implicitly, i.e., **without you writing any
+ vectorized code**.
+
+-mabi=altivec: This option has two effects:
+ i) It ensures 16-byte stack alignment required by AltiVec
+ (even in combination with eabi which is RTEMS' default).
+ ii) It allows vector arguments to be passed in vector registers.
+
+-mvrsave=yes/no: Instructs gcc to emit code which sets the VRSAVE register
+ indicating which vector registers are 'currently in use'.
+ Because the altivec support does not use this information *) the
+ option has no direct affect but it is desirable to compile with
+ -mvrsave=no so that no unnecessary code is generated.
+
+ *) The file vec_sup_asm.S conditionally disables usage of
+ the VRSAVE information if the preprocessor symbol
+ 'IGNORE_VRSAVE' is defined, which is the default.
+
+ If 'IGNORE_VRSAVE' is undefined then the code *does*
+ use the VRSAVE information but I found that this does
+ not execute noticeably faster.
+
+IMPORTANT NOTES
+===============
+
+AFAIK, RTEMS uses the EABI which requires a stack alignment of only 8 bytes
+which is NOT enough for AltiVec (which requires 16-byte alignment).
+
+There are two ways for obtaining 16-byte alignment:
+
+I) Compile with -mno-eabi (ordinary SYSV ABI has 16-byte alignment)
+II) Compile with -mabi=altivec (extension to EABI; maintains 16-byte alignment
+ but also allows for passing vector arguments in vector registers)
+
+Note that it is crucial to compile ***absolutely everything*** with the same
+ABI options (or a linker error may occur). In particular, this includes
+
+ - newlibc multilib variant
+ - RTEMS proper
+ - application + third-party code
+
+IMO the proper compiler options for Mode 'a' would be
+
+ -maltivec -mabi=altivec -mvrsave=no
+
+Note that the -mcpu=7400 option also enables -maltivec and -mabi=altivec
+but leaves -mvrsave at some 'default' which is probably 'no'.
+Compiling with -mvrsave=yes does not produce incompatible code but
+may have a performance impact (since extra code is produced to maintain
+VRSAVE).
+
+4. Multilib Variants
+--------------------
+
+The default GCC configuration for RTEMS contains a -mcpu=7400 multilib
+variant which is the correct one to choose.
+
+5. BSP 'custom' file.
+---------------------
+
+Now that you have the necessary newlib and libgcc etc. variants
+you also need to build RTEMS accordingly.
+
+In you BSP's make/custom/<bsp>.cfg file make sure the CPU_CFLAGS
+select the desired variant:
+
+for mode 'a':
+
+ CPU_CFLAGS = ... -mcpu=7400
+
+Note that since -maltivec globally defines __ALTIVEC__ RTEMS automatially
+enables code that takes care of switching the AltiVec context as necessary.
+This is transparent to application code.
+
+6. BSP support
+--------------
+
+It is the BSP's responsibility to initialize MSR_VE, VSCR and VRSAVE
+during early boot, ideally before any C-code is executed (because it
+may, theoretically, use vector instructions).
+
+The BSP must
+
+ - set MSR_VE
+ - clear VRSAVE; note that the probing algorithm for detecting
+ whether -mvrsave=yes or 'no' was used relies on the BSP
+ clearing VRSAVE during early start. Since no interrupts or
+ context switches happen before the AltiVec support is initialized
+ clearing VRSAVE is no problem even if it turns out that -mvrsave=no
+ was in effect (eventually a value of all-ones will be stored
+ in VRSAVE in this case).
+ - clear VSCR
+
+7. PSIM note
+------------
+
+PSIM supports the AltiVec instruction set with the exception of
+the 'data stream' instructions for cache prefetching. The RTEMS
+altivec support includes run-time checks to skip these instruction
+when executing on PSIM.
+
+Note that AltiVec support within PSIM must be enabled at 'configure'
+time by passing the 'configure' option
+
+--enable-sim-float=altivec
+
+Note also that PSIM's AltiVec support has many bugs. It is recommended
+to apply the patches filed as an attachment with gdb bug report #2461
+prior to building PSIM.
+
+The CPU type and corresponding multilib must be changed when
+building RTEMS/psim:
+
+ edit make/custom/psim.cfg and change
+
+ CPU_CFLAGS = ... -mcpu=603e
+
+ to
+
+ CPU_CFLAGS = ... -mcpu=7400
+
+This change must be performed *before* configuring RTEMS/psim.