@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@ifinfo
@end ifinfo
@chapter Texas Instruments C3x/C4x Specific Information

The Real Time Executive for Multiprocessor Systems (RTEMS) is designed to be portable across multiple processor architectures. However, the nature of real-time systems makes it essential that the application designer understand certain processor dependent implementation details. These processor dependencies include calling convention, board support package issues, interrupt processing, exact RTEMS memory requirements, performance data, header files, and the assembly language interface to the executive.

This document discusses the Texas Instruments C3x/C4x architecture dependencies in this port of RTEMS. The C3x/C4x family has a wide variety of CPU models within it. The following CPU model numbers could be supported by this port:

@itemize
@item C30 - TMSXXX
@item C31 - TMSXXX
@item C32 - TMSXXX
@item C41 - TMSXXX
@item C44 - TMSXXX
@end itemize

Initially, this port does not include full support for C4x models. Primarily, the C4x specific implementations of interrupt flag and mask management routines have not been completed.

It is highly recommended that the RTEMS application developer obtain and become familiar with the documentation for the processor being used as well as the documentation for the family as a whole.

@subheading Architecture Documents

For information on the Texas Instruments C3x/C4x architecture, refer to the following documents available from VENDOR (@file{http://www.ti.com/}):

@itemize @bullet
@item @cite{XXX Family Reference, Texas Instruments, PART NUMBER}.
@end itemize

@subheading Model Specific Documents

For information on specific processor models and their associated coprocessors, refer to the following documents:

@itemize @bullet
@item @cite{XXX MODEL Manual, Texas Instruments, PART NUMBER}.
@item @cite{XXX MODEL Manual, Texas Instruments, PART NUMBER}.
@end itemize

@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@section CPU Model Dependent Features

Microprocessors are generally classified into families with a variety of CPU models or implementations within that family. Within a processor family, there is a high level of binary compatibility. This family may be based on either an architectural specification or on maintaining compatibility with a popular processor. Recent microprocessor families such as the SPARC or PowerPC are based on an architectural specification which is independent of any particular CPU model or implementation. Older families such as the M68xxx and the iX86 evolved as the manufacturer strove to produce higher performance processor models which maintained binary compatibility with older models.

RTEMS takes advantage of the similarity of the various models within a CPU family. Although the models do vary in significant ways, the high level of compatibility makes it possible to share the bulk of the CPU dependent executive code across the entire family.

Each processor family supported by RTEMS has a list of features which vary between CPU models within a family. For example, the most common model dependent feature regardless of CPU family is the presence or absence of a floating point unit or coprocessor. When defining the list of features present on a particular CPU model, one simply notes that floating point hardware is or is not present and defines a single constant appropriately. Conditional compilation is utilized to include the appropriate source code for this CPU model's feature set. As a result, RTEMS is compiled with the feature set and compilation flags best suited to the CPU model in use.
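The single-constant approach can be sketched in C as follows. This is only an illustration under stated assumptions: the macro @code{DEMO_CPU_HAS_FPU}, the type @code{demo_context}, and both helper functions are hypothetical names invented for this sketch and are not part of RTEMS or of this port.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature constant, for illustration only. A real port
 * derives such constants from the CPU model selected at build time. */
#ifndef DEMO_CPU_HAS_FPU
#define DEMO_CPU_HAS_FPU 0
#endif

/* Saved task context: the floating point area is compiled in only
 * when the feature constant says the hardware is present. */
typedef struct {
  unsigned int integer_regs[8];
#if DEMO_CPU_HAS_FPU
  double fp_regs[8];
#endif
} demo_context;

/* Number of register slots saved per task in this build. */
static size_t demo_context_slots(void)
{
#if DEMO_CPU_HAS_FPU
  return 16;
#else
  return 8;
#endif
}

/* Clear a context before its first use. */
static void demo_context_init(demo_context *ctx)
{
  size_t i;

  assert(ctx != NULL);
  for (i = 0; i < 8; i++)
    ctx->integer_regs[i] = 0;
#if DEMO_CPU_HAS_FPU
  for (i = 0; i < 8; i++)
    ctx->fp_regs[i] = 0.0;
#endif
}
```

Building the same source with the constant set to 1 would add the floating point area and the code that manages it, while a build with 0 carries neither.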
The alternative would be to generate a binary which would execute on all family members using only the features which were always present.

This chapter presents the set of features which vary across the various implementations of the C3x/C4x architecture that are of importance to RTEMS. The set of CPU model feature macros is defined in the file @file{cpukit/score/cpu/c4x/rtems/score/c4x.h} and is based upon the particular CPU model defined in the BSP's custom configuration file as well as the compilation command line.

@subsection CPU Model Name

The macro @code{CPU_MODEL_NAME} is a string which designates the name of this CPU model. For example, for the C32 processor, this macro is set to the string "c32".

@subsection Floating Point Unit

The Texas Instruments C3x/C4x family makes little distinction between the various CPU registers. Although floating point operations may only be performed on a subset of the CPU registers, these same registers may be used for normal integer operations. As a result, this port of RTEMS makes no distinction between integer and floating point contexts. The routine @code{_CPU_Context_switch} saves all of the registers that comprise a task's context. The routines that initialize, save, and restore floating point contexts are not present in this port.

Moreover, there is no floating point context pointer and the code in @code{_Thread_Dispatch} that manages the floating point context switching process is disabled on this port.

This not only simplifies the port, it also speeds up context switches by reducing the code involved and reduces the code space footprint of the executive on the Texas Instruments C3x/C4x.

@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@section Calling Conventions

Each high-level language compiler generates subroutine entry and exit code based upon a set of rules known as the compiler's calling convention.
These rules address the following issues:

@itemize @bullet
@item register preservation and usage
@item parameter passing
@item call and return mechanism
@end itemize

A compiler's calling convention is of importance when interfacing to subroutines written in another language, either assembly or high-level. Even when the high-level language and target processor are the same, different compilers may use different calling conventions. As a result, calling conventions are both processor and compiler dependent.

The GNU Compiler Suite follows the same calling conventions as the Texas Instruments toolset.

@subsection Processor Background

The TI C3x and C4x processors support a simple yet effective call and return mechanism. A subroutine is invoked via the branch to subroutine (@code{XXX}) or the jump to subroutine (@code{XXX}) instructions. These instructions push the return address on the current stack. The return from subroutine (@code{XXX}) instruction pops the return address off the current stack and transfers control to that instruction. It is important to note that the call and return mechanism for the C3x/C4x does not automatically save or restore any registers. It is the responsibility of the high-level language compiler to define the register preservation and usage convention.

XXX other supplements may have "is is".

@subsection Calling Mechanism

All subroutines are invoked using either a @code{XXX} or @code{XXX} instruction and return to the user application via the @code{XXX} instruction.

@subsection Register Usage

XXX

As discussed above, the @code{XXX} and @code{XXX} instructions do not automatically save any registers. Subroutines use the registers @b{D0}, @b{D1}, @b{A0}, and @b{A1} as scratch registers. These registers are not preserved by subroutines; therefore, nothing should be assumed about their contents upon return from any subroutine call, including but not limited to an RTEMS directive.
The GNU and Texas Instruments compilers follow the same conventions for register usage.

@subsection Parameter Passing

Both the GNU and Texas Instruments compilers support two conventions for passing parameters to subroutines. Arguments may be passed in memory on the stack or in registers.

@subsubsection Parameters Passed in Memory

When passing parameters on the stack, the calling convention assumes that arguments are placed on the current stack before the subroutine is invoked via the @code{XXX} instruction. The first argument is assumed to be closest to the return address on the stack. This means that the first argument of the C calling sequence is pushed last. The following pseudo-code illustrates the typical sequence used to call a subroutine with three (3) arguments:

@example
@group
push third argument
push second argument
push first argument
invoke subroutine
remove arguments from the stack
@end group
@end example

The arguments to RTEMS are typically pushed onto the stack using a @code{sti} instruction with a pre-incremented stack pointer as the destination. These arguments must be removed from the stack after control is returned to the caller. This removal is typically accomplished by subtracting the size of the argument list in words from the current stack pointer.

@c XXX XXX instruction .. XXX should be code format.

With the GNU Compiler Suite, parameter passing via the stack is selected by invoking the compiler with the @code{-mmemparm XXX} argument. This argument must also be included when linking the application to ensure that support libraries compiled for parameter passing via the stack are used. The default parameter passing mechanism is XXX.

When this parameter passing mechanism is selected, the @code{XXX} symbol is predefined by the C and C++ compilers and the @code{XXX} symbol is predefined by the assembler. This behavior is the same for the GNU and Texas Instruments toolsets.
RTEMS uses these predefines to determine how parameters are passed in to those C3x/C4x specific routines that were written in assembly language.

@subsubsection Parameters Passed in Registers

When passing parameters via registers, the calling convention assumes that the arguments are placed in particular registers based upon their position and data type before the subroutine is invoked via the @code{XXX} instruction.

The following pseudo-code illustrates the typical sequence used to call a subroutine with three (3) arguments:

@example
@group
move third argument to XXX
move second argument to XXX
move first argument to XXX
invoke subroutine
@end group
@end example

With the GNU Compiler Suite, parameter passing via registers is selected by invoking the compiler with the @code{-mregparm XXX} argument. This argument must also be included when linking the application to ensure that support libraries compiled for parameter passing via registers are used. The default parameter passing mechanism is XXX.

When this parameter passing mechanism is selected, the @code{XXX} symbol is predefined by the C and C++ compilers and the @code{XXX} symbol is predefined by the assembler. This behavior is the same for the GNU and Texas Instruments toolsets. RTEMS uses these predefines to determine how parameters are passed in to those C3x/C4x specific routines that were written in assembly language.

@subsection User-Provided Routines

All user-provided routines invoked by RTEMS, such as user extensions, device drivers, and MPCI routines, must also adhere to these calling conventions.

@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@section Memory Model

A processor may support any combination of memory models ranging from pure physical addressing to complex demand paged virtual memory systems. RTEMS supports a flat memory model which ranges contiguously over the processor's allowable address space.
RTEMS does not support segmentation or virtual memory of any kind. The appropriate memory model for RTEMS provided by the targeted processor and related characteristics of that model are described in this chapter.

@subsection Byte Addressable versus Word Addressable

Processors in the Texas Instruments C3x/C4x family are word addressable. This is in sharp contrast to CISC and RISC processors that are typically byte addressable. In a word addressable architecture, each address points not to an 8-bit byte or octet but to 32 bits.

At first glance, byte versus word addressability does not sound like a problem, but in fact this issue can result in subtle problems in high-level language software that is ported to a word addressable processor family. The following is a list of the commonly encountered problems:

@table @b

@item String Optimizations
Although each character in a string occupies a single address just as it does on a byte addressable CPU, each character occupies 32 rather than 8 bits. The most significant 24 bits of each address are ignored. This in and of itself does not cause problems, but it violates the assumption that two adjacent characters in a string have no intervening bits. This assumption is often implicit in string and memory comparison routines that are optimized to compare 4 adjacent characters with a word oriented operation. This optimization is invalid on word addressable processors.

@item Sizeof
The C operation @code{sizeof} returns very different results on the C3x/C4x than on traditional RISC/CISC processors. The @code{sizeof(char)}, @code{sizeof(short)}, and @code{sizeof(int)} are all 1 since each occupies a single addressable unit that is thirty-two bits wide. On most thirty-two bit processors, @code{sizeof(char)} is one, @code{sizeof(short)} is two, and @code{sizeof(int)} is four.
Just as software that makes assumptions about the sizes of the primitive data types has problems when ported to a sixty-four bit architecture, these same assumptions cause problems on the C3x/C4x.

@item Alignment
Since each addressable unit is thirty-two bits wide, there are no alignment restrictions. The native integer type need only be aligned on a "one unit" boundary, not a "four unit" boundary as on numerous other processors.

@end table

@subsection Flat Memory Model

XXX check actual bits on the various processor families.

The XXX family supports a flat 32-bit address space with addresses ranging from 0x00000000 to 0xFFFFFFFF (4 gigabytes). Each address is represented by a 32-bit value and is byte addressable. The address may be used to reference a single byte, word (2-bytes), or long word (4 bytes). Memory accesses within this address space are performed in big endian fashion by the processors in this family.

@subsection Compiler Memory Models

The Texas Instruments C3x/C4x processors include a Data Page (@code{dp}) register that logically is a base address. The @code{dp} register allows the use of shorter offsets in instructions. Up to 64K words may be addressed using offsets from the @code{dp} register. In order to address words not addressable based on the current value of @code{dp}, the register must be loaded with a different value.

The @code{dp} register is managed automatically by the high-level language compilers. The various compilers for this processor family support two memory models that manage the @code{dp} register in very different manners. The large and small memory models are discussed in the following sections.

NOTE: The C3x/C4x port of RTEMS has been written so that it should support either memory model. However, it has only been tested using the large memory model.

@subsubsection Small Memory Model

The small memory model is the simplest and most efficient. However, it includes a limitation that makes it inappropriate for numerous applications.
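The 64K word reach of the @code{dp} register can be made concrete with a little address arithmetic. The sketch below assumes that data pages are aligned blocks of 64K words, so that a word address splits into a page number and a 16-bit offset; that layout is an assumption of this example rather than something stated in this supplement, and all of the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one data page spans 64K words and pages are aligned. */
#define DP_PAGE_WORDS 0x10000u

/* Hypothetical helper: page that dp would have to select for a word address. */
static uint32_t dp_page(uint32_t word_addr)
{
  return word_addr / DP_PAGE_WORDS;
}

/* Hypothetical helper: offset of a word address within its page. */
static uint32_t dp_offset(uint32_t word_addr)
{
  return word_addr % DP_PAGE_WORDS;
}

/* Rebuild a word address from a page number and an in-page offset. */
static uint32_t dp_address(uint32_t page, uint32_t offset)
{
  assert(offset < DP_PAGE_WORDS);
  return page * DP_PAGE_WORDS + offset;
}

/* True when both addresses are reachable without reloading dp. */
static bool dp_same_page(uint32_t a, uint32_t b)
{
  return dp_page(a) == dp_page(b);
}
```

Under this assumption, two data items for which @code{dp_same_page} is false cannot both be reached through one @code{dp} value, which is the situation the two memory models handle differently.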
The small memory model assumes that the application needs to access no more than 64K words. Thus the @code{dp} register can be loaded at application start time and never reloaded, and the compiler will not even generate instructions to load the @code{dp}. This can significantly reduce the code space required by an application, but the application is limited in the amount of data it can access.

With the GNU Compiler Suite, small memory model is selected by invoking the compiler with either the @code{-msmall} or @code{-msmallmemoryXXX} argument. This argument must also be included when linking the application to ensure that support libraries compiled for the small memory model are used. The default memory model is XXX.

When this memory model is selected, the @code{XXX} symbol is predefined by the C and C++ compilers and the @code{XXX} symbol is predefined by the assembler. This behavior is the same for the GNU and Texas Instruments toolsets. RTEMS uses these predefines to determine the proper handling of the @code{dp} register in those C3x/C4x specific routines that were written in assembly language.

@subsubsection Large Memory Model

The large memory model is more complex and less efficient than the small memory model. However, it removes the 64K uninitialized data restriction from applications. The @code{dp} register is reloaded automatically by the compiler each time data is accessed. This leads to an increase in the code space requirements for the application but gives it access to much more data space.

With the GNU Compiler Suite, large memory model is selected by invoking the compiler with either the @code{-mlarge} or @code{-mlargememoryXXX} argument. This argument must also be included when linking the application to ensure that support libraries compiled for the large memory model are used. The default memory model is XXX.
When this memory model is selected, the @code{XXX} symbol is predefined by the C and C++ compilers and the @code{XXX} symbol is predefined by the assembler. This behavior is the same for the GNU and Texas Instruments toolsets. RTEMS uses these predefines to determine the proper handling of the @code{dp} register in those C3x/C4x specific routines that were written in assembly language.

@c
@c  Interrupt Stack Frame Picture
@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@section Interrupt Processing

Each type of processor responds to the occurrence of an interrupt in its own unique fashion. In addition, each processor type provides a control mechanism to allow for the proper handling of an interrupt. The processor dependent response to the interrupt modifies the current execution state and results in a change in the execution stream. Most processors require that an interrupt handler utilize some special control mechanisms to return to the normal processing stream. Although RTEMS hides many of the processor dependent details of interrupt processing, it is important to understand how the RTEMS interrupt manager is mapped onto the processor's unique architecture. Discussed in this chapter are the XXX's interrupt response and control mechanisms as they pertain to RTEMS.

@subsection Vectoring of an Interrupt Handler

Depending on whether or not the particular CPU supports a separate interrupt stack, the XXX family has two different interrupt handling models.
@subsubsection Models Without Separate Interrupt Stacks

Upon receipt of an interrupt, the XXX family members without separate interrupt stacks automatically perform the following actions:

@itemize @bullet
@item To Be Written
@end itemize

@subsubsection Models With Separate Interrupt Stacks

Upon receipt of an interrupt, the XXX family members with separate interrupt stacks automatically perform the following actions:

@itemize @bullet
@item saves the current status register (SR),

@item clears the master/interrupt (M) bit of the SR to indicate the switch from master state to interrupt state,

@item sets the privilege mode to supervisor,

@item suppresses tracing,

@item sets the interrupt mask level equal to the level of the interrupt being serviced,

@item pushes an interrupt stack frame (ISF), which includes the program counter (PC), the status register (SR), and the format/exception vector offset (FVO) word, onto the supervisor and interrupt stacks,

@item switches the current stack to the interrupt stack and vectors to an interrupt service routine (ISR). If the ISR was installed with the interrupt_catch directive, then the RTEMS interrupt handler will begin execution. The RTEMS interrupt handler saves all registers which are not preserved according to the calling conventions and invokes the application's ISR.
@end itemize

A nested interrupt is processed similarly by these CPU models with the exception that only a single ISF is placed on the interrupt stack and the current stack need not be switched.

The FVO word in the Interrupt Stack Frame is examined by RTEMS to determine when an outermost interrupt is being exited. Since the FVO is used by RTEMS for this purpose, the user application code MUST NOT modify this field.
The following shows the Interrupt Stack Frame for XXX CPU models with separate interrupt stacks:

@ifset use-ascii
@example
@group
               +----------------------+
               | Status Register      | 0x0
               +----------------------+
               | Program Counter High | 0x2
               +----------------------+
               | Program Counter Low  | 0x4
               +----------------------+
               | Format/Vector Offset | 0x6
               +----------------------+
@end group
@end example
@end ifset

@ifset use-tex
@sp 1
@tex
\centerline{\vbox{\offinterlineskip\halign{
\strut\vrule#&
\hbox to 2.00in{\enskip\hfil#\hfil}&
\vrule#&
\hbox to 0.50in{\enskip\hfil#\hfil}
\cr
\multispan{3}\hrulefill\cr
& Status Register && 0x0\cr
\multispan{3}\hrulefill\cr
& Program Counter High && 0x2\cr
\multispan{3}\hrulefill\cr
& Program Counter Low && 0x4\cr
\multispan{3}\hrulefill\cr
& Format/Vector Offset && 0x6\cr
\multispan{3}\hrulefill\cr
}}\hfil}
@end tex
@end ifset

@ifset use-html
@html
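<table>
<tr><td>Status Register</td><td>0x0</td></tr>
<tr><td>Program Counter High</td><td>0x2</td></tr>
<tr><td>Program Counter Low</td><td>0x4</td></tr>
<tr><td>Format/Vector Offset</td><td>0x6</td></tr>
</table>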
@end html
@end ifset

@subsection Interrupt Levels

Eight levels (0-7) of interrupt priorities are supported by XXX family members with level seven (7) being the highest priority. Level zero (0) indicates that interrupts are fully enabled. Interrupt requests for interrupts with priorities less than or equal to the current interrupt mask level are ignored.

Although RTEMS supports 256 interrupt levels, the XXX family only supports eight. RTEMS interrupt levels 0 through 7 directly correspond to XXX interrupt levels. All other RTEMS interrupt levels are undefined and their behavior is unpredictable.

@subsection Disabling of Interrupts by RTEMS

During the execution of directive calls, critical sections of code may be executed. When these sections are encountered, RTEMS disables interrupts to level seven (7) before the execution of this section and restores them to the previous level upon completion of the section. RTEMS has been optimized to ensure that interrupts are disabled for less than RTEMS_MAXIMUM_DISABLE_PERIOD microseconds on a RTEMS_MAXIMUM_DISABLE_PERIOD_MHZ MHz XXX with zero wait states. These numbers will vary based on the number of wait states and processor speed present on the target board. [NOTE: The maximum period with interrupts disabled is hand calculated. This calculation was last performed for Release RTEMS_RELEASE_FOR_MAXIMUM_DISABLE_PERIOD.]

Non-maskable interrupts (NMI) cannot be disabled, and ISRs which execute at this level MUST NEVER issue RTEMS system calls. If a directive is invoked, unpredictable results may occur due to the inability of RTEMS to protect its critical sections. However, ISRs that make no system calls may safely execute as non-maskable interrupts.

@subsection Interrupt Stack

RTEMS allocates the interrupt stack from the Workspace Area. The amount of memory allocated for the interrupt stack is determined by the interrupt_stack_size field in the CPU Configuration Table.
During the initialization process, RTEMS will install its interrupt stack.

The XXX port of RTEMS supports a software managed dedicated interrupt stack on those CPU models which do not support a separate interrupt stack in hardware.

@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@section Default Fatal Error Processing

Upon detection of a fatal error by either the application or RTEMS, the fatal error manager is invoked. The fatal error manager will invoke the user-supplied fatal error handlers. If no user-supplied handlers are configured, the RTEMS provided default fatal error handler is invoked. If the user-supplied fatal error handlers return to the executive, the default fatal error handler is then invoked. This chapter describes the precise operations of the default fatal error handler.

@subsection Default Fatal Error Handler Operations

The default fatal error handler is invoked by the @code{rtems_fatal_error_occurred} directive when there is no user handler configured or the user handler returns control to RTEMS. The default fatal error handler disables processor interrupts, places the error code in @b{XXX}, and executes a @code{XXX} instruction to simulate a halt processor instruction.

@c
@c  COPYRIGHT (c) 1988-1999.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@section Board Support Packages

An RTEMS Board Support Package (BSP) must be designed to support a particular processor and target board combination. This chapter presents a discussion of XXX specific BSP issues. For more information on developing a BSP, refer to the chapter titled Board Support Packages in the RTEMS Applications User's Guide.

@subsection System Reset

An RTEMS based application is initiated or re-initiated when the XXX processor is reset.
When the XXX is reset, the processor performs the following actions:

@itemize @bullet
@item The tracing bits of the status register are cleared to disable tracing.

@item The supervisor interrupt state is entered by setting the supervisor (S) bit and clearing the master/interrupt (M) bit of the status register.

@item The interrupt mask of the status register is set to level 7 to effectively disable all maskable interrupts.

@item The vector base register (VBR) is set to zero.

@item The cache control register (CACR) is set to zero to disable and freeze the processor cache.

@item The interrupt stack pointer (ISP) is set to the value stored at vector 0 (bytes 0-3) of the exception vector table (EVT).

@item The program counter (PC) is set to the value stored at vector 1 (bytes 4-7) of the EVT.

@item The processor begins execution at the address stored in the PC.
@end itemize

@subsection Processor Initialization

The address of the application's initialization code should be stored in the first vector of the EVT which will allow the immediate vectoring to the application code. If the application requires that the VBR be some value besides zero, then it should be set to the required value at this point. All tasks share the same XXX's VBR value. Because interrupts are enabled automatically by RTEMS as part of the initialize executive directive, the VBR MUST be set before this directive is invoked to ensure correct interrupt vectoring. If processor caching is to be utilized, then it should be enabled during the reset application initialization code.

In addition to the requirements described in the Board Support Packages chapter of the Applications User's Manual for the reset code which is executed before the call to initialize executive, the XXX version has the following specific requirements:

@itemize @bullet
@item Must leave the S bit of the status register set so that the XXX remains in the supervisor state.

@item Must set the M bit of the status register to remove the XXX from the interrupt state.

@item Must set the master stack pointer (MSP) such that a minimum stack size of MINIMUM_STACK_SIZE bytes is provided for the initialize executive directive.

@item Must initialize the XXX's vector table.
@end itemize

Note that the BSP is not responsible for allocating or installing the interrupt stack. RTEMS does this automatically as part of initialization. If the BSP does not install an interrupt stack and, for whatever reason, an interrupt occurs before initialize_executive is invoked, then the results are unpredictable.