Symmetric Multiprocessing Services
##################################

Introduction
============

The Symmetric Multiprocessing (SMP) support of RTEMS 4.10.99.0 is
available on

- ARM,

- PowerPC, and

- SPARC.

It must be explicitly enabled via the ``--enable-smp`` configure command
line option.  To enable SMP in the application configuration see `Enable SMP Support for Applications`_.  The default
scheduler for SMP applications supports up to 32 processors and is a global
fixed priority scheduler, see also `Configuring Clustered Schedulers`_.  For
example applications see :file:`testsuites/smptests`.

*WARNING: The SMP support in RTEMS is work in progress.  Before you
start using this RTEMS version for SMP ask on the RTEMS mailing list.*

This chapter describes the services related to Symmetric Multiprocessing
provided by RTEMS.

The application level services currently provided are:

- ``rtems_get_processor_count`` - Get processor count

- ``rtems_get_current_processor`` - Get current processor index

- ``rtems_scheduler_ident`` - Get ID of a scheduler

- ``rtems_scheduler_get_processor_set`` - Get processor set of a scheduler

- ``rtems_task_get_scheduler`` - Get scheduler of a task

- ``rtems_task_set_scheduler`` - Set scheduler of a task

- ``rtems_task_get_affinity`` - Get task processor affinity

- ``rtems_task_set_affinity`` - Set task processor affinity

Background
==========

Uniprocessor versus SMP Parallelism
-----------------------------------

Uniprocessor systems have long been used in embedded systems. In this hardware
model, there are some system execution characteristics which have long been
taken for granted:

- one task executes at a time

- hardware events result in interrupts

There is no true parallelism. Even when interrupts appear to occur
at the same time, they are processed in a largely serial fashion.
This is true even when the interrupt service routines are allowed to
nest.  From a tasking viewpoint, it is the responsibility of the real-time
operating system to simulate parallelism by switching between tasks.
These task switches occur in response to hardware interrupt events and explicit
application events such as blocking for a resource or delaying.

With symmetric multiprocessing, the presence of multiple processors
allows for true concurrency and provides for cost-effective performance
improvements. Uniprocessors tend to increase performance by increasing
clock speed and complexity. This tends to lead to hot, power hungry
microprocessors which are poorly suited for many embedded applications.

The true concurrency is in sharp contrast to the single task and
interrupt model of uniprocessor systems. This results in a fundamental
change to uniprocessor system characteristics listed above. Developers
are faced with a different set of characteristics which, in turn, break
some existing assumptions and result in new challenges. In an SMP system
with N processors, these are the new execution characteristics.

- N tasks execute in parallel

- hardware events result in interrupts

There is true parallelism with a task executing on each processor and
the possibility of interrupts occurring on each processor. Thus in contrast
to there being one task and one interrupt to consider on a uniprocessor,
there are N tasks and potentially N simultaneous interrupts to consider
on an SMP system.

This increase in hardware complexity and presence of true parallelism
results in the application developer needing to be even more cautious
about mutual exclusion and shared data access than in a uniprocessor
embedded system. Race conditions that never or rarely happened when an
application executed on a uniprocessor system become much more likely
due to multiple threads executing in parallel. On a uniprocessor system,
these race conditions would only happen when a task switch occurred at
just the wrong moment. Now there are N-1 other tasks executing in parallel
all the time and this results in many more opportunities for small
windows in critical sections to be hit.
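
The kind of race condition described above can be made concrete with a small
model.  The following sketch is not RTEMS code.  It assumes a hypothetical
shared counter that each task increments through separate load, add, and store
steps, and it replays two fixed interleavings deterministically instead of
using real processors.  All names are made up for this illustration.

.. code:: c

    #include <assert.h>

    /* Hypothetical model of two tasks incrementing one shared counter. */
    typedef struct {
      int shared_counter; /* the unprotected shared data */
      int local[ 2 ];     /* per-task register copy of the counter */
    } lost_update_model;

    static void step_load( lost_update_model *m, int task )
    {
      assert( task == 0 || task == 1 );
      m->local[ task ] = m->shared_counter;
    }

    static void step_add( lost_update_model *m, int task )
    {
      m->local[ task ] += 1;
    }

    static void step_store( lost_update_model *m, int task )
    {
      m->shared_counter = m->local[ task ];
    }

    /* The tasks run one after the other: both increments are visible. */
    int run_serialized( void )
    {
      lost_update_model m = { 0, { 0, 0 } };
      int task;

      for ( task = 0; task < 2; ++task ) {
        step_load( &m, task );
        step_add( &m, task );
        step_store( &m, task );
      }

      return m.shared_counter;
    }

    /* The steps overlap as they may on two processors: one increment is lost. */
    int run_overlapped( void )
    {
      lost_update_model m = { 0, { 0, 0 } };

      step_load( &m, 0 );
      step_load( &m, 1 );
      step_add( &m, 0 );
      step_add( &m, 1 );
      step_store( &m, 0 );
      step_store( &m, 1 );

      return m.shared_counter;
    }

In this model ``run_serialized()`` yields two, while ``run_overlapped()``
yields one, since both tasks loaded the counter before either stored it.  On a
uniprocessor the second interleaving needs a task switch between the load and
the store; with N processors it needs no task switch at all.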

Task Affinity
-------------
.. index:: task affinity
.. index:: thread affinity

RTEMS provides services to manipulate the affinity of a task. Affinity
is used to specify the subset of processors in an SMP system on which
a particular task can execute.

By default, tasks have an affinity which allows them to execute on any
available processor.

Task affinity is a possible feature to be supported by SMP-aware
schedulers. However, only a subset of the available schedulers support
affinity. Although the behavior is scheduler specific, if the scheduler
does not support affinity, it is likely to ignore all attempts to set
affinity.

The scheduler with support for arbitrary processor affinities uses a proof of
concept implementation.  See https://devel.rtems.org/ticket/2510.

Task Migration
--------------
.. index:: task migration
.. index:: thread migration

With more than one processor in the system tasks can migrate from one processor
to another.  There are three reasons why tasks migrate in RTEMS.

- The scheduler changes explicitly via ``rtems_task_set_scheduler()`` or
  similar directives.

- The task resumes execution after a blocking operation.  On a priority
  based scheduler it will evict the lowest priority task currently assigned to a
  processor in the processor set managed by the scheduler instance.

- The task moves temporarily to another scheduler instance due to locking
  protocols like *Migratory Priority Inheritance* or the
  *Multiprocessor Resource Sharing Protocol*.

Task migration should be avoided so that the working set of a task can stay on
the most local cache level.

The current implementation of task migration in RTEMS has some implications
with respect to the interrupt latency.  It is crucial to preserve the system
invariant that a task can execute on at most one processor in the system at a
time.  This is accomplished with a boolean indicator in the task context.  The
processor architecture specific low-level task context switch code will mark
that a task context is no longer executing and then wait until the heir context
has stopped execution before it restores the heir context and resumes execution
of the heir task.  So there is one point in time in which a processor is without
a task.  This is essential to avoid cyclic dependencies in case multiple tasks
migrate at once.  Otherwise some supervising entity is necessary to prevent
livelocks.  Such a global supervisor would lead to scalability problems so
this approach is not used.  Currently the thread dispatch is performed with
interrupts disabled.  So in case the heir task is currently executing on
another processor then this prolongs the time of disabled interrupts since one
processor has to wait for another processor to make progress.

It is difficult to avoid this issue with the interrupt latency since interrupts
normally store the context of the interrupted task on its stack.  In case a
task is marked as not executing we must not use its task stack to store such an
interrupt context.  We cannot use the heir stack before the heir has stopped
execution on another processor.  So if we enable interrupts during this
transition we have
to provide an alternative task independent stack for this time frame.  This
issue needs further investigation.

Clustered Scheduling
--------------------

We have clustered scheduling in case the set of processors of a system is
partitioned into non-empty pairwise-disjoint subsets. These subsets are called
clusters.  Clusters with a cardinality of one are partitions.  Each cluster is
owned by exactly one scheduler instance.

Clustered scheduling helps to control the worst-case latencies in
multi-processor systems, see *Brandenburg, Björn B.: Scheduling and
Locking in Multiprocessor Real-Time Operating Systems. PhD thesis, 2011.
http://www.cs.unc.edu/~bbb/diss/brandenburg-diss.pdf*.  The goal is to
reduce the amount of shared state in the system and thus prevent lock
contention. Modern multi-processor systems tend to have several layers of data
and instruction caches.  With clustered scheduling it is possible to honour the
cache topology of a system and thus avoid expensive cache synchronization
traffic.  Clustered scheduling itself is easy to implement.  The problem is to
provide synchronization primitives for inter-cluster synchronization (more than
one cluster is involved in the synchronization process). In RTEMS there are
currently four means available:

- events,

- message queues,

- semaphores using the `Priority Inheritance`_
  protocol (priority boosting), and

- semaphores using the `Multiprocessor Resource Sharing Protocol`_ (MrsP).

The clustered scheduling approach enables separation of functions with
real-time requirements and functions that profit from fairness and high
throughput provided the scheduler instances are fully decoupled and adequate
inter-cluster synchronization primitives are used.  This is work in progress.

For the configuration of clustered schedulers see `Configuring Clustered Schedulers`_.

To set the scheduler of a task see `SCHEDULER_IDENT - Get ID of a scheduler`_
and `TASK_SET_SCHEDULER - Set scheduler of a task`_.

Task Priority Queues
--------------------

Due to the support for clustered scheduling the task priority queues need
special attention.  It makes no sense to compare the priority values of two
different scheduler instances.  Thus, it is not possible to simply use one
plain priority queue for tasks of different scheduler instances.

One solution to this problem is to use two levels of queues.  The top level
queue provides FIFO ordering and contains priority queues.  Each priority queue
is associated with a scheduler instance and contains only tasks of this
scheduler instance.  Tasks are enqueued in the priority queue corresponding to
their scheduler instance.  In case this priority queue was empty, then it is
appended to the FIFO.  To dequeue a task the highest priority task of the first
priority queue in the FIFO is selected.  Then the first priority queue is
removed from the FIFO.  In case the previously first priority queue is not
empty, then it is appended to the FIFO.  So there is FIFO fairness with respect
to the highest priority task of each scheduler instance.  See also
*Brandenburg, Björn B.: A fully preemptive multiprocessor semaphore protocol for
latency-sensitive real-time applications. In Proceedings of the 25th Euromicro
Conference on Real-Time Systems (ECRTS 2013), pages 292–302, 2013.
http://www.mpi-sws.org/~bbb/papers/pdf/ecrts13b.pdf*.
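
The two level queue described above can be sketched in a few lines of C.  This
is a simplified model and not the RTEMS implementation: the fixed capacities,
the type names, and the ``tlq_`` function names are hypothetical, and the sketch
assumes that a numerically lower priority value means a higher priority.

.. code:: c

    #include <assert.h>
    #include <stddef.h>

    #define SCHEDULER_COUNT 3 /* hypothetical number of scheduler instances */
    #define QUEUE_CAPACITY 8  /* hypothetical capacity per priority queue */

    typedef struct {
      int id;        /* task identifier */
      int scheduler; /* index of the scheduler instance of the task */
      int priority;  /* assumption: lower value means higher priority */
    } task;

    typedef struct {
      task   tasks[ QUEUE_CAPACITY ];
      size_t count;
    } priority_queue;

    typedef struct {
      priority_queue per_scheduler[ SCHEDULER_COUNT ];
      int            fifo[ SCHEDULER_COUNT ]; /* scheduler indices, FIFO order */
      size_t         fifo_count;
    } two_level_queue;

    void tlq_init( two_level_queue *q )
    {
      size_t i;

      q->fifo_count = 0;
      for ( i = 0; i < SCHEDULER_COUNT; ++i ) {
        q->per_scheduler[ i ].count = 0;
      }
    }

    static void fifo_append( two_level_queue *q, int scheduler )
    {
      assert( q->fifo_count < SCHEDULER_COUNT );
      q->fifo[ q->fifo_count++ ] = scheduler;
    }

    void tlq_enqueue( two_level_queue *q, task t )
    {
      priority_queue *pq;

      assert( t.scheduler >= 0 && t.scheduler < SCHEDULER_COUNT );
      pq = &q->per_scheduler[ t.scheduler ];
      assert( pq->count < QUEUE_CAPACITY );

      /* A priority queue joins the FIFO when it becomes non-empty. */
      if ( pq->count == 0 ) {
        fifo_append( q, t.scheduler );
      }
      pq->tasks[ pq->count++ ] = t;
    }

    /* Returns 1 and stores the task in *out, or returns 0 if nothing waits. */
    int tlq_dequeue( two_level_queue *q, task *out )
    {
      priority_queue *pq;
      size_t          best = 0;
      size_t          i;
      int             scheduler;

      if ( q->fifo_count == 0 ) {
        return 0;
      }

      /* Remove the first priority queue from the FIFO. */
      scheduler = q->fifo[ 0 ];
      for ( i = 1; i < q->fifo_count; ++i ) {
        q->fifo[ i - 1 ] = q->fifo[ i ];
      }
      --q->fifo_count;

      /* Select and remove its highest priority task. */
      pq = &q->per_scheduler[ scheduler ];
      for ( i = 1; i < pq->count; ++i ) {
        if ( pq->tasks[ i ].priority < pq->tasks[ best ].priority ) {
          best = i;
        }
      }
      *out = pq->tasks[ best ];
      for ( i = best + 1; i < pq->count; ++i ) {
        pq->tasks[ i - 1 ] = pq->tasks[ i ];
      }
      --pq->count;

      /* A still non-empty priority queue goes to the end of the FIFO. */
      if ( pq->count > 0 ) {
        fifo_append( q, scheduler );
      }

      return 1;
    }

With two tasks of scheduler instance 0 (priorities 5 and 1) and one task of
scheduler instance 1 enqueued in this order, the dequeue order is: the priority
1 task of instance 0, then the task of instance 1, then the priority 5 task of
instance 0.  The linear searches keep the sketch short; a real implementation
would use data structures with better worst-case behaviour.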

Such a two level queue may need a considerable amount of memory if fast enqueue
and dequeue operations are desired (depends on the scheduler instance count).
To mitigate this problem an approach of the FreeBSD kernel was implemented in
RTEMS.  We have the invariant that a task can be enqueued on at most one task
queue.  Thus, we need only as many queues as we have tasks.  Each task is
equipped with a spare task queue which it can give to an object on demand.  The
task queue uses a dedicated memory space independent of the other memory used
for the task itself. In case a task needs to block, then there are two options

- the object already has a task queue, then the task enqueues itself to this
  already present queue and the spare task queue of the task is added to a list
  of free queues for this object, or

- otherwise, then the queue of the task is given to the object and the task
  enqueues itself to this queue.

In case the task is dequeued, then there are two options

- the task is the last task on the queue, then it removes this queue from
  the object and reclaims it for its own purpose, or

- otherwise, then the task removes one queue from the free list of the
  object and reclaims it for its own purpose.

Since there are usually more objects than tasks, this actually reduces the
memory demands. In addition the objects contain only a pointer to the task
queue structure. This helps to hide implementation details and makes it
possible to use self-contained synchronization objects in Newlib and GCC (C++
and OpenMP run-time support).
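
The hand-over of spare task queues described above can be modelled as follows.
This is a minimal sketch under simplifying assumptions and not the RTEMS data
structures: the queue only counts its waiters instead of storing them, and all
type and function names are hypothetical.

.. code:: c

    #include <assert.h>
    #include <stddef.h>

    typedef struct task_queue {
      struct task_queue *next_free; /* link in the free list of an object */
      size_t             waiters;   /* number of tasks enqueued (model only) */
    } task_queue;

    typedef struct {
      task_queue *spare; /* a task which is not blocked owns one spare queue */
    } task;

    typedef struct {
      task_queue *queue;     /* NULL while no task waits on the object */
      task_queue *free_list; /* spare queues given by additional waiters */
    } sync_object;

    void task_block_on( task *t, sync_object *obj )
    {
      assert( t->spare != NULL );

      if ( obj->queue != NULL ) {
        /* The object already has a queue: add the spare to its free list. */
        t->spare->next_free = obj->free_list;
        obj->free_list = t->spare;
      } else {
        /* First waiter: the spare queue becomes the queue of the object. */
        t->spare->waiters = 0;
        t->spare->next_free = NULL;
        obj->queue = t->spare;
      }

      t->spare = NULL;
      ++obj->queue->waiters;
    }

    void task_unblock_from( task *t, sync_object *obj )
    {
      assert( t->spare == NULL );
      assert( obj->queue != NULL && obj->queue->waiters > 0 );

      --obj->queue->waiters;

      if ( obj->queue->waiters == 0 ) {
        /* Last waiter: remove the queue from the object and reclaim it. */
        t->spare = obj->queue;
        obj->queue = NULL;
      } else {
        /* Other waiters remain: reclaim one queue from the free list. */
        assert( obj->free_list != NULL );
        t->spare = obj->free_list;
        obj->free_list = t->spare->next_free;
      }

      t->spare->next_free = NULL;
    }

A task does not necessarily get back the queue it gave away.  If task A blocks
first and task B second, and A is dequeued first, then A reclaims the spare
queue of B from the free list and B later reclaims the queue originally owned
by A.  The total number of queues stays equal to the number of tasks.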

Scheduler Helping Protocol
--------------------------

The scheduler provides a helping protocol to support locking protocols like
*Migratory Priority Inheritance* or the *Multiprocessor Resource
Sharing Protocol*.  Each ready task can use at least one scheduler node at a
time to gain access to a processor.  Each scheduler node has an owner, a user
and an optional idle task.  The owner of a scheduler node is determined at task
creation and never changes during the life time of a scheduler node.  The user
of a scheduler node may change due to the scheduler helping protocol.  A
scheduler node is in one of the four scheduler help states:

:dfn:`help yourself`
    This scheduler node is solely used by the owner task.  This task owns no
    resources using a helping protocol and thus does not take part in the scheduler
    helping protocol.  No help will be provided for other tasks.

:dfn:`help active owner`
    This scheduler node is owned by a task actively owning a resource and can be
    used to help out tasks.
    In case this scheduler node changes its state from ready to scheduled and the
    task executes using another node, then an idle task will be provided as a user
    of this node to temporarily execute on behalf of the owner task.  Thus lower
    priority tasks are denied access to the processors of this scheduler instance.
    In case a task actively owning a resource performs a blocking operation, then
    an idle task will be used also in case this node is in the scheduled state.

:dfn:`help active rival`
    This scheduler node is owned by a task actively obtaining a resource currently
    owned by another task and can be used to help out tasks.
    The task owning this node is ready and will give away its processor in case the
    task owning the resource asks for help.

:dfn:`help passive`
    This scheduler node is owned by a task obtaining a resource currently owned by
    another task and can be used to help out tasks.
    The task owning this node is blocked.

The following scheduler operations return a task in need for help

- unblock,

- change priority,

- yield, and

- ask for help.

A task in need for help is a task that encounters a scheduler state change from
scheduled to ready (this is a pre-emption by a higher priority task) or a task
that cannot be scheduled in an unblock operation.  Such a task can ask tasks
which depend on resources owned by this task for help.

In case it is not possible to schedule a task in need for help, then the
scheduler nodes available for the task will be placed into the set of ready
scheduler nodes of the corresponding scheduler instances.  Once a state change
from ready to scheduled happens for one of these scheduler nodes, it will be used to
schedule the task in need for help.

The ask for help scheduler operation is used to help tasks in need for help
returned by the operations mentioned above.  This operation is also used in
case the root of a resource sub-tree owned by a task changes.

The run-time of the ask for help procedures depends on the size of the resource
tree of the task needing help and other resource trees in case tasks in need
for help are produced during this operation.  Thus the worst-case latency in
the system depends on the maximum resource tree size of the application.

Critical Section Techniques and SMP
-----------------------------------

As discussed earlier, SMP systems have opportunities for true parallelism
which was not possible on uniprocessor systems. Consequently, multiple
techniques that provided adequate critical sections on uniprocessor
systems are unsafe on SMP systems. In this section, some of these
unsafe techniques will be discussed.

In general, applications must use proper operating system provided mutual
exclusion mechanisms to ensure correct behavior. This primarily means
the use of binary semaphores or mutexes to implement critical sections.

Disable Interrupts and Interrupt Locks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A low overhead means to ensure mutual exclusion in uni-processor configurations
is to disable interrupts around a critical section.  This is commonly used in
device driver code and throughout the operating system core.  On SMP
configurations, however, disabling the interrupts on one processor has no
effect on other processors.  So, this is insufficient to ensure system wide
mutual exclusion.  The macros

- ``rtems_interrupt_disable()``,

- ``rtems_interrupt_enable()``, and

- ``rtems_interrupt_flush()``

are disabled on SMP configurations and their use will lead to compiler warnings
and linker errors.  In the unlikely case that interrupts must be disabled on
the current processor, then the

- ``rtems_interrupt_local_disable()``, and

- ``rtems_interrupt_local_enable()``

macros are now available in all configurations.

Since disabling of interrupts is not enough to ensure system wide mutual
exclusion on SMP, a new low-level synchronization primitive was added - the
interrupt locks.  They are a simple API layer on top of the SMP locks used for
low-level synchronization in the operating system core.  Currently they are
implemented as a ticket lock.  On uni-processor configurations they degenerate
to simple interrupt disable/enable sequences.  It is disallowed to acquire a
single interrupt lock in a nested way.  This will result in an infinite loop
with interrupts disabled.  While converting legacy code to interrupt locks care
must be taken to avoid this situation.

.. code:: c

    #include <rtems.h>

    /* Uni-processor only: these macros are disabled on SMP configurations. */
    void legacy_code_with_interrupt_disable_enable( void )
    {
      rtems_interrupt_level level;

      rtems_interrupt_disable( level );
      /* Some critical stuff */
      rtems_interrupt_enable( level );
    }

    RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" )

    /* Works on uni-processor and SMP configurations. */
    void smp_ready_code_with_interrupt_lock( void )
    {
      rtems_interrupt_lock_context lock_context;

      rtems_interrupt_lock_acquire( &lock, &lock_context );
      /* Some critical stuff */
      rtems_interrupt_lock_release( &lock, &lock_context );
    }

The ``rtems_interrupt_lock`` structure is empty on uni-processor
configurations.  Empty structures have a different size in C
(implementation-defined, zero in case of GCC) and C++ (implementation-defined
non-zero value, one in case of GCC).  Thus the
``RTEMS_INTERRUPT_LOCK_DECLARE()``, ``RTEMS_INTERRUPT_LOCK_DEFINE()``,
``RTEMS_INTERRUPT_LOCK_MEMBER()``, and ``RTEMS_INTERRUPT_LOCK_REFERENCE()``
macros are provided to ensure ABI compatibility.
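
To illustrate the ticket lock idea mentioned above, here is a minimal sketch
based on C11 atomics.  It is not the RTEMS SMP lock implementation: the type
and function names are hypothetical, and the sketch leaves out the disabling of
interrupts which an interrupt lock performs in addition.

.. code:: c

    #include <assert.h>
    #include <stdatomic.h>

    typedef struct {
      atomic_uint next_ticket; /* ticket handed to the next acquirer */
      atomic_uint now_serving; /* ticket which currently owns the lock */
    } ticket_lock;

    void ticket_lock_init( ticket_lock *lock )
    {
      atomic_init( &lock->next_ticket, 0 );
      atomic_init( &lock->now_serving, 0 );
    }

    void ticket_lock_acquire( ticket_lock *lock )
    {
      /* Draw a ticket; acquirers are served in the order of their tickets. */
      unsigned int my_ticket = atomic_fetch_add_explicit(
        &lock->next_ticket,
        1,
        memory_order_relaxed
      );

      /* Busy wait until our ticket is served. */
      while (
        atomic_load_explicit( &lock->now_serving, memory_order_acquire )
          != my_ticket
      ) {
        /* Spin */
      }
    }

    void ticket_lock_release( ticket_lock *lock )
    {
      unsigned int current = atomic_load_explicit(
        &lock->now_serving,
        memory_order_relaxed
      );

      /* The lock must be held by someone, otherwise this is a usage error. */
      assert(
        current
          != atomic_load_explicit( &lock->next_ticket, memory_order_relaxed )
      );

      /* Hand over the lock to the owner of the next ticket. */
      atomic_store_explicit(
        &lock->now_serving,
        current + 1,
        memory_order_release
      );
    }

The sketch also shows why a nested acquire of the same lock never returns: the
second ``ticket_lock_acquire()`` draws a new ticket, but ``now_serving`` can
only advance through a release which the spinning caller never reaches.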

Highest Priority Task Assumption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On a uniprocessor system, it is safe to assume that when the highest
priority task in an application executes, it will execute without being
preempted until it voluntarily blocks. Interrupts may occur while it is
executing, but there will be no context switch to another task unless
the highest priority task voluntarily initiates it.

Given the assumption that no other tasks will have their execution
interleaved with the highest priority task, it is possible for this
task to be constructed such that it does not need to acquire a binary
semaphore or mutex for protected access to shared data.

In an SMP system, it cannot be assumed that only a single task is
executing. It should be assumed that every processor is executing another
application task. Further, those tasks will be ones which would not have
been executed in a uniprocessor configuration and should be assumed to
have data synchronization conflicts with what was formerly the highest
priority task which executed without conflict.

Disable Preemption
~~~~~~~~~~~~~~~~~~

On a uniprocessor system, disabling preemption in a task is very similar
to making the highest priority task assumption. While preemption is
disabled, no task context switches will occur unless the task initiates
them voluntarily. And, just as with the highest priority task assumption,
there are N-1 processors also running tasks. Thus the assumption that no
other tasks will run while the task has preemption disabled is violated.

Task Unique Data and SMP
------------------------

Per task variables are a service commonly provided by real-time operating
systems for application use. They work by allowing the application
to specify a location in memory (typically a ``void *``) which is
logically added to the context of a task. On each task switch, the
location in memory is stored and each task can have a unique value in
the same memory location. This memory location is directly accessed as a
variable in a program.

This works well in a uniprocessor environment because there is one task
executing and one memory location containing a task-specific value. But
it is fundamentally broken on an SMP system because there are always N
tasks executing. With only one location in memory, N-1 tasks will not
have the correct value.

This paradigm for providing task unique data values is fundamentally
broken on SMP systems.

Classic API Per Task Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Classic API provides three directives to support per task variables. These are:

- ``rtems.task_variable_add`` - Associate per task variable

- ``rtems.task_variable_get`` - Obtain value of a per task variable

- ``rtems.task_variable_delete`` - Remove per task variable

As task variables are unsafe for use on SMP systems, the use of these services
must be eliminated in all software that is to be used in an SMP environment.
The task variables API is disabled on SMP. Its use will lead to compile-time
and link-time errors. It is recommended that the application developer consider
the use of POSIX Keys or Thread Local Storage (TLS). POSIX Keys are available
in all RTEMS configurations.  For the availability of TLS on a particular
architecture please consult the *RTEMS CPU Architecture Supplement*.
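
As a small illustration of the TLS alternative, the following sketch declares
a variable with the C11 ``_Thread_local`` storage class, so that each thread
gets its own instance and no value has to be swapped on a task switch.  The
variable and function names are hypothetical, and the sketch assumes a
toolchain and architecture with TLS support.

.. code:: c

    #include <assert.h>
    #include <limits.h>

    /* Each thread has its own instance of this variable. */
    static _Thread_local int per_task_error_count;

    /* Counts an error for the calling thread only. */
    void record_error( void )
    {
      assert( per_task_error_count < INT_MAX );
      ++per_task_error_count;
    }

    /* Returns the error count of the calling thread. */
    int get_error_count( void )
    {
      return per_task_error_count;
    }

Unlike a per task variable, there is no single memory location shared by all
tasks, so N tasks executing in parallel each see their own value.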

The only remaining user of task variables in the RTEMS code base is the Ada
support.  So basically Ada is not available on RTEMS SMP.

OpenMP
------

OpenMP support for RTEMS is available via the GCC provided libgomp.  There is
libgomp support for RTEMS in the POSIX configuration of libgomp since GCC 4.9
(requires a Newlib snapshot after 2015-03-12). In GCC 6.1 or later (requires a
Newlib snapshot after 2015-07-30 for <sys/lock.h> provided self-contained
synchronization objects) there is a specialized libgomp configuration for RTEMS
which offers a significantly better performance compared to the POSIX
configuration of libgomp.  In addition application configurable thread pools
for each scheduler instance are available in GCC 6.1 or later.

The run-time configuration of libgomp is done via environment variables
documented in the `libgomp
manual <https://gcc.gnu.org/onlinedocs/libgomp/>`_.  The environment variables are evaluated in a constructor function
which executes in the context of the first initialization task before the
actual initialization task function is called (just like a global C++
constructor).  To set application specific values, a higher priority
constructor function must be used to set up the environment variables.

.. code:: c

    #include <stdlib.h>

    void __attribute__((constructor(1000))) config_libgomp( void )
    {
      setenv( "OMP_DISPLAY_ENV", "VERBOSE", 1 );
      setenv( "GOMP_SPINCOUNT", "30000", 1 );
      setenv( "GOMP_RTEMS_THREAD_POOLS", "1$2@SCHD", 1 );
    }

The environment variable ``GOMP_RTEMS_THREAD_POOLS`` is RTEMS-specific.  It
determines the thread pools for each scheduler instance.  The format for
``GOMP_RTEMS_THREAD_POOLS`` is a list of optional
``<thread-pool-count>[$<priority>]@<scheduler-name>`` configurations
separated by ``:`` where:

- ``<thread-pool-count>`` is the thread pool count for this scheduler
  instance.

- ``$<priority>`` is an optional priority for the worker threads of a
  thread pool according to ``pthread_setschedparam``.  In case a priority
  value is omitted, then a worker thread will inherit the priority of the OpenMP
  master thread that created it.  The priority of the worker thread is not
  changed by libgomp after creation, even if a new OpenMP master thread using the
  worker has a different priority.

- ``@<scheduler-name>`` is the scheduler instance name according to the
  RTEMS application configuration.

In case no thread pool configuration is specified for a scheduler instance,
then each OpenMP master thread of this scheduler instance will use its own
dynamically allocated thread pool.  To limit the worker thread count of the
thread pools, each OpenMP master thread must call ``omp_set_num_threads``.

Let's suppose we have three scheduler instances ``IO``, ``WRK0``, and
``WRK1`` with ``GOMP_RTEMS_THREAD_POOLS`` set to ``"1@WRK0:3$4@WRK1"``.  Then
there are no thread pool restrictions for
scheduler instance ``IO``.  In the scheduler instance ``WRK0`` there is
one thread pool available.  Since no priority is specified for this scheduler
instance, the worker thread inherits the priority of the OpenMP master thread
that created it.  In the scheduler instance ``WRK1`` there are three thread
pools available and their worker threads run at priority four.
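
To make the format concrete, the following sketch parses such a string into
its components.  It is an illustration only and not the parser used by
libgomp: the ``pool_config`` type, the ``parse_thread_pools()`` function, and
the name length limit are assumptions made for this example.

.. code:: c

    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    #define SCHEDULER_NAME_MAX 15 /* hypothetical limit for this sketch */

    typedef struct {
      unsigned long pool_count;
      int           has_priority; /* 0 if the $<priority> part is omitted */
      unsigned long priority;
      char          scheduler[ SCHEDULER_NAME_MAX + 1 ];
    } pool_config;

    static int is_digit( char c )
    {
      return c >= '0' && c <= '9';
    }

    /*
     * Parses at most max configurations from s into out.  Returns the number
     * of parsed configurations or -1 if the string is malformed.
     */
    int parse_thread_pools( const char *s, pool_config *out, int max )
    {
      int n = 0;

      assert( s != NULL && out != NULL );

      while ( *s != '\0' ) {
        pool_config *c;
        char        *end;
        size_t       len;

        if ( n == max || !is_digit( *s ) ) {
          return -1;
        }

        /* <thread-pool-count> */
        c = &out[ n ];
        c->pool_count = strtoul( s, &end, 10 );
        s = end;

        /* Optional $<priority> */
        c->has_priority = 0;
        c->priority = 0;
        if ( *s == '$' ) {
          ++s;
          if ( !is_digit( *s ) ) {
            return -1;
          }
          c->priority = strtoul( s, &end, 10 );
          c->has_priority = 1;
          s = end;
        }

        /* @<scheduler-name>, terminated by ':' or the end of the string */
        if ( *s != '@' ) {
          return -1;
        }
        ++s;
        len = strcspn( s, ":" );
        if ( len == 0 || len > SCHEDULER_NAME_MAX ) {
          return -1;
        }
        memcpy( c->scheduler, s, len );
        c->scheduler[ len ] = '\0';
        s += len;
        ++n;

        /* A separator must be followed by another configuration. */
        if ( *s == ':' ) {
          ++s;
          if ( *s == '\0' ) {
            return -1;
          }
        }
      }

      return n;
    }

For the string ``"1@WRK0:3$4@WRK1"`` this sketch produces two configurations:
one thread pool without a priority for ``WRK0`` and three thread pools with
priority four for ``WRK1``.  No configuration is produced for ``IO``.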

Thread Dispatch Details
-----------------------

This section gives background information to developers interested in the
interrupt latencies introduced by thread dispatching.  A thread dispatch
consists of all work which must be done to stop the currently executing thread
on a processor and hand over this processor to an heir thread.

On SMP systems, scheduling decisions on one processor must be propagated to
other processors through inter-processor interrupts.  So, a thread dispatch
which must be carried out on another processor does not happen instantaneously.  Thus
several thread dispatch requests might be in the air and it is possible that
some of them may be out of date before the corresponding processor has time to
deal with them.  The thread dispatch mechanism uses three per-processor
variables,

- the executing thread,

- the heir thread, and

- a boolean flag indicating if a thread dispatch is necessary or not.

Updates of the heir thread and the thread dispatch necessary indicator are
synchronized via explicit memory barriers without the use of locks.  A thread
can be an heir thread on at most one processor in the system.  The thread context
is protected by a TTAS lock embedded in the context to ensure that it is used
on at most one processor at a time.  The thread post-switch actions use a
per-processor lock.  This implementation turned out to be quite efficient and
no lock contention was observed in the test suite.

The current implementation of thread dispatching has some implications with
respect to the interrupt latency.  It is crucial to preserve the system
invariant that a thread can execute on at most one processor in the system at a
time.  This is accomplished with a boolean indicator in the thread context.
The processor architecture specific context switch code will mark that a thread
context is no longer executing and then wait until the heir context has stopped
execution before it restores the heir context and resumes execution of the heir
thread (the boolean indicator is basically a TTAS lock).  So, there is one
point in time in which a processor is without a thread.  This is essential to
avoid cyclic dependencies in case multiple threads migrate at once.  Otherwise
some supervising entity is necessary to prevent deadlocks.  Such a global
supervisor would lead to scalability problems so this approach is not used.
Currently the context switch is performed with interrupts disabled.  Thus in
case the heir thread is currently executing on another processor, the time of
disabled interrupts is prolonged since one processor has to wait for another
processor to make progress.
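
The hand-over described above can be sketched as a test and test-and-set
(TTAS) lock on the executing indicator.  This is a simplified illustration with
C11 atomics and not the architecture specific RTEMS code: ``context_sketch``,
``context_try_acquire`` and ``context_switch_sketch`` are hypothetical names,
and saving and restoring registers is omitted.

.. code:: c

    #include <assert.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    /* Hypothetical thread context which holds only the executing indicator */
    typedef struct context_sketch {
      atomic_bool is_executing;
    } context_sketch;

    /* Try once to claim the context; returns true on success */
    static bool context_try_acquire(context_sketch *ctx)
    {
      /* Test: a plain load avoids cache line traffic while the context is busy */
      if (atomic_load_explicit(&ctx->is_executing, memory_order_relaxed)) {
        return false;
      }

      /* Test-and-set: only one processor can change false to true */
      return !atomic_exchange_explicit(
        &ctx->is_executing,
        true,
        memory_order_acquire
      );
    }

    static void context_switch_sketch(
      context_sketch *executing,
      context_sketch *heir
    )
    {
      /* Give up the own context first; now this processor has no thread */
      atomic_store_explicit(
        &executing->is_executing,
        false,
        memory_order_release
      );

      /* Wait until the heir has stopped execution on any other processor */
      while (!context_try_acquire(heir)) {
        /* Busy wait */
      }

      /* The register set of the heir would be restored here */
    }

Releasing the own context before waiting for the heir is what avoids a cyclic
wait when two processors exchange their threads at the same time.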

It is difficult to avoid this issue with the interrupt latency since interrupts
normally store the context of the interrupted thread on its stack.  In case a
thread is marked as not executing, we must not use its thread stack to store
such an interrupt context.  We cannot use the heir stack before it has stopped
execution on another processor.  If we enable interrupts during this
transition, then we have to provide an alternative thread independent stack for
interrupts in this time frame.  This issue needs further investigation.

The problematic situation occurs in case we have a thread which executes with
thread dispatching disabled and should execute on another processor (e.g. it is
an heir thread on another processor).  In this case the interrupts on this
other processor are disabled until the thread enables thread dispatching and
starts the thread dispatch sequence.  The scheduler (an exception is the
scheduler with thread processor affinity support) tries to avoid such a
situation and checks if a newly scheduled thread already executes on a processor.
In case the assigned processor differs from the processor on which the thread
already executes and this processor is a member of the processor set managed by
this scheduler instance, it will reassign the processors to keep the already
executing thread in place.  Therefore normal scheduler requests will not lead
to such a situation.  Explicit thread migration requests, however, can lead to
this situation.  Explicit thread migrations may occur due to the scheduler
helping protocol or explicit scheduler instance changes.  The situation can
also be provoked by interrupts which suspend and resume threads multiple times
and produce stale asynchronous thread dispatch requests in the system.
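
The placement rule used by the scheduler to keep an already executing thread
in place can be sketched as follows.  This is a hypothetical model and not
scheduler code from RTEMS: processors are plain indices below 32, the set of
processors owned by the scheduler instance is a bit mask, and ``place_thread``
and ``NO_PROCESSOR`` are invented names.

.. code:: c

    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical marker for a thread that executes on no processor */
    #define NO_PROCESSOR UINT32_MAX

    /*
     * Returns the processor the thread should use.  A set bit i in owned
     * means that processor i is managed by this scheduler instance.
     */
    static uint32_t place_thread(
      uint32_t assigned,
      uint32_t executing_on,
      uint32_t owned
    )
    {
      if (
        executing_on != NO_PROCESSOR
          && executing_on != assigned
          && executing_on < 32
          && (owned & (UINT32_C(1) << executing_on)) != 0
      ) {
        /* Keep the thread where it already executes */
        return executing_on;
      }

      return assigned;
    }

A real scheduler would in addition hand the originally assigned processor to
the thread that was displaced by this choice.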

Operations
==========

Setting Affinity to a Single Processor
--------------------------------------

In some embedded applications targeting SMP systems, it may be beneficial to
lock individual tasks to specific processors.  In this way, one can designate a
processor for I/O tasks, another for computation, etc.  The following
illustrates the code sequence necessary to assign a task an affinity for the
processor with index ``processor_index``.

.. code:: c

    #include <rtems.h>
    #include <assert.h>

    void pin_to_processor(rtems_id task_id, int processor_index)
    {
      rtems_status_code sc;
      cpu_set_t         cpuset;

      /* Build an affinity set which contains exactly one processor */
      CPU_ZERO(&cpuset);
      CPU_SET(processor_index, &cpuset);

      sc = rtems_task_set_affinity(task_id, sizeof(cpuset), &cpuset);
      assert(sc == RTEMS_SUCCESSFUL);
    }

It is important to note that the ``cpuset`` is not validated until the
``rtems.task_set_affinity`` call is made.  At that point, it is validated
against the current system configuration.

Directives
==========

This section details the symmetric multiprocessing services.  A subsection
is dedicated to each of these services and describes the calling sequence,
related constants, usage, and status codes.

.. COMMENT: rtems_get_processor_count

GET_PROCESSOR_COUNT - Get processor count
-----------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

The count of processors in the system.

**DESCRIPTION:**

On uni-processor configurations a value of one will be returned.

On SMP configurations this returns the value of a global variable set during
system initialization to indicate the count of utilized processors.  The
processor count depends on the physically or virtually available processors and
application configuration.  The value will always be less than or equal to the
maximum count of application configured processors.

**NOTES:**

None.

.. COMMENT: rtems_get_current_processor

GET_CURRENT_PROCESSOR - Get current processor index
---------------------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

The index of the current processor.

**DESCRIPTION:**

On uni-processor configurations a value of zero will be returned.

On SMP configurations an architecture specific method is used to obtain the
index of the current processor in the system.  The set of processor indices is
the range of integers starting with zero up to the processor count minus one.

Outside of sections with disabled thread dispatching the current processor
index may change after every instruction since the thread may migrate from one
processor to another.  Sections with disabled interrupts are sections with
thread dispatching disabled.

**NOTES:**

None.

.. COMMENT: rtems_scheduler_ident


SCHEDULER_IDENT - Get ID of a scheduler
---------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

``RTEMS.SUCCESSFUL`` - successful operation
``RTEMS.INVALID_ADDRESS`` - ``id`` is NULL
``RTEMS.INVALID_NAME`` - invalid scheduler name
``RTEMS.UNSATISFIED`` - a scheduler with this name exists, but
the processor set of this scheduler is empty

**DESCRIPTION:**

Identifies a scheduler by its name.  The scheduler name is determined by the
scheduler configuration.  See `Configuring Clustered Schedulers`_.

**NOTES:**

None.

.. COMMENT: rtems_scheduler_get_processor_set

SCHEDULER_GET_PROCESSOR_SET - Get processor set of a scheduler
--------------------------------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

``RTEMS.SUCCESSFUL`` - successful operation
``RTEMS.INVALID_ADDRESS`` - ``cpuset`` is NULL
``RTEMS.INVALID_ID`` - invalid scheduler id
``RTEMS.INVALID_NUMBER`` - the affinity set buffer is too small for
the set of processors owned by the scheduler

**DESCRIPTION:**

Returns the processor set owned by the scheduler in ``cpuset``.  A set bit
in the processor set means that this processor is owned by the scheduler and a
cleared bit means the opposite.

**NOTES:**

None.
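
**EXAMPLE:**

The following sketch shows one way to interpret the bits of a returned
processor set.  It assumes that the set was filled before, for example by the
C binding ``rtems_scheduler_get_processor_set(scheduler_id, sizeof(set),
&set)``.  The helper ``list_owned_processors`` is a hypothetical name.  The
``_GNU_SOURCE`` define and the ``<sched.h>`` include are only there so that
the sketch also builds on a glibc host; on RTEMS ``cpu_set_t`` is available
through ``<rtems.h>``.

.. code:: c

    #define _GNU_SOURCE
    #include <assert.h>
    #include <sched.h>
    #include <stddef.h>

    /*
     * Stores the indices of all processors contained in set into indices
     * (at most max entries) and returns the number of stored indices.
     */
    static size_t list_owned_processors(
      const cpu_set_t *set,
      int             *indices,
      size_t           max
    )
    {
      size_t n = 0;

      for (int cpu = 0; cpu < CPU_SETSIZE && n < max; ++cpu) {
        /* A set bit means that this processor is owned by the scheduler */
        if (CPU_ISSET(cpu, set)) {
          indices[n] = cpu;
          ++n;
        }
      }

      return n;
    }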

.. COMMENT: rtems_task_get_scheduler

TASK_GET_SCHEDULER - Get scheduler of a task
--------------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

``RTEMS.SUCCESSFUL`` - successful operation
``RTEMS.INVALID_ADDRESS`` - ``scheduler_id`` is NULL
``RTEMS.INVALID_ID`` - invalid task id

**DESCRIPTION:**

Returns the scheduler identifier of a task identified by ``task_id`` in
``scheduler_id``.

**NOTES:**

None.

.. COMMENT: rtems_task_set_scheduler


TASK_SET_SCHEDULER - Set scheduler of a task
--------------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

``RTEMS.SUCCESSFUL`` - successful operation
``RTEMS.INVALID_ID`` - invalid task or scheduler id
``RTEMS.INCORRECT_STATE`` - the task is in the wrong state to
perform a scheduler change

**DESCRIPTION:**

Sets the scheduler of a task identified by ``task_id`` to the scheduler
identified by ``scheduler_id``.  The scheduler of a task is initialized to
the scheduler of the task that created it.

**NOTES:**

None.

**EXAMPLE:**

.. code:: c

    #include <rtems.h>
    #include <assert.h>

    void task(rtems_task_argument arg);

    void example(void)
    {
      rtems_status_code sc;
      rtems_id          task_id;
      rtems_id          scheduler_id;
      rtems_name        scheduler_name;

      /* Look up the scheduler instance by its configured name */
      scheduler_name = rtems_build_name('W', 'O', 'R', 'K');
      sc = rtems_scheduler_ident(scheduler_name, &scheduler_id);
      assert(sc == RTEMS_SUCCESSFUL);

      sc = rtems_task_create(
        rtems_build_name('T', 'A', 'S', 'K'),
        1,
        RTEMS_MINIMUM_STACK_SIZE,
        RTEMS_DEFAULT_MODES,
        RTEMS_DEFAULT_ATTRIBUTES,
        &task_id
      );
      assert(sc == RTEMS_SUCCESSFUL);

      /* Change the scheduler before the task is started */
      sc = rtems_task_set_scheduler(task_id, scheduler_id);
      assert(sc == RTEMS_SUCCESSFUL);

      sc = rtems_task_start(task_id, task, 0);
      assert(sc == RTEMS_SUCCESSFUL);
    }

.. COMMENT: rtems_task_get_affinity

TASK_GET_AFFINITY - Get task processor affinity
-----------------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

``RTEMS.SUCCESSFUL`` - successful operation
``RTEMS.INVALID_ADDRESS`` - ``cpuset`` is NULL
``RTEMS.INVALID_ID`` - invalid task id
``RTEMS.INVALID_NUMBER`` - the affinity set buffer is too small for
the current processor affinity set of the task

**DESCRIPTION:**

Returns the current processor affinity set of the task in ``cpuset``.  A set
bit in the affinity set means that the task can execute on this processor and a
cleared bit means the opposite.

**NOTES:**

None.

.. COMMENT: rtems_task_set_affinity

TASK_SET_AFFINITY - Set task processor affinity
-----------------------------------------------

**CALLING SEQUENCE:**

**DIRECTIVE STATUS CODES:**

``RTEMS.SUCCESSFUL`` - successful operation
``RTEMS.INVALID_ADDRESS`` - ``cpuset`` is NULL
``RTEMS.INVALID_ID`` - invalid task id
``RTEMS.INVALID_NUMBER`` - invalid processor affinity set

**DESCRIPTION:**

Sets the processor affinity set of the task to the set specified by
``cpuset``.  A set bit in the affinity set means that the task can execute on
this processor and a cleared bit means the opposite.

**NOTES:**

This function will not change the scheduler of the task.  The intersection of
the processor affinity set and the set of processors owned by the scheduler of
the task must be non-empty.  It is not an error if the processor affinity set
contains processors that are not part of the set of processors owned by the
scheduler instance of the task.  A task will simply not run under normal
circumstances on these processors since the scheduler ignores them.  Some
locking protocols may temporarily use processors that are not included in the
processor affinity set of the task.  It is also not an error if the processor
affinity set contains processors that are not part of the system.
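
**EXAMPLE:**

The requirement that the affinity set and the processor set of the scheduler
must intersect can be checked by the application before the directive is
called.  The following is a sketch of such a check; ``affinity_is_usable`` is a
hypothetical helper and not part of the RTEMS API.  The ``_GNU_SOURCE`` define
and the ``<sched.h>`` include are only there so that the sketch also builds on
a glibc host; on RTEMS ``cpu_set_t`` is available through ``<rtems.h>``.

.. code:: c

    #define _GNU_SOURCE
    #include <assert.h>
    #include <sched.h>
    #include <stdbool.h>

    /*
     * Returns true if at least one processor is contained in both the
     * affinity set and the set of processors owned by the scheduler.
     */
    static bool affinity_is_usable(
      const cpu_set_t *affinity,
      const cpu_set_t *owned
    )
    {
      for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, affinity) && CPU_ISSET(cpu, owned)) {
          return true;
        }
      }

      return false;
    }

Processors in the affinity set which are not owned by the scheduler do not
make the check fail, in line with the notes above.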

.. COMMENT: COPYRIGHT (c) 2011,2015

.. COMMENT: Aeroflex Gaisler AB

.. COMMENT: All rights reserved.