   API summary
       The Linux scheduling APIs are as follows:

       sched_setscheduler(2)
              Set the scheduling policy and parameters of a specified thread.

       sched_getscheduler(2)
              Return the scheduling policy of a specified thread.

       sched_setparam(2)
              Set the scheduling parameters of a specified thread.

       sched_getparam(2)
              Fetch the scheduling parameters of a specified thread.

       sched_get_priority_max(2)
              Return the maximum priority available in a specified scheduling
              policy.

       sched_get_priority_min(2)
              Return the minimum priority available in a specified scheduling
              policy.

       sched_rr_get_interval(2)
              Fetch the quantum used for threads that are scheduled under the
              "round-robin" scheduling policy.

       sched_yield(2)
              Cause the caller to relinquish the CPU, so that some other
              thread can be executed.

       sched_setaffinity(2)
              (Linux-specific) Set the CPU affinity of a specified thread.

       sched_getaffinity(2)
              (Linux-specific) Get the CPU affinity of a specified thread.

       sched_setattr(2)
              Set the scheduling policy and parameters of a specified thread.
              This (Linux-specific) system call provides a superset of the
              functionality of sched_setscheduler(2) and sched_setparam(2).

       sched_getattr(2)
              Fetch the scheduling policy and parameters of a specified
              thread.  This (Linux-specific) system call provides a superset
              of the functionality of sched_getscheduler(2) and
              sched_getparam(2).
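
       As a rough illustration of the query calls listed above, the following
       sketch collects the calling thread's policy, static priority,
       round-robin interval, and CPU affinity count.  The struct
       sched_snapshot type and the sched_snapshot_self() helper are
       hypothetical names invented for this example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <time.h>

/* Hypothetical helper type: a snapshot of one thread's scheduling state. */
struct sched_snapshot {
    int policy;              /* value returned by sched_getscheduler() */
    int priority;            /* static priority (sched_priority) */
    struct timespec quantum; /* interval from sched_rr_get_interval() */
    int ncpus;               /* CPUs present in the affinity mask */
};

/* Fill *s for the calling thread (pid 0 means "the caller").
 * Returns 0 on success, or -1 with errno set by the failing call. */
int sched_snapshot_self(struct sched_snapshot *s)
{
    struct sched_param param;
    cpu_set_t mask;

    s->policy = sched_getscheduler(0);
    if (s->policy == -1)
        return -1;
    if (sched_getparam(0, &param) == -1)
        return -1;
    s->priority = param.sched_priority;
    if (sched_rr_get_interval(0, &s->quantum) == -1)
        return -1;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
        return -1;
    s->ncpus = CPU_COUNT(&mask);
    return 0;
}
```

       A fixed-size cpu_set_t is assumed to be large enough for the machine;
       very large systems may need a dynamically sized mask instead.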

   Scheduling policies
       The scheduler is the  kernel  component  that  decides  which  runnable
       thread will be executed by the CPU next.  Each thread has an associated
       scheduling policy and a  static  scheduling  priority,  sched_priority.
       The  scheduler makes its decisions based on knowledge of the scheduling
       policy and static priority of all threads on the system.

       Portable programs should use sched_get_priority_min(2) and
       sched_get_priority_max(2) to find the range of priorities supported
       for a particular policy.
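
       A minimal sketch of querying the priority range for a policy instead
       of hard-coding it.  The helper name priority_range() is an assumption
       made for this example.

```c
#include <sched.h>

/* Store the lowest and highest static priority for `policy` in *lo and *hi.
 * Returns 0 on success, or -1 if the policy is not recognized. */
int priority_range(int policy, int *lo, int *hi)
{
    int min = sched_get_priority_min(policy);
    int max = sched_get_priority_max(policy);

    if (min == -1 || max == -1)
        return -1;
    *lo = min;
    *hi = max;
    return 0;
}
```

       A caller would typically pick a value between *lo and *hi for a
       real-time policy rather than assuming particular numeric limits.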

       Conceptually, the scheduler maintains a list of  runnable  threads  for
       each possible sched_priority value.  In order to determine which thread
       runs next, the scheduler looks for the nonempty list with  the  highest
       static priority and selects the thread at the head of this list.

       A  thread's scheduling policy determines where it will be inserted into
       the list of threads with equal static priority and  how  it  will  move
       inside this list.

       All scheduling is preemptive: if a thread with a higher static priority
       becomes ready to run, the currently running thread  will  be  preempted
       and  returned  to  the  wait  list  for its static priority level.  The
       scheduling policy determines the  ordering  only  within  the  list  of
       runnable threads with equal static priority.

   SCHED_FIFO: First in-first out scheduling
       SCHED_FIFO can be used only with static priorities higher than 0, which
       means that when a SCHED_FIFO thread becomes runnable, it  will  always
       immediately  preempt any currently running SCHED_OTHER, SCHED_BATCH, or
       SCHED_IDLE thread.  SCHED_FIFO is a simple scheduling algorithm without
       time  slicing.   For threads scheduled under the SCHED_FIFO policy, the
       following rules apply:

       *  A SCHED_FIFO thread that has been preempted  by  another  thread  of
          higher  priority  will stay at the head of the list for its priority
          and will resume execution as soon as all threads of higher  priority
          are blocked again.

       *  When  a  SCHED_FIFO  thread becomes runnable, it will be inserted at
          the end of the list for its priority.

       *  A   call    to    sched_setscheduler(2),    sched_setparam(2),    or
          sched_setattr(2)  will put the SCHED_FIFO (or SCHED_RR) thread iden-
          tified by pid at the start of the list if it  was  runnable.   As  a
          consequence,  it  may preempt the currently running thread if it has
          the same priority.  (POSIX.1 specifies that the thread should go  to
          the end of the list.)

       *  A thread calling sched_yield(2) will be put at the end of the list.

       No  other events will move a thread scheduled under the SCHED_FIFO pol-
       icy in the wait list of runnable threads with equal static priority.
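
       The list handling described above can be pictured with a small toy
       model.  This is not kernel code: the struct run_lists type, the rl_*
       functions, and the fixed capacities are all assumptions made purely
       for illustration.  It shows selection from the highest nonempty list,
       insertion at the end on wakeup, and the effect of yielding.

```c
#include <stddef.h>

#define MAX_PRIO 99   /* highest static priority in this toy model */
#define LIST_CAP 16   /* capacity of each per-priority list */

/* One ordered list of (nonnegative) thread IDs per static priority. */
struct run_lists {
    int ids[MAX_PRIO + 1][LIST_CAP];
    size_t len[MAX_PRIO + 1];
};

/* A thread becomes runnable: append it to the end of its priority list. */
int rl_wake(struct run_lists *rl, int prio, int id)
{
    if (prio < 0 || prio > MAX_PRIO || rl->len[prio] == LIST_CAP)
        return -1;
    rl->ids[prio][rl->len[prio]++] = id;
    return 0;
}

/* Return the head of the nonempty list with the highest priority,
 * or -1 if nothing is runnable.  The lists are not modified. */
int rl_pick_next(const struct run_lists *rl)
{
    for (int prio = MAX_PRIO; prio >= 0; prio--)
        if (rl->len[prio] > 0)
            return rl->ids[prio][0];
    return -1;
}

/* The head thread blocks: remove it from its list and return its ID. */
int rl_block(struct run_lists *rl, int prio)
{
    if (prio < 0 || prio > MAX_PRIO || rl->len[prio] == 0)
        return -1;
    int head = rl->ids[prio][0];
    for (size_t i = 1; i < rl->len[prio]; i++)
        rl->ids[prio][i - 1] = rl->ids[prio][i];
    rl->len[prio]--;
    return head;
}

/* The head thread yields: move it from the head to the end of its list. */
int rl_yield(struct run_lists *rl, int prio)
{
    int head = rl_block(rl, prio);
    if (head == -1)
        return -1;
    return rl_wake(rl, prio, head);
}
```

       Note that a preempted thread is never removed from its list in this
       model, so it is still at the head when the higher-priority thread
       blocks.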

       A SCHED_FIFO thread runs until either it is blocked by an I/O  request,
       it is preempted by a higher priority thread, or it calls
       sched_yield(2).
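
       A hedged sketch of requesting SCHED_FIFO for the calling thread with
       sched_setscheduler(2).  The helper name become_fifo_lowest() is
       hypothetical; the call is expected to fail with EPERM for callers
       without sufficient privilege.

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Try to switch the calling thread to SCHED_FIFO at the lowest real-time
 * priority.  Returns 0 on success, or -1 with errno describing the failure. */
int become_fifo_lowest(void)
{
    struct sched_param param;

    memset(&param, 0, sizeof(param));
    param.sched_priority = sched_get_priority_min(SCHED_FIFO);
    if (param.sched_priority == -1)
        return -1;

    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
        int saved = errno;   /* keep errno intact across fprintf() */
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

       Starting at the lowest real-time priority is a cautious choice for
       experiments, since a SCHED_FIFO thread that never blocks can starve
       every thread below it.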

   SCHED_RR: Round-robin scheduling
       SCHED_RR is a simple enhancement of SCHED_FIFO.   Everything  described
       above  for SCHED_FIFO also applies to SCHED_RR, except that each thread
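       is allowed to run only for a maximum time quantum.  If a SCHED_RR
       thread has been running for a time period equal to or longer than the
       time quantum, it will be put at the end of the list for its priority.
       The length of the time quantum can be retrieved using
       sched_rr_get_interval(2).

   SCHED_DEADLINE: Sporadic task model deadline scheduling
       Since version 3.14, Linux provides a deadline scheduling policy
       (SCHED_DEADLINE).  To set and fetch this policy and its associated
       attributes, one must use the Linux-specific sched_setattr(2) and
       sched_getattr(2) system calls.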

       A  sporadic  task is one that has a sequence of jobs, where each job is
       activated at most once per period.  Each job also has a relative  dead-
       line,  before which it should finish execution, and a computation time,
       which is the CPU time necessary for executing the job.  The moment when
       a  task  wakes  up  because  a new job has to be executed is called the
       arrival time (also referred to as the request time  or  release  time).
       The  start  time is the time at which a task starts its execution.  The
       absolute deadline is thus obtained by adding the relative  deadline  to
       the arrival time.

       The following diagram clarifies these terms:

           arrival/wakeup                    absolute deadline
                |    start time                    |
                |        |                         |
                v        v                         v
                         |<- comp. time ->|
                |<------- relative deadline ------>|
                |<-------------- period ------------------->|

       When   setting   a   SCHED_DEADLINE   policy   for   a   thread   using
       sched_setattr(2), one can specify three parameters: Runtime,  Deadline,
       and  Period.   These  parameters  do  not necessarily correspond to the
       aforementioned terms: usual practice is to  set  Runtime  to  something
       bigger  than the average computation time (or worst-case execution time
       for hard real-time tasks),  Deadline  to  the  relative  deadline,  and
       Period to the period of the task.  Thus, for SCHED_DEADLINE scheduling,
       we have:

           arrival/wakeup                    absolute deadline
                |    start time                    |
                |        |                         |
                v        v                         v
                         |<-- Runtime ------->|
                |<----------- Deadline ----------->|
                |<-------------- Period ------------------->|

       The three deadline-scheduling parameters correspond to the
       sched_runtime, sched_deadline, and sched_period fields of the
       sched_attr structure; see sched_setattr(2).  These fields express
       values in nanoseconds.  If sched_period is specified as 0, then it is
       made the same as sched_deadline.

       The kernel requires that:

           sched_runtime <= sched_deadline <= sched_period

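       In addition, under the current implementation, all of the parameter
       values must be at least 1024 (i.e., just over one microsecond, which is
       the resolution of the implementation), and less than 2^63.  If any of
       these checks fails, sched_setattr(2) fails with the error EINVAL.

       To ensure deadline scheduling guarantees, the kernel performs an
       admission test when setting or changing SCHED_DEADLINE policy and
       attributes.  If the change is not feasible, sched_setattr(2) fails
       with the error EBUSY.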

       For  example,  it  is required (but not necessarily sufficient) for the
       total utilization to be less than or equal to the total number of  CPUs
       available,  where,  since each thread can maximally run for Runtime per
       Period, that thread's utilization is its Runtime divided by its Period.
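
       The parameter constraints and the utilization condition can be checked
       in user space before calling sched_setattr(2).  The following is only
       a sketch: struct dl_params and both function names are hypothetical,
       and the real admission test in the kernel may apply further rules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DL_MIN_NS 1024ULL   /* minimum parameter value stated in the text */

/* Hypothetical container for the three parameters, all in nanoseconds. */
struct dl_params {
    uint64_t runtime_ns;
    uint64_t deadline_ns;
    uint64_t period_ns;     /* 0 means "same as deadline_ns" */
};

/* Check Runtime <= Deadline <= Period and the 1024 ns minimum. */
bool dl_params_valid(struct dl_params p)
{
    if (p.period_ns == 0)
        p.period_ns = p.deadline_ns;
    if (p.runtime_ns < DL_MIN_NS || p.deadline_ns < DL_MIN_NS ||
        p.period_ns < DL_MIN_NS)
        return false;
    return p.runtime_ns <= p.deadline_ns && p.deadline_ns <= p.period_ns;
}

/* Necessary (but not sufficient) condition: the sum of Runtime / Period
 * over all tasks must not exceed the number of available CPUs. */
bool dl_utilization_fits(const struct dl_params *tasks, size_t n,
                         unsigned ncpus)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++) {
        uint64_t period = tasks[i].period_ns ? tasks[i].period_ns
                                             : tasks[i].deadline_ns;
        if (period == 0)
            return false;
        total += (double)tasks[i].runtime_ns / (double)period;
    }
    return total <= (double)ncpus;
}
```

       For instance, two tasks that each need 60 ms of Runtime per 100 ms
       Period have a total utilization of 1.2, which cannot fit on one CPU
       but passes this check on two.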

       In order to fulfil the guarantees that are made when a thread is admit-
       ted  to the SCHED_DEADLINE policy, SCHED_DEADLINE threads are the high-
       est  priority  (user  controllable)  threads  in  the  system;  if  any
       SCHED_DEADLINE thread is runnable, it will preempt any thread scheduled
       under one of the other policies.

       A call to fork(2) by a thread scheduled under the SCHED_DEADLINE policy
       will  fail  with  the error EAGAIN, unless the thread has its reset-on-
       fork flag set (see below).

       A SCHED_DEADLINE thread that calls sched_yield(2) will yield  the  cur-
       rent job and wait for a new period to begin.
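
       Putting the pieces of this section together, a deadline task might be
       set up roughly as follows.  This sketch invokes the system call
       through syscall(2) and uses a locally declared structure,
       struct dl_sched_attr, which is an assumption made here to avoid
       depending on whether the C library declares struct sched_attr.  The
       helper names become_deadline() and run_jobs() are also hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6    /* policy number used by the Linux kernel */
#endif

/* Local mirror of the first version of the kernel's sched_attr layout. */
struct dl_sched_attr {
    uint32_t size;            /* size of this structure in bytes */
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;      /* used by SCHED_OTHER and SCHED_BATCH */
    uint32_t sched_priority;  /* used by SCHED_FIFO and SCHED_RR */
    uint64_t sched_runtime;   /* the remaining fields are in nanoseconds */
    uint64_t sched_deadline;
    uint64_t sched_period;
};

/* Ask the kernel to schedule the calling thread under SCHED_DEADLINE.
 * Returns 0 on success, or -1 with errno set (EPERM without privilege). */
int become_deadline(uint64_t runtime_ns, uint64_t deadline_ns,
                    uint64_t period_ns)
{
    struct dl_sched_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.sched_policy = SCHED_DEADLINE;
    attr.sched_runtime = runtime_ns;
    attr.sched_deadline = deadline_ns;
    attr.sched_period = period_ns;
    return (int)syscall(SYS_sched_setattr, 0, &attr, 0U);
}

/* Run `count` jobs; after each one, sched_yield() gives up the rest of
 * the current job so the thread waits for its next period. */
void run_jobs(void (*job)(void), int count)
{
    for (int i = 0; i < count; i++) {
        job();
        sched_yield();
    }
}
```

       A caller might request, say, 10 ms of Runtime within a 30 ms Deadline
       every 100 ms with become_deadline(10000000, 30000000, 100000000) and
       then enter run_jobs().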

   SCHED_OTHER: Default Linux time-sharing scheduling
       SCHED_OTHER can be used only at static priority 0.  SCHED_OTHER is the
       standard Linux time-sharing scheduler that is intended for all  threads
       that  do  not  require the special real-time mechanisms.  The thread to
       run is chosen from the static priority 0 list based on a dynamic prior-
       ity  that is determined only inside this list.  The dynamic priority is
       based  on  the  nice  value  (set  by   nice(2),   setpriority(2),   or
       sched_setattr(2))  and  increased  for  each time quantum the thread is
       ready to run, but denied to run by the scheduler.   This  ensures  fair
       progress among all SCHED_OTHER threads.
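
       A short sketch of adjusting the nice value with setpriority(2) and
       reading it back with getpriority(2).  The helper name
       set_and_get_nice() and its -100 error value are assumptions made for
       this example.

```c
#include <errno.h>
#include <sys/resource.h>

/* Set the caller's nice value and return the value read back afterwards.
 * Returns -100 on error; valid nice values lie in the range -20 to 19,
 * so -100 cannot be confused with a real result. */
int set_and_get_nice(int nice_value)
{
    if (setpriority(PRIO_PROCESS, 0, nice_value) == -1)
        return -100;

    errno = 0;  /* getpriority() can legitimately return -1 */
    int now = getpriority(PRIO_PROCESS, 0);
    if (now == -1 && errno != 0)
        return -100;
    return now;
}
```

       Raising the nice value (lowering priority) is normally permitted for
       any caller; lowering it again may require privilege or a suitable
       RLIMIT_NICE setting.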

   SCHED_BATCH: Scheduling batch processes
       (Since  Linux 2.6.16.)  SCHED_BATCH can be used only at static priority
       0.  This policy is similar to SCHED_OTHER  in  that  it  schedules  the
       thread  according  to  its  dynamic priority (based on the nice value).
       The difference is that this policy will cause the scheduler  to  always
       assume  that  the thread is CPU-intensive.  Consequently, the scheduler
       will apply a small scheduling penalty with respect to wakeup  behavior,
       so that this thread is mildly disfavored in scheduling decisions.

       This policy is useful for workloads that are noninteractive, but do not
       want to lower their nice value, and for workloads that want a determin-
       istic scheduling policy without interactivity causing extra preemptions
       (between the workload's tasks).

   SCHED_IDLE: Scheduling very low priority jobs
       (Since Linux 2.6.23.)  SCHED_IDLE can be used only at  static  priority
       0; the process nice value has no influence for this policy.

       This policy is intended for running jobs at extremely low priority
       (lower even than a +19 nice value with the SCHED_OTHER or SCHED_BATCH
       policies).

   Resetting scheduling policy for child processes
       Each thread has a reset-on-fork scheduling flag.  When this flag is
       set, children created by fork(2) do not inherit privileged scheduling
       policies.

       The  reset-on-fork feature is intended for media-playback applications,
       and can be used  to  prevent  applications  evading  the  RLIMIT_RTTIME
       resource limit (see getrlimit(2)) by creating multiple child processes.

       More  precisely,  if the reset-on-fork flag is set, the following rules
       apply for subsequently created children:

       *  If the calling thread has  a  scheduling  policy  of  SCHED_FIFO  or
          SCHED_RR, the policy is reset to SCHED_OTHER in child processes.

       *  If  the calling process has a negative nice value, the nice value is
          reset to zero in child processes.

       After the reset-on-fork flag has been enabled, it can be reset only  if
       the  thread  has the CAP_SYS_NICE capability.  This flag is disabled in
       child processes created by fork(2).
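
       The following sketch shows one way to experiment with the flag.  It
       assumes the flag is set by ORing SCHED_RESET_ON_FORK into the policy
       argument of sched_setscheduler(2), a detail not spelled out in the
       text above; both helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Set reset-on-fork on the calling thread, keeping its current policy
 * and priority.  Returns 0 on success, -1 on error. */
int enable_reset_on_fork(void)
{
    struct sched_param param;
    int policy = sched_getscheduler(0);

    if (policy == -1 || sched_getparam(0, &param) == -1)
        return -1;
    return sched_setscheduler(0, policy | SCHED_RESET_ON_FORK, &param);
}

/* Fork a child and report whether the child sees the flag:
 * 1 if set, 0 if clear, -1 on error. */
int child_has_reset_on_fork(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        int policy = sched_getscheduler(0);
        if (policy == -1)
            _exit(2);
        _exit((policy & SCHED_RESET_ON_FORK) ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}
```

       Since the flag is disabled in children created by fork(2),
       child_has_reset_on_fork() is expected to report 0 even after
       enable_reset_on_fork() succeeds in the parent.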

   Privileges and resource limits
       In Linux kernels before 2.6.12, only privileged (CAP_SYS_NICE)  threads
       can  set  a  nonzero  static priority (i.e., set a real-time scheduling
       policy).  The only change that an unprivileged thread can  make  is  to
       set  the SCHED_OTHER policy, and this can be done only if the effective
       user ID of the caller matches the real or effective user ID of the
       target thread (i.e., the thread specified by pid) whose policy is being
       changed.

       A thread must be privileged (CAP_SYS_NICE) in order to set or modify  a
       SCHED_DEADLINE policy.

       Since  Linux 2.6.12, the RLIMIT_RTPRIO resource limit defines a ceiling
       on an unprivileged  thread's  static  priority  for  the  SCHED_RR  and
       SCHED_FIFO policies.  The rules for changing scheduling policy and pri-
       ority are as follows:

       *  If an unprivileged thread has a nonzero RLIMIT_RTPRIO soft limit,
          then it can change its scheduling policy and priority, subject to
          the restriction that the priority cannot be set to a value higher
          than the maximum of its current priority and its RLIMIT_RTPRIO soft
          limit.

       *  If the RLIMIT_RTPRIO soft limit is 0, then the only permitted
          changes are to lower the priority, or to switch to a non-real-time
          policy.

       *  Subject to the same rules, another unprivileged thread can also make
          these changes, as long as the effective user ID of the thread making
          the change matches the real or effective user ID of the target
          thread.

       *  Special rules apply for the SCHED_IDLE policy.  In Linux kernels
          before 2.6.39, an unprivileged thread operating under this policy
          cannot change its policy, regardless of the value of its
          RLIMIT_RTPRIO resource limit.  In Linux kernels since 2.6.39, an
          unprivileged thread can switch to either the SCHED_BATCH or the
          SCHED_OTHER policy so long as its nice value falls within the range
          permitted by its RLIMIT_NICE resource limit (see getrlimit(2)).
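
       The first two rules above amount to a simple ceiling calculation,
       sketched below.  The helper name unprivileged_priority_ceiling() is an
       assumption made for this example, and the result is capped at the
       maximum SCHED_FIFO priority.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/resource.h>

/* Highest real-time priority an unprivileged caller could request:
 * the maximum of its current priority and its RLIMIT_RTPRIO soft limit.
 * Returns -1 on error. */
int unprivileged_priority_ceiling(void)
{
    struct rlimit lim;
    struct sched_param param;
    int max = sched_get_priority_max(SCHED_FIFO);

    if (max == -1)
        return -1;
    if (getrlimit(RLIMIT_RTPRIO, &lim) == -1)
        return -1;
    if (sched_getparam(0, &param) == -1)
        return -1;

    /* RLIM_INFINITY and other large values are capped at `max`. */
    int ceiling = (lim.rlim_cur > (rlim_t)max) ? max : (int)lim.rlim_cur;

    return param.sched_priority > ceiling ? param.sched_priority : ceiling;
}
```

       A result of 0 means the caller cannot raise itself to a real-time
       priority without privilege.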

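   Limiting the CPU usage of real-time and deadline processes
       A nonblocking infinite loop in a thread scheduled under the
       SCHED_FIFO, SCHED_RR, or SCHED_DEADLINE policy will block all threads
       with lower priority forever.  Prior to Linux 2.6.25, the only way of
       preventing a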
       runaway  real-time  process from freezing the system was to run (at the
       console) a shell scheduled under a  higher  static  priority  than  the
       tested  application.  This allows an emergency kill of tested real-time
       applications that do not block or terminate as expected.

       Since Linux 2.6.25, there are other techniques for dealing with runaway
       real-time  and  deadline  processes.   One  of  these  is  to  use  the
       RLIMIT_RTTIME resource limit to set a ceiling on the CPU  time  that  a
       real-time process may consume.  See getrlimit(2) for details.
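
       A minimal sketch of setting the RLIMIT_RTTIME soft limit for the
       calling process.  The helper name limit_rt_cpu_time() is hypothetical,
       and the limit is assumed to be expressed in microseconds, as
       described in getrlimit(2).

```c
#include <sys/resource.h>

/* Set the RLIMIT_RTTIME soft limit to `usec` microseconds, clamped to the
 * existing hard limit.  Returns 0 on success, -1 on error. */
int limit_rt_cpu_time(rlim_t usec)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_RTTIME, &lim) == -1)
        return -1;
    if (lim.rlim_max != RLIM_INFINITY && usec > lim.rlim_max)
        usec = lim.rlim_max;
    lim.rlim_cur = usec;   /* the hard limit is left unchanged */
    return setrlimit(RLIMIT_RTTIME, &lim);
}
```

       Setting the limit before switching to a real-time policy gives a
       misbehaving test program a bounded amount of uninterrupted CPU time.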

       Since  version  2.6.25, Linux also provides two /proc files that can be
       used to reserve a certain amount of CPU time to be  used  by  non-real-
       time  processes.   Reserving  some CPU time in this fashion allows some
       CPU time to be allocated to (say) a root shell that can be used to kill
       a runaway process.  Both of these files specify time values in
       microseconds:
       /proc/sys/kernel/sched_rt_period_us
              This file specifies a scheduling period that is equivalent to
              100% CPU bandwidth.  The value in this file can range from 1 to
              INT_MAX, giving an operating range of 1 microsecond to around 35
              minutes.  The default value in this file is 1,000,000 (1
              second).

       /proc/sys/kernel/sched_rt_runtime_us
              The value in this file specifies how much of the "period" time
              can be used by all real-time and deadline scheduled processes on
              the system.  The value  in  this  file  can  range  from  -1  to
              INT_MAX-1.   Specifying  -1  makes  the  runtime the same as the
              period; that is, no CPU time is set aside for non-real-time pro-
              cesses (which was the Linux behavior before kernel 2.6.25).  The
              default value in this file is 950,000  (0.95  seconds),  meaning
              that 5% of the CPU time is reserved for processes that don't run
              under a real-time or deadline scheduling policy.
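
       The relationship between the two values can be worked through in a
       few lines.  In this sketch the helper names are hypothetical, and the
       two files are assumed to be /proc/sys/kernel/sched_rt_period_us and
       /proc/sys/kernel/sched_rt_runtime_us.

```c
#include <stdio.h>

#define RT_PERIOD_PATH  "/proc/sys/kernel/sched_rt_period_us"
#define RT_RUNTIME_PATH "/proc/sys/kernel/sched_rt_runtime_us"

/* Read one integer from `path` into *out.  Returns 0 on success. */
int read_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Fraction of CPU time left for processes that are not scheduled under a
 * real-time or deadline policy.  A runtime of -1 means the runtime equals
 * the period, so nothing is reserved.  Returns -1.0 for an invalid period. */
double reserved_fraction(long period_us, long runtime_us)
{
    if (period_us <= 0)
        return -1.0;
    if (runtime_us < 0 || runtime_us >= period_us)
        return 0.0;
    return 1.0 - (double)runtime_us / (double)period_us;
}
```

       With the default values, reserved_fraction(1000000, 950000) evaluates
       to approximately 0.05, matching the 5% stated above.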

   Response time
       A blocked high priority thread waiting for I/O has a  certain  response
       time  before  it  is  scheduled  again.   The  device driver writer can
       greatly reduce this response time by using a "slow interrupt" interrupt
       handler.

   Miscellaneous
       Child processes inherit the scheduling policy and parameters across a
       fork(2).  The scheduling policy and parameters are preserved across
       execve(2).

       Memory  locking is usually needed for real-time processes to avoid pag-
       ing delays; this can be done with mlock(2) or mlockall(2).
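
       A brief sketch of locking memory before time-critical work begins.
       The helper names and the prefault size are assumptions chosen for
       illustration, and the 4096-byte stride assumes a typical page size.

```c
#include <stddef.h>
#include <sys/mman.h>

#define PREFAULT_STACK_BYTES (64 * 1024)  /* illustrative size only */

/* Lock all current and future pages of the process into RAM.
 * Returns 0 on success, -1 on error (for example, when the request
 * exceeds the RLIMIT_MEMLOCK limit of an unprivileged process). */
int lock_all_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Touch a region of stack so that its pages are already resident
 * before the time-critical part of the program runs. */
void prefault_stack(void)
{
    volatile unsigned char buf[PREFAULT_STACK_BYTES];

    for (size_t i = 0; i < sizeof(buf); i += 4096)
        buf[i] = 0;
}
```

       Calling lock_all_memory() and then prefault_stack() once during
       initialization is a common pattern; munlockall(2) undoes the lock.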

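       Originally, standard Linux was intended as a general-purpose operating
       system able to handle background processes, interactive applications,
       and less demanding real-time applications (applications that need to
       usually meet timing deadlines).  Until the real-time patches have been
       completely merged into the mainline kernel, they must be installed to
       achieve the best real-time performance.  These patches are named: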


       and  can  be  downloaded  from  <

       Without the patches and prior to their full inclusion into the mainline
       kernel, the kernel configuration offers only the three preemption
       classes CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY, and
       CONFIG_PREEMPT_DESKTOP, which respectively provide no, some, and
       considerable reduction of the worst-case scheduling latency.

       With  the  patches applied or after their full inclusion into the main-
       line  kernel,  the  additional  configuration  item   CONFIG_PREEMPT_RT
       becomes  available.   If  this is selected, Linux is transformed into a
       regular real-time operating system.  The FIFO and RR  scheduling  poli-
       cies  are  then used to run a thread with true real-time priority and a
       minimum worst-case scheduling latency.

       chrt(1), taskset(1), getpriority(2), mlock(2), mlockall(2), munlock(2),
       munlockall(2), nice(2), sched_get_priority_max(2),
       sched_get_priority_min(2), sched_getscheduler(2), sched_getaffinity(2),
       sched_getparam(2), sched_rr_get_interval(2), sched_setaffinity(2),
       sched_setscheduler(2), sched_setparam(2), sched_yield(2),
       setpriority(2), pthread_getaffinity_np(3), pthread_setaffinity_np(3),
       sched_getcpu(3), capabilities(7), cpuset(7)

       Programming for the real  world  -  POSIX.4  by  Bill  O.  Gallmeister,
       O'Reilly & Associates, Inc., ISBN 1-56592-074-0.

       The    Linux   kernel   source   files   Documentation/scheduler/sched-
       deadline.txt,               Documentation/scheduler/sched-rt-group.txt,
       Documentation/scheduler/sched-design-CFS.txt,                       and

       This page is part of release 4.04 of the Linux  man-pages  project.   A
       description  of  the project, information about reporting bugs, and the
       latest    version    of    this    page,    can     be     found     at

Linux                             2015-07-23                          SCHED(7)