Linux RCU (Read-Copy Update)


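A reader must not block while it is accessing shared data protected by
RCU; this is a basic premise of the RCU mechanism. In other words, while
a reader holds a reference to RCU-protected shared data, the CPU it runs
on must not perform a context switch: the reader may neither block
voluntarily (for example, to wait for a resource) nor be preempted.
Spinlocks and rwlocks impose the same requirement. A writer accessing
RCU-protected shared data does not compete with readers for any lock;
only when there is more than one writer must it acquire some lock to
synchronize with the other writers. Before modifying data, a writer
first makes a copy of the element to be modified and changes the copy.
When it is done, it registers a callback function with the garbage
collector so that the real modification is carried out at an appropriate
time. The wait for that appropriate time is called the grace period, and
a context switch on a CPU is called passing through a quiescent state;
the grace period is the time needed for every CPU to pass through a
quiescent state once. After the grace period, the garbage collector
invokes the callback registered by the writer to complete the actual
data modification or data release.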

However, with the PREEMPT_RT patch applied or CONFIG_PREEMPT_RCU
configured, read-side critical sections can be preempted, so confirming
a quiescent state requires additional mechanisms, such as reference
counts.
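
The classic, non-preemptible scheme described above can be simulated in
ordinary user-space C. The following is a minimal single-threaded sketch,
not kernel code: NR_CPUS, struct toy_rcu_head, toy_call_rcu(),
toy_context_switch(), and count_callback() are hypothetical names
invented for illustration, and a "context switch" is just a function
call that bumps a per-CPU counter. A callback runs only once every
simulated CPU has passed through a quiescent state after the callback
was registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4   /* assumption: CPU count for this simulation only */

/* Per-CPU count of context switches (quiescent states) seen so far. */
static unsigned long qs_count[NR_CPUS];

struct toy_rcu_head {
    void (*func)(struct toy_rcu_head *head);
    unsigned long snapshot[NR_CPUS];   /* qs_count[] at registration */
    struct toy_rcu_head *next;
};

static struct toy_rcu_head *pending;   /* callbacks awaiting a grace period */
static int callbacks_invoked;

/* Example callback: a real one would free or finalize the old data. */
static void count_callback(struct toy_rcu_head *head)
{
    (void)head;
    callbacks_invoked++;
}

/* Writer side: register a callback to run after a grace period. */
static void toy_call_rcu(struct toy_rcu_head *head,
                         void (*func)(struct toy_rcu_head *head))
{
    head->func = func;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        head->snapshot[cpu] = qs_count[cpu];
    head->next = pending;
    pending = head;
}

/* True once every CPU has context-switched since registration. */
static bool grace_period_elapsed(const struct toy_rcu_head *head)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (qs_count[cpu] == head->snapshot[cpu])
            return false;
    return true;
}

/* Simulate a context switch on one CPU, then run any ready callbacks. */
static void toy_context_switch(int cpu)
{
    struct toy_rcu_head **pp = &pending;

    qs_count[cpu]++;
    while (*pp) {
        struct toy_rcu_head *head = *pp;

        if (grace_period_elapsed(head)) {
            *pp = head->next;      /* unlink before invoking */
            head->func(head);
        } else {
            pp = &head->next;
        }
    }
}
```

Registering a callback with toy_call_rcu() and then calling
toy_context_switch() for CPUs 0, 1, and 2 leaves the callback pending;
it fires on the call for CPU 3, when the last CPU reports its quiescent
state.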

References
New locking mechanism in the Linux 2.6 kernel: RCU http://www.ibm.com/developerworks/cn/linux/l-rcu/
An analysis of the call_rcu() function http://blog.chinaunix.net/u1/55599/showart_1101879.html


For more information, see http://www.rdrop.com/users/paulmck/RCU

linux-2.6.24/Documentation/RCU/whatisRCU.txt

1.    RCU OVERVIEW
2.    WHAT IS RCU’S CORE API?
3.    WHAT ARE SOME EXAMPLE USES OF CORE RCU API?
4.    WHAT IF MY UPDATING THREAD CANNOT BLOCK?
5.    WHAT ARE SOME SIMPLE IMPLEMENTATIONS OF RCU?
6.    ANALOGY WITH READER-WRITER LOCKING
7.    FULL LIST OF RCU APIs
8.    ANSWERS TO QUICK QUIZZES

1. RCU OVERVIEW

The basic idea behind RCU is to split updates into "removal" and
"reclamation" phases. The removal phase removes references to data items
within a data structure (possibly by replacing them with references to
new versions of these data items), and can run concurrently with readers.
The reason that it is safe to run the removal phase concurrently with
readers is that the semantics of modern CPUs guarantee that readers will see
either the old or the new version of the data structure rather than a
partially updated reference. The reclamation phase does the work of reclaiming
(e.g., freeing) the data items removed from the data structure during the
removal phase. Because reclaiming data items can disrupt any readers
concurrently referencing those data items, the reclamation phase must
not start until readers no longer hold references to those data items.

Splitting the update into removal and reclamation phases permits the
updater to perform the removal phase immediately, and to defer the
reclamation phase until all readers active during the removal phase have
completed, either by blocking until they finish or by registering a
callback that is invoked after they finish. Only readers that are active
during the removal phase need be considered, because any reader starting
after the removal phase will be unable to gain a reference to the removed
data items, and therefore cannot be disrupted by the reclamation phase.

So the typical RCU update sequence goes something like the following:

a.    Remove pointers to a data structure, so that subsequent
    readers cannot gain a reference to it.

b.    Wait for all previous readers to complete their RCU read-side
    critical sections.

c.    At this point, there cannot be any readers who hold references
    to the data structure, so it now may safely be reclaimed
    (e.g., kfree()d).
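
The three steps above can be sketched in user-space C. This is a toy
model under stated assumptions, not the kernel implementation:
toy_read_lock(), toy_read_unlock(), and toy_synchronize() are
hypothetical stand-ins built on a shared C11 atomic reader counter,
which is far heavier than real RCU readers and is used here only to
make the ordering of the steps visible. struct config and global_cfg
are likewise invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct config {
    int value;
};

static _Atomic(struct config *) global_cfg;  /* the protected pointer */
static atomic_int active_readers;            /* toy grace-period tracking */

static void toy_read_lock(void)   { atomic_fetch_add(&active_readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&active_readers, 1); }

/* Toy stand-in for step (b): spin until no reader is in a critical section. */
static void toy_synchronize(void)
{
    while (atomic_load(&active_readers) != 0)
        ;
}

/* Reader: dereference the pointer only inside the critical section. */
static int read_value(void)
{
    int v;

    toy_read_lock();
    struct config *cfg = atomic_load(&global_cfg);
    v = cfg ? cfg->value : -1;
    toy_read_unlock();
    return v;
}

/* Updater: returns 0 on success, -1 if allocation fails. */
static int update_value(int new_value)
{
    struct config *new_cfg = malloc(sizeof(*new_cfg));
    struct config *old_cfg;

    if (!new_cfg)
        return -1;
    new_cfg->value = new_value;

    /* (a) Removal: later readers can only find new_cfg. */
    old_cfg = atomic_exchange(&global_cfg, new_cfg);

    /* (b) Wait for readers that might still hold old_cfg. */
    toy_synchronize();

    /* (c) Reclamation: no reader can reference old_cfg any more. */
    free(old_cfg);
    return 0;
}
```

Calling update_value(1) and then read_value() yields 1; a second
update_value(2) replaces the structure and frees the first version only
after the wait in step (b).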

Step (b) above is the key idea underlying RCU’s deferred destruction.
The ability to wait until all readers are done allows RCU readers to
use much lighter-weight synchronization, in some cases, absolutely no
synchronization at all. In contrast, in more conventional lock-based
schemes, readers must use heavy-weight synchronization in order to
prevent an updater from deleting the data structure out from under them.
This is because lock-based updaters typically update data items in place,
and must therefore exclude readers. In contrast, RCU-based updaters
typically take advantage of the fact that writes to single aligned
pointers are atomic on modern CPUs, allowing atomic insertion, removal,
and replacement of data items in a linked structure without disrupting
readers. Concurrent RCU readers can then continue accessing the old
versions, and can dispense with the atomic operations, memory barriers,
and communications cache misses that are so expensive on present-day
SMP computer systems, even in the absence of lock contention.

In the three-step procedure shown above, the updater is performing both
the removal and the reclamation step, but it is often helpful for an
entirely different thread to do the reclamation, as is in fact the case
in the Linux kernel’s directory-entry cache (dcache). Even if the same
thread performs both the update step (step (a) above) and the reclamation
step (step (c) above), it is often helpful to think of them separately.
For example, RCU readers and updaters need not communicate at all,
but RCU provides implicit low-overhead communication between readers
and reclaimers, namely, in step (b) above.

2. WHAT IS RCU’S CORE API?

The core RCU API is quite small:

a.      rcu_read_lock()
b.      rcu_read_unlock()
c.      synchronize_rcu() / call_rcu()
d.      rcu_assign_pointer()
e.      rcu_dereference()

There are many other members of the RCU API, but the rest can be
expressed in terms of these five, though most implementations instead
express synchronize_rcu() in terms of the call_rcu() callback API.

The five core RCU APIs are described below; the other 18 will be enumerated
later. See the kernel docbook documentation for more info, or look directly
at the function header comments.
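
The publish/subscribe role of rcu_assign_pointer() and rcu_dereference()
can be approximated in user space. The sketch below is an assumption-laden
model, not the kernel's definition of these primitives: it uses a C11
release store and an acquire load as stand-ins, and toy_assign_pointer(),
toy_dereference(), struct foo, and gp are hypothetical names. The point
it illustrates is that the fields are initialized before the pointer is
published, so a reader that sees the pointer also sees the initialized
fields.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
    int a;
    int b;
};

static _Atomic(struct foo *) gp;   /* globally visible pointer */
static struct foo storage;         /* backing object for the example */

/* Stand-in for rcu_assign_pointer(): order the initialization before
 * the store that makes the pointer visible. */
static void toy_assign_pointer(struct foo *p)
{
    atomic_store_explicit(&gp, p, memory_order_release);
}

/* Stand-in for rcu_dereference(): fetch the pointer so that later
 * accesses through it see the publisher's initialization. */
static struct foo *toy_dereference(void)
{
    return atomic_load_explicit(&gp, memory_order_acquire);
}

/* Updater: initialize first, publish last. */
static void publish(int a, int b)
{
    storage.a = a;
    storage.b = b;
    toy_assign_pointer(&storage);
}

/* Reader: returns -1 while nothing has been published. */
static int read_sum(void)
{
    struct foo *p = toy_dereference();

    return p ? p->a + p->b : -1;
}
```

In kernel code the reader side would additionally be bracketed by
rcu_read_lock() and rcu_read_unlock(); that bracketing is left out here
because the object in this example is never reclaimed.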

rcu_read_lock()

        void rcu_read_lock(void);

        Used by a reader to inform the reclaimer that the reader is
        entering an RCU read-side critical section. It is illegal
        to block while in an RCU read-side critical section, though
        kernels built with CONFIG_PREEMPT_RCU can preempt RCU read-side
        critical sections. Any RCU-protected data structure accessed
        during an RCU read-side critical section is guaranteed to remain
        unreclaimed for the full duration of that critical section.
        Reference counts may be used in conjunction with RCU to maintain
        longer-term references to data structures.
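
        The last point, combining reference counts with RCU, can be
        sketched in user-space C. This is a toy model under stated
        assumptions: toy_read_lock() and toy_read_unlock() are
        hypothetical stand-ins based on an atomic counter, and
        struct session with its helpers is invented for illustration.
        The reader pins the object inside the critical section and
        keeps using it afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct session {
    atomic_int refcount;   /* one reference is held by the publication */
    int id;
};

static _Atomic(struct session *) current_session;
static atomic_int readers;       /* toy grace-period tracking */
static int sessions_freed;       /* instrumentation for the example */

static void toy_read_lock(void)   { atomic_fetch_add(&readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&readers, 1); }

/* Drop one reference; the last one frees the object. */
static void session_put(struct session *s)
{
    if (atomic_fetch_sub(&s->refcount, 1) == 1) {
        free(s);
        sessions_freed++;
    }
}

/* Publish a new session; assumes none is currently published. */
static int session_publish(int id)
{
    struct session *s = malloc(sizeof(*s));

    if (!s)
        return -1;
    atomic_init(&s->refcount, 1);
    s->id = id;
    atomic_store(&current_session, s);
    return 0;
}

/* Reader: look up under read-side protection, then take a reference
 * so the object stays valid after the critical section ends. */
static struct session *session_get_current(void)
{
    struct session *s;

    toy_read_lock();
    s = atomic_load(&current_session);
    if (s)
        atomic_fetch_add(&s->refcount, 1);
    toy_read_unlock();
    return s;
}

/* Updater: unpublish, wait for readers, drop the publication reference. */
static void session_retire(void)
{
    struct session *old = atomic_exchange(&current_session, NULL);

    if (!old)
        return;
    while (atomic_load(&readers) != 0)
        ;                       /* toy grace period */
    session_put(old);
}
```

        A caller that obtained a pointer from session_get_current()
        can still read it after session_retire() returns; the memory
        is released only when that caller invokes session_put().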

==============================================================

linux-2.6.24/Documentation/RCU/UP.txt

        What locking restriction must RCU callbacks respect?

        Any lock that is acquired within an RCU callback must be
        acquired elsewhere using an _irq variant of the spinlock
        primitive. For example, if "mylock" is acquired by an
        RCU callback, then a process-context acquisition of this
        lock must use something like spin_lock_irqsave() to
        acquire the lock. (<by jfo> i.e., to disable interrupts)

        If the process-context code were to simply use spin_lock(),
        then, since RCU callbacks can be invoked from softirq context,
        the callback might be called from a softirq that interrupted
        the process-context critical section. This would result in
        self-deadlock.

        This restriction might seem gratuitous, since very few RCU
        callbacks acquire locks directly. However, a great many RCU
        callbacks do acquire locks -indirectly-, for example, via
        the kfree() primitive. (<by jfo> inside kmalloc() and kfree(),
        the local CPU’s interrupts are disabled!)
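
        The self-deadlock scenario can be illustrated with a small
        single-CPU simulation in user-space C. Nothing here is kernel
        API: the flags, toy_rcu_callback(), toy_maybe_interrupt(), and
        process_context_update() are hypothetical, and the "lock" is a
        plain boolean. Instead of spinning forever, the callback counts
        the cases in which a real spinlock would deadlock.

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_held;            /* toy spinlock on a single CPU */
static bool irqs_enabled = true;
static bool softirq_pending;      /* an RCU callback is waiting to run */
static int  deadlocks;            /* callback found the lock already held */
static int  callbacks_run;

/* The RCU callback needs the same lock as the process-context code. */
static void toy_rcu_callback(void)
{
    if (lock_held) {              /* a real spinlock would spin forever */
        deadlocks++;
        return;
    }
    lock_held = true;
    callbacks_run++;
    lock_held = false;
}

/* Point at which an interrupt may arrive; the pending callback runs
 * from "softirq" context only if interrupts are enabled. */
static void toy_maybe_interrupt(void)
{
    if (irqs_enabled && softirq_pending) {
        softirq_pending = false;
        toy_rcu_callback();
    }
}

/* Process-context critical section.
 * disable_irqs == true models spin_lock_irqsave();
 * disable_irqs == false models plain spin_lock(). */
static void process_context_update(bool disable_irqs)
{
    bool saved = irqs_enabled;

    if (disable_irqs)
        irqs_enabled = false;
    lock_held = true;

    toy_maybe_interrupt();        /* interrupt arrives mid-section */

    lock_held = false;
    irqs_enabled = saved;         /* models the irqrestore half */
    toy_maybe_interrupt();        /* a deferred softirq runs now */
}
```

        With softirq_pending set, process_context_update(false) lets
        the callback run while the lock is held and records a
        would-be deadlock; process_context_update(true) defers the
        callback until the lock has been released.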
