https://zhuanlan.zhihu.com/p/50101525
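This article analyzes the channel in the Rust standard library. Channels are widely used as a means of communication between threads.
Rust provides a multi-producer, single-consumer (MPSC) channel. We focus on the case with multiple producers.
Its implementation is quite interesting. I split it into two parts: channel upgrade and the concurrent queue.
This article covers channel upgrade.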
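Before diving into the internals, here is a minimal sketch of the multi-producer case as seen from the public API. The helper name sum_from_producers is made up for this illustration; only std::sync::mpsc::channel, Sender::clone, Sender::send and Receiver::iter come from the standard library.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: spawns `producers` threads, each sending its own
/// index once, and returns the sum of everything the single receiver gets.
fn sum_from_producers(producers: u32) -> u32 {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for id in 0..producers {
        // Each producer thread owns its own clone of the Sender.
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            tx.send(id).expect("receiver is still alive");
        }));
    }
    // Drop the original Sender so the channel closes once every clone is gone.
    drop(tx);
    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    // `iter()` ends when all Senders have been dropped.
    let total: u32 = rx.iter().sum();
    total
}
```

Calling sum_from_producers(4) adds up the indices 0, 1, 2 and 3 sent by the four producers.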
A call to channel() gives us a (sender, receiver) pair that starts out as oneshot, as the source code hints:

    #[stable(feature = "rust1", since = "1.0.0")]
    pub fn channel<T>() -> (Sender<T>, Receiver<T>) {
        let a = Arc::new(oneshot::Packet::new());
        (Sender::new(Flavor::Oneshot(a.clone())), Receiver::new(Flavor::Oneshot(a)))
    }
At least four types are involved here:
- oneshot::Packet: the Packet, where the data actually lives. Here it holds a single value (other flavors may use a queue).
- Flavor::Oneshot.
- Sender and Receiver.
Let's look at the source of each data structure in turn, starting with oneshot::Packet, which lives in mpsc/oneshot.rs:

    pub struct Packet<T> {
        // Internal state of the chan/port pair (stores the blocked thread as well)
        state: AtomicUsize,
        // One-shot data slot location
        data: UnsafeCell<Option<T>>,
        // when used for the second time, a oneshot channel must be upgraded, and
        // this contains the slot for the upgrade
        upgrade: UnsafeCell<MyUpgrade<T>>,
    }

As we can see, data has room for exactly one value, and the upgrade field is used for channel upgrade.
There are other Packet types as well. The same directory contains shared::Packet, stream::Packet and sync::Packet, in shared.rs, stream.rs and sync.rs respectively. We focus on shared::Packet:

    pub struct Packet<T> {
        queue: mpsc::Queue<T>,
        cnt: AtomicIsize, // How many items are on this channel
        steals: UnsafeCell<isize>, // How many times has a port received without blocking?
        to_wake: AtomicUsize, // SignalToken for wake up

        // The number of channels which are currently using this packet.
        channels: AtomicUsize,

        // See the discussion in Port::drop and the channel send methods for what
        // these are used for
        port_dropped: AtomicBool,
        sender_drain: AtomicIsize,

        // this lock protects various portions of this implementation during
        // select()
        select_lock: Mutex<()>,
    }

The queue field stands out: it is where the data is stored. We will set the data fields aside for now.
To cover these four Packet types, the standard library provides enum Flavor<T>:

    enum Flavor<T> {
        Oneshot(Arc<oneshot::Packet<T>>),
        Stream(Arc<stream::Packet<T>>),
        Shared(Arc<shared::Packet<T>>),
        Sync(Arc<sync::Packet<T>>),
    }

Our Sender and Receiver objects, in turn, do nothing more than store a Flavor<T>:

    pub struct Sender<T> {
        inner: UnsafeCell<Flavor<T>>,
    }

    pub struct Receiver<T> {
        inner: UnsafeCell<Flavor<T>>,
    }
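Let's look at fn channel once more:

    pub fn channel<T>() -> (Sender<T>, Receiver<T>) {
        let a = Arc::new(oneshot::Packet::new());
        (Sender::new(Flavor::Oneshot(a.clone())), Receiver::new(Flavor::Oneshot(a)))
    }

Now it is clear that Sender and Receiver each store a Flavor, that the Flavor variant determines the Packet type, and that the Packet itself is shared safely between the two ends through an Arc.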
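The sharing arrangement can be sketched in isolation. The types below are simplified stand-ins invented for this illustration (empty packets, no generics), not the standard library's definitions; the point is only that both ends hold a clone of the same Arc.

```rust
use std::sync::Arc;

// Hypothetical, empty stand-ins for the real packet types.
struct OneshotPacket;
#[allow(dead_code)]
struct SharedPacket;

#[allow(dead_code)]
enum Flavor {
    Oneshot(Arc<OneshotPacket>),
    Shared(Arc<SharedPacket>),
}

/// Mirrors the shape of `channel()`: one packet, two handles to it.
fn make_pair() -> (Flavor, Flavor) {
    let a = Arc::new(OneshotPacket);
    (Flavor::Oneshot(a.clone()), Flavor::Oneshot(a))
}

/// True when both handles are the same variant and point at the same packet.
fn same_packet(tx: &Flavor, rx: &Flavor) -> bool {
    match (tx, rx) {
        (Flavor::Oneshot(a), Flavor::Oneshot(b)) => Arc::ptr_eq(a, b),
        (Flavor::Shared(a), Flavor::Shared(b)) => Arc::ptr_eq(a, b),
        _ => false,
    }
}
```

Two handles returned by the same make_pair() call compare as the same packet, while handles from different calls do not.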
So much for what a call to channel() returns. Since our focus is the multi-producer case, let's look at the implementation of Clone for Sender:

    impl<T> Clone for Sender<T> {
        fn clone(&self) -> Sender<T> {
            let packet = match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => {
                    let a = Arc::new(shared::Packet::new());
                    {
                        let guard = a.postinit_lock();
                        let rx = Receiver::new(Flavor::Shared(a.clone()));
                        let sleeper = match p.upgrade(rx) {
                            oneshot::UpSuccess | oneshot::UpDisconnected => None,
                            oneshot::UpWoke(task) => Some(task),
                        };
                        a.inherit_blocker(sleeper, guard);
                    }
                    a
                }
                Flavor::Stream(ref p) => {
                    let a = Arc::new(shared::Packet::new());
                    {
                        let guard = a.postinit_lock();
                        let rx = Receiver::new(Flavor::Shared(a.clone()));
                        let sleeper = match p.upgrade(rx) {
                            stream::UpSuccess | stream::UpDisconnected => None,
                            stream::UpWoke(task) => Some(task),
                        };
                        a.inherit_blocker(sleeper, guard);
                    }
                    a
                }
                Flavor::Shared(ref p) => {
                    p.clone_chan();
                    return Sender::new(Flavor::Shared(p.clone()));
                }
                Flavor::Sync(..) => unreachable!(),
            };

            unsafe {
                let tmp = Sender::new(Flavor::Shared(packet.clone()));
                mem::swap(self.inner_mut(), tmp.inner_mut());
            }
            Sender::new(Flavor::Shared(packet))
        }
    }
That is a lot of code, but we only care about the Flavor::Oneshot case. First, look at how self.inner() is implemented. It is an interface provided by the trait UnsafeFlavor:

    trait UnsafeFlavor<T> {
        fn inner_unsafe(&self) -> &UnsafeCell<Flavor<T>>;
        unsafe fn inner_mut(&self) -> &mut Flavor<T> {
            &mut *self.inner_unsafe().get()
        }
        unsafe fn inner(&self) -> &Flavor<T> {
            &*self.inner_unsafe().get()
        }
    }

    impl<T> UnsafeFlavor<T> for Sender<T> {
        fn inner_unsafe(&self) -> &UnsafeCell<Flavor<T>> {
            &self.inner
        }
    }

Since Sender stores inner: UnsafeCell<Flavor<T>>, inner() dereferences the cell's raw pointer and hands back a reference to the Flavor<T> inside, which clone then matches on:

    impl<T> Clone for Sender<T> {
        fn clone(&self) -> Sender<T> {
            let packet = match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => {
                    let a = Arc::new(shared::Packet::new());
                    {
                        let guard = a.postinit_lock();
                        let rx = Receiver::new(Flavor::Shared(a.clone()));
                        let sleeper = match p.upgrade(rx) {
                            oneshot::UpSuccess | oneshot::UpDisconnected => None,
                            oneshot::UpWoke(task) => Some(task),
                        };
                        a.inherit_blocker(sleeper, guard);
                    }
                    a
                }
                ............
            };

            unsafe {
                let tmp = Sender::new(Flavor::Shared(packet.clone()));
                mem::swap(self.inner_mut(), tmp.inner_mut());
            }
            Sender::new(Flavor::Shared(packet))
        }
    }
Next, Arc::new(shared::Packet::new()) creates a brand-new shared::Packet, named a.
Then a.postinit_lock() is called. Here is its code:

    pub fn postinit_lock(&self) -> MutexGuard<()> {
        self.select_lock.lock().unwrap()
    }
Taken together with the new function of shared::Packet:

    pub fn new() -> Packet<T> {
        Packet {
            queue: mpsc::Queue::new(),
            cnt: AtomicIsize::new(0),
            steals: UnsafeCell::new(0),
            to_wake: AtomicUsize::new(0),
            channels: AtomicUsize::new(2),
            port_dropped: AtomicBool::new(false),
            sender_drain: AtomicIsize::new(0),
            select_lock: Mutex::new(()),
        }
    }

we can see that it is just a lock operation on select_lock. The returned guard is what will release the lock later.
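The lock-then-hand-over-the-guard pattern can be tried out on its own. The Packet below is a one-field stand-in invented for this sketch, and lock_then_release is a hypothetical helper; Mutex::lock, Mutex::try_lock and MutexGuard are the real std types.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical one-field stand-in for shared::Packet.
struct Packet {
    select_lock: Mutex<()>,
}

impl Packet {
    // Same shape as the method quoted above: lock and return the guard.
    fn postinit_lock(&self) -> MutexGuard<'_, ()> {
        self.select_lock.lock().unwrap()
    }
}

/// Returns (is the lock held while the guard lives, is it held after drop).
fn lock_then_release() -> (bool, bool) {
    let p = Packet { select_lock: Mutex::new(()) };
    let guard = p.postinit_lock();
    // `try_lock` never blocks; it fails while the guard is alive.
    let held_while_guarded = p.select_lock.try_lock().is_err();
    // Dropping the guard is the unlock operation.
    drop(guard);
    let held_after_drop = p.select_lock.try_lock().is_err();
    (held_while_guarded, held_after_drop)
}
```

The guard is the only handle on the lock, which is why clone passes it on to inherit_blocker instead of unlocking explicitly.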
Back to the original code. This line is the key one:

    let rx = Receiver::new(Flavor::Shared(a.clone()));

From the newly created a we build a Receiver, rx. Creating a receiver here looks rather odd, but all we can do is read on:

    let sleeper = match p.upgrade(rx) {
        oneshot::UpSuccess | oneshot::UpDisconnected => None,
        oneshot::UpWoke(task) => Some(task),
    };
Here p is the original oneshot::Packet. We call its upgrade method, passing in the new rx:

    pub fn upgrade(&self, up: Receiver<T>) -> UpgradeResult {
        unsafe {
            let prev = match *self.upgrade.get() {
                NothingSent => NothingSent,
                SendUsed => SendUsed,
                _ => panic!("upgrading again"),
            };
            ptr::write(self.upgrade.get(), GoUp(up));

            match self.state.swap(DISCONNECTED, Ordering::SeqCst) {
                // If the channel is empty or has data on it, then we're good to go.
                // Senders will check the data before the upgrade (in case we
                // plastered over the DATA state).
                DATA | EMPTY => UpSuccess,

                // If the other end is already disconnected, then we failed the
                // upgrade. Be sure to trash the port we were given.
                DISCONNECTED => { ptr::replace(self.upgrade.get(), prev); UpDisconnected }

                // If someone's waiting, we gotta wake them up
                ptr => UpWoke(SignalToken::cast_from_usize(ptr))
            }
        }
    }
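Given how the upgrade field is initialized, and assuming clone is called before anything has been sent, its value can only be NothingSent:

    pub fn new() -> Packet<T> {
        Packet {
            data: UnsafeCell::new(None),
            upgrade: UnsafeCell::new(NothingSent),
            state: AtomicUsize::new(EMPTY),
        }
    }

Then GoUp(up) is written into the upgrade field, so our newly created rx: Receiver<T> now lives inside upgrade. Here is the code that defines the GoUp variant:

    enum MyUpgrade<T> {
        NothingSent,
        SendUsed,
        GoUp(Receiver<T>),
    }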
Next, self.state.swap changes the state to DISCONNECTED, because this oneshot::Packet is about to be retired. In our scenario this simply moves the state from EMPTY to DISCONNECTED. The relevant constants are:

    // Various states you can find a port in.
    const EMPTY: usize = 0;          // initial state: no data, no blocked receiver
    const DATA: usize = 1;           // data ready for receiver to take
    const DISCONNECTED: usize = 2;   // channel is disconnected OR upgraded
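The state transition can be sketched with nothing but an AtomicUsize. The constants are copied from the source above; UpgradeOutcome and mark_upgraded are names invented for this illustration, and the blocked-receiver token is kept as a plain usize instead of a SignalToken.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const EMPTY: usize = 0;
const DATA: usize = 1;
const DISCONNECTED: usize = 2;

// Hypothetical simplification of UpgradeResult.
#[derive(Debug, PartialEq)]
enum UpgradeOutcome {
    Success,
    Disconnected,
    Woke(usize),
}

/// Unconditionally stores DISCONNECTED and decides what to do based on the
/// state that was there before, as `upgrade` does with `state.swap`.
fn mark_upgraded(state: &AtomicUsize) -> UpgradeOutcome {
    match state.swap(DISCONNECTED, Ordering::SeqCst) {
        DATA | EMPTY => UpgradeOutcome::Success,
        DISCONNECTED => UpgradeOutcome::Disconnected,
        // Any other value stands for a blocked receiver's token.
        token => UpgradeOutcome::Woke(token),
    }
}
```

Because swap returns the previous value atomically, a second call on the same state observes DISCONNECTED and reports Disconnected.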
Finally, upgrade returns UpSuccess, a variant of UpgradeResult. Back to the clone code:

    impl<T> Clone for Sender<T> {
        fn clone(&self) -> Sender<T> {
            let packet = match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => {
                    let a = Arc::new(shared::Packet::new());
                    {
                        let guard = a.postinit_lock();
                        let rx = Receiver::new(Flavor::Shared(a.clone()));
                        let sleeper = match p.upgrade(rx) {
                            oneshot::UpSuccess | oneshot::UpDisconnected => None,
                            oneshot::UpWoke(task) => Some(task),
                        };
                        a.inherit_blocker(sleeper, guard);
                    }
                    a
                }
                ............
            };
            ..................
        }
    }

The result of p.upgrade(rx) is UpSuccess, so sleeper is None.
Next, the implementation of a.inherit_blocker(sleeper, guard):

    pub fn inherit_blocker(&self,
                           token: Option<SignalToken>,
                           guard: MutexGuard<()>) {
        token.map(|token| {
            assert_eq!(self.cnt.load(Ordering::SeqCst), 0);
            assert_eq!(self.to_wake.load(Ordering::SeqCst), 0);
            self.to_wake.store(unsafe { token.cast_to_usize() }, Ordering::SeqCst);
            self.cnt.store(-1, Ordering::SeqCst);
            unsafe { *self.steals.get() = -1; }
        });
        drop(guard);
    }

The token passed in, that is sleeper, is None, and None.map(..) just returns None without running the closure. So all this call does is release the lock by dropping guard. At this point the block evaluates to a, which is bound to packet: Arc<shared::Packet<T>>.
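A reduced sketch of the two cases makes the None path explicit. SharedState and this free-standing inherit_blocker are simplifications invented for illustration (the token is a plain usize, and steals and the guard are left out); the sketch uses if let, which behaves like the map call above for this purpose.

```rust
use std::sync::atomic::{AtomicIsize, AtomicUsize, Ordering};

// Hypothetical subset of shared::Packet's fields.
struct SharedState {
    cnt: AtomicIsize,
    to_wake: AtomicUsize,
}

/// With `None` nothing is touched; with `Some(token)` the blocked receiver
/// is carried over to the new packet.
fn inherit_blocker(state: &SharedState, token: Option<usize>) {
    if let Some(token) = token {
        state.to_wake.store(token, Ordering::SeqCst);
        state.cnt.store(-1, Ordering::SeqCst);
    }
}
```

In the walkthrough above only the None case occurs, so the new shared packet keeps its freshly initialized counters.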
With packet in hand, the rest of clone runs:

    impl<T> Clone for Sender<T> {
        fn clone(&self) -> Sender<T> {
            let packet = match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => {
                    let a = Arc::new(shared::Packet::new());
                    {
                        let guard = a.postinit_lock();
                        let rx = Receiver::new(Flavor::Shared(a.clone()));
                        let sleeper = match p.upgrade(rx) {
                            oneshot::UpSuccess | oneshot::UpDisconnected => None,
                            oneshot::UpWoke(task) => Some(task),
                        };
                        a.inherit_blocker(sleeper, guard);
                    }
                    a
                }
                ............
            };

            unsafe {
                let tmp = Sender::new(Flavor::Shared(packet.clone()));
                mem::swap(self.inner_mut(), tmp.inner_mut());
            }
            Sender::new(Flavor::Shared(packet))
        }
    }

Note that Sender::new(Flavor::Shared(packet)) returns a new Sender backed by the shared::Packet. In addition, we construct a temporary Sender, tmp, and use mem::swap to replace the inner of the current object. mem::swap itself is a safe function; the unsafe part is obtaining a &mut Flavor<T> from &self through inner_mut(), which is possible because inner is an UnsafeCell<Flavor<T>>. The current Sender's flavor therefore changes as follows:

    Flavor::Oneshot(Arc<oneshot::Packet<T>>)
    => Flavor::Shared(Arc<shared::Packet<T>>)
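The in-place replacement through a shared reference can be reproduced in a few lines. Handle and the payload-free Flavor below are stand-ins invented for this sketch, and it is only sound under the stated single-threaded, no-outstanding-borrow assumption.

```rust
use std::cell::UnsafeCell;
use std::mem;

// Hypothetical payload-free version of Flavor.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Flavor {
    Oneshot,
    Shared,
}

// Hypothetical stand-in for Sender.
struct Handle {
    inner: UnsafeCell<Flavor>,
}

impl Handle {
    fn new(flavor: Flavor) -> Self {
        Handle { inner: UnsafeCell::new(flavor) }
    }

    /// Swaps `Shared` into the cell through `&self` and returns the old
    /// flavor, which plays the role of `tmp` after the swap.
    fn upgrade_in_place(&self) -> Flavor {
        let mut tmp = Flavor::Shared;
        // SAFETY (assumption of this sketch): no other reference into the
        // cell exists while the swap runs, and the handle is not shared
        // across threads.
        unsafe { mem::swap(&mut *self.inner.get(), &mut tmp) };
        tmp
    }

    fn into_flavor(self) -> Flavor {
        self.inner.into_inner()
    }
}
```

After the swap the handle holds Shared, and the value that falls out is the old Oneshot flavor, which is exactly what tmp ends up owning in clone.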
As for the tmp object, let's look at its drop method. Because of the swap, it takes the Flavor::Oneshot path:

    impl<T> Drop for Sender<T> {
        fn drop(&mut self) {
            match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => p.drop_chan(),
                Flavor::Stream(ref p) => p.drop_chan(),
                Flavor::Shared(ref p) => p.drop_chan(),
                Flavor::Sync(..) => unreachable!(),
            }
        }
    }

    // drop_chan of oneshot::Packet
    pub fn drop_chan(&self) {
        match self.state.swap(DISCONNECTED, Ordering::SeqCst) {
            DATA | DISCONNECTED | EMPTY => {}

            // If someone's waiting, we gotta wake them up
            ptr => unsafe {
                SignalToken::cast_from_usize(ptr).signal();
            }
        }
    }

The self.state field already holds DISCONNECTED, so nothing further happens when tmp is destroyed.
That covers clone for Flavor::Oneshot. Now let's see what happens when clone is called again:

    fn clone(&self) -> Sender<T> {
        let packet = match *unsafe { self.inner() } {
            ............
            Flavor::Shared(ref p) => {
                p.clone_chan();
                return Sender::new(Flavor::Shared(p.clone()));
            }
            Flavor::Sync(..) => unreachable!(),
        };
        ............
    }

This time it can only take the Flavor::Shared path, which simply returns a new Sender wrapping Flavor::Shared(..).
Here is the implementation of clone_chan:

    pub fn clone_chan(&self) {
        let old_count = self.channels.fetch_add(1, Ordering::SeqCst);

        // See comments on Arc::clone() on why we do this (for `mem::forget`).
        if old_count > MAX_REFCOUNT {
            unsafe {
                abort();
            }
        }
    }

It merely increments the count of channels associated with the packet.
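The counting step can be sketched as follows. This Packet is a one-field stand-in invented for illustration, and clone_chan here returns the previous count purely so that it can be inspected. The initial value 2 is taken from shared::Packet::new quoted earlier; my reading, which the source does not state, is that it accounts for the original Sender plus its first clone.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical one-field stand-in for shared::Packet.
struct Packet {
    channels: AtomicUsize,
}

impl Packet {
    fn new() -> Self {
        // Same initial value as in shared::Packet::new.
        Packet { channels: AtomicUsize::new(2) }
    }

    /// Bumps the sender count; `fetch_add` returns the value before the add.
    fn clone_chan(&self) -> usize {
        self.channels.fetch_add(1, Ordering::SeqCst)
    }
}
```

The overflow check against MAX_REFCOUNT is omitted here; it only matters for pathological numbers of clones.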
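To sum up, we now have two Senders:
- The original Sender, which is self in the code. Its inner now points to Flavor::Shared.
- The Sender produced by clone. It also points to Flavor::Shared and shares the same shared::Packet with the first one.
We also have two Receivers:
- The original Receiver. Its inner still points to the initial Flavor::Oneshot, which wraps the initial oneshot::Packet.
- The Receiver created inside the Sender.clone() call. It points to Flavor::Shared, and it is stored inside the initial oneshot::Packet.
In other words, the first Receiver leads to the oneshot::Packet, and the oneshot::Packet leads to Flavor::Shared, so the Receiver has everything it needs to upgrade itself.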
However, at the moment all the Sender clone operations have completed, the Receiver has not been upgraded yet. To see when it is, let's look at Receiver's recv function:

    pub fn recv(&self) -> Result<T, RecvError> {
        loop {
            let new_port = match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => {
                    match p.recv(None) {
                        Ok(t) => return Ok(t),
                        Err(oneshot::Disconnected) => return Err(RecvError),
                        Err(oneshot::Upgraded(rx)) => rx,
                        Err(oneshot::Empty) => unreachable!(),
                    }
                }
                Flavor::Stream(ref p) => {
                    match p.recv(None) {
                        Ok(t) => return Ok(t),
                        Err(stream::Disconnected) => return Err(RecvError),
                        Err(stream::Upgraded(rx)) => rx,
                        Err(stream::Empty) => unreachable!(),
                    }
                }
                Flavor::Shared(ref p) => {
                    match p.recv(None) {
                        Ok(t) => return Ok(t),
                        Err(shared::Disconnected) => return Err(RecvError),
                        Err(shared::Empty) => unreachable!(),
                    }
                }
                Flavor::Sync(ref p) => return p.recv(None).map_err(|_| RecvError),
            };
            unsafe {
                mem::swap(self.inner_mut(), new_port.inner_mut());
            }
        }
    }

We only care about the Flavor::Oneshot case. It takes the inner oneshot::Packet as p and calls p.recv(None):

    pub fn recv(&self, deadline: Option<Instant>) -> Result<T, Failure<T>> {
        // Attempt to not block the thread (it's a little expensive). If it looks
        // like we're not empty, then immediately go through to `try_recv`.
        if self.state.load(Ordering::SeqCst) == EMPTY {
            let (wait_token, signal_token) = blocking::tokens();
            let ptr = unsafe { signal_token.cast_to_usize() };

            // race with senders to enter the blocking state
            if self.state.compare_and_swap(EMPTY, ptr, Ordering::SeqCst) == EMPTY {
                if let Some(deadline) = deadline {
                    let timed_out = !wait_token.wait_max_until(deadline);
                    // Try to reset the state
                    if timed_out {
                        self.abort_selection().map_err(Upgraded)?;
                    }
                } else {
                    wait_token.wait();
                    debug_assert!(self.state.load(Ordering::SeqCst) != EMPTY);
                }
            } else {
                // drop the signal token, since we never blocked
                drop(unsafe { SignalToken::cast_from_usize(ptr) });
            }
        }

        self.try_recv()
    }
At this point, thanks to the earlier Sender.clone() call, self.state is already DISCONNECTED, so the EMPTY branch is skipped and we go straight to self.try_recv():

    pub fn try_recv(&self) -> Result<T, Failure<T>> {
        unsafe {
            match self.state.load(Ordering::SeqCst) {
                EMPTY => Err(Empty),
                DATA => {
                    self.state.compare_and_swap(DATA, EMPTY, Ordering::SeqCst);
                    match (&mut *self.data.get()).take() {
                        Some(data) => Ok(data),
                        None => unreachable!(),
                    }
                }
                DISCONNECTED => {
                    match (&mut *self.data.get()).take() {
                        Some(data) => Ok(data),
                        None => {
                            match ptr::replace(self.upgrade.get(), SendUsed) {
                                SendUsed | NothingSent => Err(Disconnected),
                                GoUp(upgrade) => Err(Upgraded(upgrade))
                            }
                        }
                    }
                }

                // We are the sole receiver; there cannot be a blocking
                // receiver already.
                _ => unreachable!()
            }
        }
    }

Clearly this takes the DISCONNECTED path. self.data was initialized to None and nothing has been sent, so take() yields None. The key part is the following code:

    None => {
        match ptr::replace(self.upgrade.get(), SendUsed) {
            SendUsed | NothingSent => Err(Disconnected),
            GoUp(upgrade) => Err(Upgraded(upgrade))
        }
    }
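We replace the contents of self.upgrade with SendUsed and get the previous contents back.
Note that the value we get back is GoUp(upgrade), and upgrade is the very Receiver<T> whose creation puzzled us earlier. It is returned wrapped in the Upgraded variant of Failure:

    pub enum Failure<T> {
        Empty,
        Disconnected,
        Upgraded(Receiver<T>),
    }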
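Parking a value in a slot and taking it out later can be sketched with safe code. The enums below are generic stand-ins modelled on the ones quoted above (the Empty variant is left out), take_upgrade is a name invented for this illustration, and mem::replace is used as the safe counterpart of the ptr::replace call in the source.

```rust
use std::mem;

// Stand-in for MyUpgrade<T>, generic over whatever is parked in the slot.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum MyUpgrade<R> {
    NothingSent,
    SendUsed,
    GoUp(R),
}

// Stand-in for the relevant part of Failure<T>.
#[derive(Debug, PartialEq)]
enum Failure<R> {
    Disconnected,
    Upgraded(R),
}

/// Leaves `SendUsed` in the slot and reports what was there before.
fn take_upgrade<R>(slot: &mut MyUpgrade<R>) -> Failure<R> {
    match mem::replace(slot, MyUpgrade::SendUsed) {
        MyUpgrade::SendUsed | MyUpgrade::NothingSent => Failure::Disconnected,
        MyUpgrade::GoUp(rx) => Failure::Upgraded(rx),
    }
}
```

The first call hands out the parked value; any later call finds SendUsed in the slot and reports Disconnected.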
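The Upgraded value propagates all the way back up to Receiver.recv():

    pub fn recv(&self) -> Result<T, RecvError> {
        loop {
            let new_port = match *unsafe { self.inner() } {
                Flavor::Oneshot(ref p) => {
                    match p.recv(None) {
                        Ok(t) => return Ok(t),
                        Err(oneshot::Disconnected) => return Err(RecvError),
                        Err(oneshot::Upgraded(rx)) => rx,
                        Err(oneshot::Empty) => unreachable!(),
                    }
                }
                ............
            };
            unsafe {
                mem::swap(self.inner_mut(), new_port.inner_mut());
            }
        }
    }

Matching on Err(oneshot::Upgraded(rx)) yields rx, the Receiver created during clone. rx becomes new_port, and the same mem::swap trick then replaces the Receiver's inner Flavor<T> with the one carried by rx.
The Receiver has thus been upgraded to Flavor::Shared(Arc<shared::Packet<T>>), and the next iteration of the loop receives from the shared packet.
With that, the Sender/Receiver pair has been upgraded from a channel that holds a single element to an MPSC channel of unbounded capacity.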
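From the outside, none of this machinery is visible. The sketch below uses only the public API; send_across_upgrade is a name invented for this example. It sends one value before the clone and two values after it, all from the same thread, and the receiver sees them in the order they were sent.

```rust
use std::sync::mpsc;

/// Hypothetical helper: sends one value before cloning the Sender and two
/// after, then collects everything the Receiver yields.
fn send_across_upgrade() -> Vec<i32> {
    let (tx, rx) = mpsc::channel();
    tx.send(1).unwrap(); // sent before any clone exists
    let tx2 = tx.clone(); // the point where the walkthrough's upgrade happens
    tx2.send(2).unwrap();
    tx.send(3).unwrap();
    // Drop both Senders so that `iter()` terminates.
    drop(tx);
    drop(tx2);
    let received: Vec<i32> = rx.iter().collect();
    received
}
```

The caller never needs to know which flavor sits behind the handles; the upgrade is purely an internal optimization.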