原文出处:https://bohutang.me/2020/06/08/clickhouse-and-friends-mysql-protocol-write-stack/
上篇的MySQL Protocol和Read调用里介绍了 ClickHouse 一条查询语句的调用栈,本文继续介绍写的调用栈,开整。
Write请求
建表:
mysql> CREATE TABLE test(a UInt8, b UInt8, c UInt8) ENGINE=MergeTree() PARTITION BY (a, b) ORDER BY c;Query OK, 0 rows affected (0.03 sec)
写入数据:
INSERT INTO test VALUES(1,1,1), (2,2,2);
调用栈分析
1. 获取存储引擎 OutputStream
DB::StorageMergeTree::write(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&) StorageMergeTree.cpp:174DB::PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(std::__1::shared_ptr<DB::IStorage> const&, DB::Context const&, std::__1::shared_ptr<DB::IAST> const&, bool) PushingToViewsBlockOutputStream.cpp:110DB::InterpreterInsertQuery::execute() InterpreterInsertQuery.cpp:229DB::executeQueryImpl(const char *, const char *, DB::Context &, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer *) executeQuery.cpp:364DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) executeQuery.cpp:696DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311DB::MySQLHandler::run() MySQLHandler.cpp:141
2. 从 SQL 组装 InputStream
(1,1,1), (2,2,2)
如何组装成 inputstream 结构呢?
DB::InputStreamFromASTInsertQuery::InputStreamFromASTInsertQuery(std::__1::shared_ptr<DB::IAST> const&, DB::ReadBuffer*, DB::InterpreterInsertQuery::execute() InterpreterInsertQuery.cpp:300DB::executeQueryImpl(char const*, char const*, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*) executeQuery.cpp:386DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:313DB::MySQLHandler::run() MySQLHandler.cpp:150
然后
res.in = std::make_shared<InputStreamFromASTInsertQuery>(query_ptr, nullptr, query_sample_block, context, nullptr);res.in = std::make_shared<NullAndDoCopyBlockInputStream>(res.in, out_streams.at(0));
通过 NullAndDoCopyBlockInputStream的 copyData 方法构造出 Block:
DB::ValuesBlockInputFormat::readRow(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, unsigned long) ValuesBlockInputFormat.cpp:93DB::ValuesBlockInputFormat::generate() ValuesBlockInputFormat.cpp:55DB::ISource::work() ISource.cpp:48DB::InputStreamFromInputFormat::readImpl() InputStreamFromInputFormat.h:48DB::IBlockInputStream::read() IBlockInputStream.cpp:57DB::InputStreamFromASTInsertQuery::readImpl() InputStreamFromASTInsertQuery.h:31DB::IBlockInputStream::read() IBlockInputStream.cpp:57void DB::copyDataImpl<DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*)::$_0&, void (&)(DB::Block const&)>(DB::IBlockInputStream&, DB::IBlockOutputStream&, DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*)::$_0&, void (&)(DB::Block const&)) copyData.cpp:26DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*) copyData.cpp:62DB::NullAndDoCopyBlockInputStream::readImpl() NullAndDoCopyBlockInputStream.h:47DB::IBlockInputStream::read() IBlockInputStream.cpp:57void DB::copyDataImpl<std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) copyData.cpp:26DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) copyData.cpp:73DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) executeQuery.cpp:785DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:313DB::MySQLHandler::run() MySQLHandler.cpp:150
3. 组装 OutputStream
DB::InterpreterInsertQuery::execute() InterpreterInsertQuery.cpp:107DB::executeQueryImpl(const char *, const char *, DB::Context &, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer *) executeQuery.cpp:364DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) executeQuery.cpp:696DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311DB::MySQLHandler::run() MySQLHandler.cpp:141
组装顺序:
NullAndDoCopyBlockInputStream
CountingBlockOutputStream
AddingDefaultBlockOutputStream
SquashingBlockOutputStream
PushingToViewsBlockOutputStream
MergeTreeBlockOutputStream
4. 写入OutputStream
DB::MergeTreeBlockOutputStream::write(DB::Block const&) MergeTreeBlockOutputStream.cpp:17DB::PushingToViewsBlockOutputStream::write(DB::Block const&) PushingToViewsBlockOutputStream.cpp:145DB::SquashingBlockOutputStream::finalize() SquashingBlockOutputStream.cpp:30DB::SquashingBlockOutputStream::writeSuffix() SquashingBlockOutputStream.cpp:50DB::AddingDefaultBlockOutputStream::writeSuffix() AddingDefaultBlockOutputStream.cpp:25DB::CountingBlockOutputStream::writeSuffix() CountingBlockOutputStream.h:37DB::copyDataImpl<DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*)::<lambda()>&, void (&)(const DB::Block&)>(DB::IBlockInputStream &, DB::IBlockOutputStream &, <lambda()> &, void (&)(const DB::Block &)) copyData.cpp:52DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*) copyData.cpp:138DB::NullAndDoCopyBlockInputStream::readImpl() NullAndDoCopyBlockInputStream.h:57DB::IBlockInputStream::read() IBlockInputStream.cpp:60void DB::copyDataImpl<std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) copyData.cpp:29DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) copyData.cpp:154DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) executeQuery.cpp:748DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311DB::MySQLHandler::run() MySQLHandler.cpp:141
通过 copyData 方法,让数据在 OutputStream 间层层透传,一直到 MergeTreeBlockOutputStream。
5. 返回 Client
DB::MySQLOutputFormat::finalize() MySQLOutputFormat.cpp:62DB::IOutputFormat::doWriteSuffix() IOutputFormat.h:78DB::OutputStreamToOutputFormat::writeSuffix() OutputStreamToOutputFormat.cpp:18DB::MaterializingBlockOutputStream::writeSuffix() MaterializingBlockOutputStream.h:22void DB::copyDataImpl<std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) copyData.cpp:52DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) copyData.cpp:154DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) executeQuery.cpp:748DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311DB::MySQLHandler::run() MySQLHandler.cpp:141
总结
INSERT INTO test VALUES(1,1,1), (2,2,2);
首先内核解析 SQL 语句生成 AST,根据 AST 获取 Interpreter:InterpreterInsertQuery。其次 Interpreter 依次添加相应的 OutputStream。然后从 InputStream 读取数据,写入到 OutputStream,stream 会层层渗透,一直写到底层的存储引擎。最后写入到 Socket Output,返回结果。
ClickHouse 的 OutputStream 编排还是比较复杂,缺少类似 Pipeline 的调度和编排,但是由于模式比较固化,目前看还算清晰。
文内链接
全文完。
Enjoy ClickHouse :)
叶老师的「MySQL核心优化」大课已升级到MySQL 8.0,扫码开启MySQL 8.0修行之旅吧
本文分享自微信公众号 - 老叶茶馆(iMySQL_WX)。
如有侵权,请联系 support@oschina.cn 删除。
本文参与“OSC源创计划”,欢迎正在阅读的你也加入,一起分享。