ClickHouse系列1--入门

1.配置文件

服务器配置文件位于/etc/clickhouse-server/config.xml中的<path>元素,它决定了数据存储的位置,因此它应该位于磁盘容量的卷上;默认值是/var/lib/clickhouse/。如果你想调整配置,直接编辑config是不方便的。考虑到它可能会在将来的包更新中被重写。建议重写配置元素的方法是在配置中创建config.d文件夹,作为config.xml的重写方式。

2.启动

clickhouse-server安装后不会自动启动,可以如下方式:

sudo service clickhouse-server start
sudo /etc/init.d/clickhouse-server start

服务端日志的默认位置是/var/log/clickhouse-server/。当服务端在日志中记录Ready for connections消息,即表示服务端已准备好处理客户端连接。

3.客户端连接

交互模式:

clickhouse-client
clickhouse-client --host=... --port=... --user=... --password=...

启用多行查询:

clickhouse-client -m
clickhouse-client --multiline

以批处理模式运行查询:

clickhouse-client --query='SELECT 1'
echo 'SELECT 1' | clickhouse-client
clickhouse-client <<< 'SELECT 1'

从指定格式的文件中插入数据:

clickhouse-client --query='INSERT INTO table VALUES' < data.txt
clickhouse-client --query='INSERT INTO table FORMAT TabSeparated' < data.tsv

4.创建数据库

与大多数数据库管理系统一样,ClickHouse在逻辑上将表分组为数据库。包含一个default数据库,但我们可以创建自己的数据库tutorial:

clickhouse-client --query "CREATE DATABASE IF NOT EXISTS tutorial"

5.创建表

与创建数据库相比,创建表的语法要复杂得多。一般CREATE TABLE声明必须指定三个关键的事情:

  1. 要创建的表的名称;

  2. 表结构,例如:列名和对应的数据类型;

  3. 表引擎及其设置。这决定了对此表的查询操作是如何在物理层面执行的所有细节。

    CREATE TABLE tutorial.hits_v1
    (
        `WatchID` UInt64,
        `JavaEnable` UInt8,
        `Title` String,
        `GoodEvent` Int16,
        `EventTime` DateTime,
        `EventDate` Date,
        `CounterID` UInt32,
        `ClientIP` UInt32,
        `ClientIP6` FixedString(16),
        `RegionID` UInt32,
        `UserID` UInt64,
        `URLHash` UInt64,
        `CLID` UInt32,
        `YCLID` UInt64,
        `ShareService` String,
        `ShareURL` String,
        `ShareTitle` String,
        `ParsedParams` Nested(
            Key1 String,
            Key2 String,
            Key3 String,
            Key4 String,
            Key5 String,
            ValueDouble Float64),
        `IslandID` FixedString(16),
        `RequestNum` UInt32,
        `RequestTry` UInt8
    )
    ENGINE = MergeTree()
    PARTITION BY toYYYYMM(EventDate)
    ORDER BY (CounterID, EventDate, intHash32(UserID))
    SAMPLE BY intHash32(UserID)
    

6.查询配置项的含义

我们可能需要调整一些参数,如倒入数据时,设置最大的插入条数:

clickhouse-client --query "INSERT INTO tutorial.hits_v1 FORMAT TSV" --max_insert_block_size=100000 < hits_v1.tsv

找出可用的设置、含义及其默认值的最简单方法是查询system.settings 表:

SELECT name, value, changed, description
FROM system.settings
WHERE name LIKE '%max_insert_b%'
FORMAT TSV

max_insert_block_size    1048576    0    "The maximum block size for insertion, if we control the creation of blocks for insertion."

7.查询示例

clickhouse的查询还是比较像mysql的,例如:

SELECT
    StartURL AS URL,
    AVG(Duration) AS AvgDuration
FROM tutorial.visits_v1
WHERE StartDate BETWEEN '2014-03-23' AND '2014-03-30'
GROUP BY URL
ORDER BY AvgDuration DESC
LIMIT 10
SELECT
    sum(Sign) AS visits,
    sumIf(Sign, has(Goals.ID, 1105530)) AS goal_visits,
    (100. * goal_visits) / visits AS goal_percent
FROM tutorial.visits_v1
WHERE (CounterID = 912887) AND (toYYYYMM(StartDate) = 201403) AND (domain(StartURL) = 'yandex.ru')
相关推荐
©️2020 CSDN 皮肤主题: 鲸 设计师:meimeiellie 返回首页