Apache Sentry in Practice (Part 1): Integrating Impala with Sentry

Stella981

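By default, Impala runs its services and executes DML and DDL operations as the impala superuser. Fine-grained access control between different users requires integrating it with Sentry. Sentry is an open-source Apache project that implements access control on top of an RBAC authorization model. Once Impala is integrated with it, authorization is enforced per user at the application layer, which controls the DML, DDL, and DCL operations each user may run. To keep data secure, Sentry provides a unified platform that can rely on the existing Hadoop Kerberos setup for authentication, and the same Sentry protocol applies whether data is accessed through Hive or Impala. This article gives a brief introduction to Sentry and demonstrates what an Impala+Sentry integration looks like in practice.

Introduction to Sentry

Apache Sentry is an open-source Hadoop component for access control released by Cloudera. It graduated from the Apache Incubator in March 2016 and became an Apache top-level project. It implements fine-grained access control on top of an RBAC authorization model. Sentry can currently be integrated with Apache Hive, Hive Metastore/HCatalog, Apache Solr, Impala, and HDFS (limited to Hive table data). The following diagram gives an overview of Sentry and its integration with other Hadoop components:

[Figure: overview of Sentry and its integration with other Hadoop components]

The members of this overview diagram fall into two parts according to their role: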

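1. Sentry service components:

  • Sentry Server: the service layer. It is built on an RPC protocol and is mainly responsible for managing privilege data, exposing secure RPC interfaces for querying and storing metadata.
  • Data Engine: the data engine layer. It has two responsibilities: loading the Sentry plugin, and intercepting every client request for a resource (for example from Hive or Impala) and forwarding it to the Sentry Plugin for an authorization check.
  • Sentry Plugin: the authorization layer. This is the core component of Sentry authorization. It decides whether the privilege information obtained from the data engine layer matches the privilege information stored by the service layer.
  • Policy Metadata: the storage layer. It stores the privilege data. Sentry supports both ini files and a relational database for this. An ini file can live on a local path or in HDFS, but the file-based approach is prone to resource contention when programs modify it and is hard to maintain. With a relational database, Sentry persists the privilege information to the database and gives the application layer an API to create, query, update, and delete it. Sentry can use many backend databases, such as MySQL and Postgres, and relies on the ORM library DataNucleus for persistence.

2. Sentry consumer components:

Components such as Impala, Hive, and Solr make up the Sentry consumer components. In Sentry, each of them calls the Sentry service as a client.

Put simply, Sentry offers its service in a C/S-like architecture: every component that uses Sentry can be seen as a Sentry client that talks to the Sentry Server over RPC. Once Sentry is in use, the privileges these clients manage through grant/revoke are taken over entirely by Sentry, and grant/revoke statements are executed entirely inside Sentry. The authorization data of all engines is also stored in one database configured by Sentry, so privileges for all engines are managed centrally.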

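Sentry authorization involves the following concepts:

  • Resource. A Server, Database, Table, or URL (for example an HDFS or local path). Sentry 1.5 also supports authorization on columns.
  • Privilege. A rule that grants access to a resource.
  • Role. A collection of privileges.
  • User and group. A group is a collection of users. Sentry's group mapping is extensible; by default Sentry uses the Hadoop group mapping (operating system groups or LDAP groups), so users are associated with groups through that mapping. Sentry cannot grant privileges directly to a user or a group. Privileges are granted to a role, and a role can be granted to a group but not to an individual user.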
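The relationship between users, groups, roles, and privileges described above can be sketched in a few lines of Python. This is only a toy model for illustration, not Sentry's implementation: the `Privilege` and `PolicyStore` classes and the rules that a parent resource covers its children and that ALL covers every action are simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Privilege:
    """An action on a resource path such as ('sentryserver', 'test', 'test')."""
    action: str      # e.g. "ALL", "SELECT", "INSERT"
    resource: tuple  # (server,), (server, db) or (server, db, table)

    def allows(self, action, resource):
        # Assumption: a privilege on a parent resource covers everything below it.
        covers = resource[:len(self.resource)] == self.resource
        return covers and self.action in ("ALL", action)


@dataclass
class PolicyStore:
    """Hypothetical stand-in for the policy metadata kept by Sentry Server."""
    group_members: dict = field(default_factory=dict)    # group -> set of users
    group_roles: dict = field(default_factory=dict)      # group -> set of roles
    role_privileges: dict = field(default_factory=dict)  # role -> set of Privilege

    def grant_privilege(self, role, privilege):
        self.role_privileges.setdefault(role, set()).add(privilege)

    def grant_role(self, role, group):
        # Roles go to groups only; there is deliberately no user-level grant.
        self.group_roles.setdefault(group, set()).add(role)

    def is_authorized(self, user, action, resource):
        # Walk user -> groups -> roles -> privileges.
        for group, members in self.group_members.items():
            if user not in members:
                continue
            for role in self.group_roles.get(group, ()):
                for privilege in self.role_privileges.get(role, ()):
                    if privilege.allows(action, resource):
                        return True
        return False


store = PolicyStore(group_members={"hadoop": {"hadoop"}, "root": {"root"}})
store.grant_privilege("admin_role", Privilege("ALL", ("sentryserver",)))
store.grant_role("admin_role", "hadoop")

table = ("sentryserver", "test", "test")
print(store.is_authorized("hadoop", "SELECT", table))  # True
print(store.is_authorized("root", "SELECT", table))    # False
```

The group names and the `admin_role` role mirror the ones used later in this article; everything else is made up for the sketch.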

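Installing Sentry Server

Environment

Sentry version: 1.5.1-cdh5.16.1

JDK version: jdk1.8.0_212

Maven version: apache-maven-3.6.1

Impala version: 2.12.0-cdh5.16.1

Hadoop version: hadoop-2.6.0-cdh5.16.1

Building and installing Sentry Server

Next, build the Sentry installation package with Maven. The steps are as follows:

1. Download the source code: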

git clone https://github.com/cloudera/sentry.git

Switch to the tag for 1.5.1-cdh5.16.1:

git checkout -b cdh5.16.1-release cdh5.16.1-release

Source tree layout:

[Figure: Sentry source tree layout]

2. Build and package:

mvn -Dmaven.test.skip=true clean package

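After the build finishes, the generated Sentry installation package is in the directory highlighted in the figure below:

[Figure: location of the generated Sentry installation package in the build output]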

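3. Set the environment variables:

Extract the Sentry tarball to a target directory, download and extract hadoop-2.6.0-cdh5.16.1.tar.gz as well, then edit /etc/profile and set the Hadoop and Sentry environment variables: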

export HADOOP_HOME=/data/sentry/hadoop-2.6.0-cdh5.16.1
export HADOOP_LIBEXEC_DIR=${HADOOP_HOME}/libexec
export SENTRY_HOME=/data/sentry/apache-sentry-1.5.1-cdh5.16.1-bin
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_LIBEXEC_DIR:$SENTRY_HOME/bin:$PATH

4. Configure sentry-site.xml

Go to the conf folder under the Sentry extraction directory and edit the sentry-site.xml configuration file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->

<configuration>
   <property>
      <name>sentry.service.server.rpc-address</name>
      <value>hadoop21-test1-rgtj5-tj1</value>
  </property>

  <property>
      <name>sentry.service.server.rpc-port</name>
      <value>8038</value>
  </property>

  <property>
      <name>sentry.service.admin.group</name>
      <value>hadoop</value>
  </property>

  <property>
      <name>sentry.service.allow.connect</name>
      <value>hadoop</value>
  </property>

  <property>
      <name>sentry.store.group.mapping</name>
      <value>org.apache.sentry.provider.common.HadoopGroupMappingService</value>
  </property>
      
  <property>
      <name>sentry.service.reporting</name>
      <value>JMX</value>
  </property>

  <property>
      <name>sentry.service.web.enable</name>
      <value>true</value>
  </property>

  <property> 
      <name>sentry.service.web.port</name>  
      <value>51000</value> 
  </property>  

  <property> 
      <name>sentry.service.web.authentication.type</name>  
      <value>NONE</value> 
  </property> 
    
  <property>
      <name>sentry.verify.schema.version</name>
      <value>true</value>  
  </property>

  <property>
    <name>sentry.service.security.mode</name>
    <value>none</value>
  </property>

  <property>
    <name>sentry.store.jdbc.url</name>
    <value>jdbc:mysql://localhost:3306/sentry_test?useSSL=false</value>
  </property>

  <property>
      <name>sentry.store.jdbc.driver</name>
      <value>com.mysql.jdbc.Driver</value>
  </property>

  <property>
      <name>sentry.store.jdbc.user</name>
      <value>root</value>
  </property>

  <property>
      <name>sentry.store.jdbc.password</name>
      <value>123456</value>
  </property>
</configuration>
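Before starting the server, it can help to confirm that the file actually defines the properties you expect. The sketch below parses a Hadoop-style configuration document with Python's standard library and reports missing entries. The `REQUIRED` list is a hypothetical checklist taken from the properties used in this article, not an official list of mandatory Sentry settings.

```python
import xml.etree.ElementTree as ET


def load_site_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def missing_keys(props, required):
    """Return the required property names that are absent or empty."""
    return [key for key in required if not props.get(key)]


# Hypothetical checklist drawn from the properties used in this article.
REQUIRED = [
    "sentry.service.server.rpc-address",
    "sentry.service.server.rpc-port",
    "sentry.store.jdbc.url",
    "sentry.store.jdbc.driver",
]

# A deliberately incomplete sample to show what the check reports.
sample = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>sentry.service.server.rpc-port</name>
    <value>8038</value>
  </property>
  <property>
    <name>sentry.store.jdbc.url</name>
    <value>jdbc:mysql://localhost:3306/sentry_test?useSSL=false</value>
  </property>
</configuration>
"""

print(missing_keys(load_site_properties(sample), REQUIRED))
# ['sentry.service.server.rpc-address', 'sentry.store.jdbc.driver']
```

To check a real file, read its contents and pass the text to `load_site_properties`.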

5. Create the MySQL database:

CREATE DATABASE `sentry_test` /*!40100 DEFAULT CHARACTER SET utf8 */;

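6. Initialize the Sentry database tables:

Put mysql-connector-java-5.1.47.jar into the lib folder under the Sentry extraction directory, then run the following command to create the Sentry database tables: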

sentry --command schema-tool --conffile  ${SENTRY_HOME}/conf/sentry-site.xml --dbType mysql --initSchema

Output like the following means that the database connection and the table initialization succeeded:

Sentry store connection URL:     jdbc:mysql://localhost:3306/sentry_test?useSSL=false
Sentry store Connection Driver :         com.mysql.jdbc.Driver
Sentry store connection User:    root
Starting sentry store schema initialization to 1.5.0-cdh5-2
Initialization script sentry-mysql-1.5.0-cdh5-2.sql
Connecting to jdbc:mysql://localhost:3306/sentry_test?useSSL=false
Connected to: MySQL (version 5.6.24-72.2-log)
Driver: MySQL Connector Java (version mysql-connector-java-5.1.47 ( Revision: fe1903b1ecb4a96a917f7ed3190d80c049b1de29 ))
Transaction isolation: TRANSACTION_REPEATABLE_READ
Autocommit status: true
No rows affected (0.006 seconds)
No rows affected (0.001 seconds)
No rows affected (0.002 seconds)
No rows affected (0.001 seconds)
No rows affected (0.001 seconds)
No rows affected (0.001 seconds)
No rows affected (0.004 seconds)
No rows affected (0.001 seconds)
No rows affected (0.001 seconds)
No rows affected (0.001 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.003 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.016 seconds)
No rows affected (0.007 seconds)
No rows affected (0.006 seconds)
No rows affected (0.007 seconds)
No rows affected (0.006 seconds)
No rows affected (0.005 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.005 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.006 seconds)
No rows affected (0.006 seconds)
No rows affected (0.003 seconds)
No rows affected (0.005 seconds)
No rows affected (0.002 seconds)
No rows affected (0.004 seconds)
1 row affected (0.002 seconds)
No rows affected (0.003 seconds)
No rows affected (0.007 seconds)
No rows affected (0.005 seconds)
No rows affected (0.004 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
No rows affected (0.006 seconds)
No rows affected (0.005 seconds)
No rows affected (0.002 seconds)
No rows affected (0.006 seconds)
No rows affected (0.002 seconds)
No rows affected (0.004 seconds)
No rows affected (0.003 seconds)
No rows affected (0.004 seconds)
No rows affected (0.005 seconds)
No rows affected (0.003 seconds)
No rows affected (0.006 seconds)
No rows affected (0.006 seconds)
No rows affected (0.006 seconds)
No rows affected (0.003 seconds)
No rows affected (0.003 seconds)
No rows affected (0.006 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.002 seconds)
No rows affected (0.006 seconds)
No rows affected (0.005 seconds)
No rows affected (0.004 seconds)
No rows affected (0.006 seconds)
No rows affected (0.002 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
Closing: 0: jdbc:mysql://localhost:3306/sentry_test?useSSL=false
Initialization script completed
Sentry schemaTool completed

7. Run the sentry command to start the Sentry server:

nohup sentry --command service --conffile ${SENTRY_HOME}/conf/sentry-site.xml > sentry.out 2>&1 &

Open the following address in a browser to reach the Sentry Web UI and confirm that the installation succeeded:

http://localhost:51000/

The Web UI looks like this:

[Figure: Sentry Web UI]

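Integrating Impala with Sentry

1. Add the Sentry dependencies

Copy the relevant jars from the apache-sentry-1.5.1-cdh5.16.1-bin/lib directory to /usr/lib/impala/lib, or create symlinks to the Sentry jars with the following script: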

#!/bin/bash
SENTRY_HOME=/data/impala/apache-sentry-1.5.1-cdh5.16.1-bin

sudo rm -rf /usr/lib/impala/lib/sentry-*.jar

sudo ln -s $SENTRY_HOME/lib/sentry-binding-hive-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-binding-hive.jar
sudo ln -s $SENTRY_HOME/lib/sentry-core-common-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-core-common.jar
sudo ln -s $SENTRY_HOME/lib/sentry-core-model-db-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-core-model-db.jar
sudo ln -s $SENTRY_HOME/lib/sentry-core-model-kafka-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-core-model-kafka.jar
sudo ln -s $SENTRY_HOME/lib/sentry-core-model-search-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-core-model-search.jar
sudo ln -s $SENTRY_HOME/lib/sentry-policy-common-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-policy-common.jar
sudo ln -s $SENTRY_HOME/lib/sentry-policy-db-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-policy-db.jar
sudo ln -s $SENTRY_HOME/lib/sentry-policy-kafka-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-policy-kafka.jar
sudo ln -s $SENTRY_HOME/lib/sentry-policy-search-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-policy-search.jar
sudo ln -s $SENTRY_HOME/lib/sentry-provider-cache-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-provider-cache.jar
sudo ln -s $SENTRY_HOME/lib/sentry-provider-common-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-provider-common.jar
sudo ln -s $SENTRY_HOME/lib/sentry-provider-db-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-provider-db-sh.jar
sudo ln -s $SENTRY_HOME/lib/sentry-provider-file-1.5.1-cdh5.16.1.jar /usr/lib/impala/lib/sentry-provider-file.jar

The resulting Sentry jar dependencies look like this:

lrwxrwxrwx 1 root root       90 Jul  6 11:00 sentry-binding-hive.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-binding-hive-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       89 Jul  6 11:00 sentry-core-common.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-core-common-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       91 Jul  6 11:00 sentry-core-model-db.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-core-model-db-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       94 Jul  6 11:00 sentry-core-model-kafka.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-core-model-kafka-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       95 Jul  6 11:00 sentry-core-model-search.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-core-model-search-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       91 Jul  6 11:00 sentry-policy-common.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-policy-common-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       87 Jul  6 11:00 sentry-policy-db.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-policy-db-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       90 Jul  6 11:00 sentry-policy-kafka.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-policy-kafka-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       91 Jul  6 11:00 sentry-policy-search.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-policy-search-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       92 Jul  6 11:00 sentry-provider-cache.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-provider-cache-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       93 Jul  6 11:00 sentry-provider-common.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-provider-common-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       89 Jul  6 11:00 sentry-provider-db-sh.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-provider-db-1.5.1-cdh5.16.1.jar
lrwxrwxrwx 1 root root       91 Jul  6 11:00 sentry-provider-file.jar -> /data/impala/apache-sentry-1.5.1-cdh5.16.1-bin/lib/sentry-provider-file-1.5.1-cdh5.16.1.jar

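Note: the CDH versions of Sentry and Impala must match. For example, the Impala used here comes from CDH 5.16.1, so Sentry must come from CDH 5.16.1 as well. Otherwise mismatched jar versions cause exceptions while Impala starts, for example:

java.lang.NoClassDefFoundError: org/apache/sentry/provider/cache/SentryPrivilegeCache

If you do not know the versions of the external components that Impala depends on, look them up in Impala/bin/impala-config.sh in the Impala source. For Impala cdh5-2.12.0_5.16.1, that file defines the dependencies as follows: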

# Versions of Hadoop ecosystem dependencies.
# ------------------------------------------
export CDH_MAJOR_VERSION=5
export IMPALA_HADOOP_VERSION=2.6.0-cdh5.16.1
unset IMPALA_HADOOP_URL
export IMPALA_HBASE_VERSION=1.2.0-cdh5.16.1
unset IMPALA_HBASE_URL
export IMPALA_HIVE_VERSION=1.1.0-cdh5.16.1
unset IMPALA_HIVE_URL
export IMPALA_SENTRY_VERSION=1.5.1-cdh5.16.1
unset IMPALA_SENTRY_URL
export IMPALA_PARQUET_VERSION=1.5.0-cdh5.16.1
export IMPALA_LLAMA_MINIKDC_VERSION=1.0.0
unset IMPALA_LLAMA_MINIKDC_URL
export IMPALA_KITE_VERSION=1.0.0-cdh5.16.1
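Since a version mismatch only shows up as an exception at startup, a quick pre-flight check of the jar names can save a restart cycle. The following is a minimal sketch, assuming the jars (or the targets of the symlinks) follow the `sentry-<artifact>-<version>.jar` naming seen above; the helper name `mismatched_jars` is made up for this example.

```python
import re
import tempfile
from pathlib import Path

# Matches names such as sentry-core-common-1.5.1-cdh5.16.1.jar
JAR_PATTERN = re.compile(r"^(?P<artifact>sentry-[a-z-]+?)-(?P<version>\d[\w.\-]*)\.jar$")


def mismatched_jars(lib_dir, expected_version):
    """Return sentry jar names whose embedded version differs from expected_version."""
    bad = []
    for path in sorted(Path(lib_dir).glob("sentry-*.jar")):
        # Follow symlinks so that unversioned link names resolve to the real jar.
        target_name = path.resolve().name
        match = JAR_PATTERN.match(target_name)
        if match and match.group("version") != expected_version:
            bad.append(target_name)
    return bad


# Demo on a throwaway directory with one matching and one stale jar.
with tempfile.TemporaryDirectory() as lib:
    for name in ("sentry-core-common-1.5.1-cdh5.16.1.jar",
                 "sentry-provider-db-1.5.1-cdh5.15.0.jar"):
        Path(lib, name).touch()
    print(mismatched_jars(lib, "1.5.1-cdh5.16.1"))
    # ['sentry-provider-db-1.5.1-cdh5.15.0.jar']
```

Pointed at /usr/lib/impala/lib with the value of IMPALA_SENTRY_VERSION, an empty list would suggest that the jar names are consistent. It says nothing about the jar contents.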

2. Create sentry-site.xml

Copy the sentry-site.xml.service.template file from the apache-sentry-1.5.1-cdh5.16.1-bin/conf directory to /etc/impala/conf:

# Copy
cp apache-sentry-1.5.1-cdh5.16.1-bin/conf/sentry-site.xml.service.template /etc/impala/conf/
# Rename
cd /etc/impala/conf/
mv sentry-site.xml.service.template sentry-site.xml

Edit sentry-site.xml so that it contains the following:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->

<!-- WARNING!!! This file is provided for documentation purposes ONLY!              -->
<!-- WARNING!!! You should copy to sentry-site.xml and make modification instead.   -->

<configuration>

  <!-- Sentry Server port -->
  <property>
     <name>sentry.service.client.server.rpc-port</name>
     <value>8038</value>
  </property>

  <!-- Sentry Server host address -->
  <property>
     <name>sentry.service.client.server.rpc-addresses</name>
     <value>hadoop21-test1-rgtj5-tj1</value>
  </property>

  <!-- Timeout for client connections to Sentry Server, in milliseconds; the default is 200000 ms -->
  <property>
     <name>sentry.service.client.server.rpc-connection-timeout</name>
     <value>200000</value>
  </property>

  <!-- Privilege storage backend: database or ini configuration file -->
  <property>
    <name>sentry.hive.provider.backend</name>
    <value>org.apache.sentry.provider.db.SimpleDBProviderBackend</value>
  </property>

  <!-- Authentication mode; Kerberos is supported, and none disables authentication -->
  <property>
     <name>sentry.service.security.mode</name>
     <value>none</value>
  </property>

</configuration>

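3. Enable authorization

Edit the /etc/default/impala configuration file and change the following two settings to enable Sentry authorization:

  • In IMPALA_CATALOG_ARGS, add -sentry_config=/etc/impala/conf/sentry-site.xml
  • In IMPALA_SERVER_ARGS, add -sentry_config=/etc/impala/conf/sentry-site.xml and -server_name=sentryserver

The final contents of the configuration file are as follows: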

IMPALA_CATALOG_SERVICE_HOST=hadoop21-test1-rgtj5-tj1
IMPALA_STATE_STORE_HOST=hadoop21-test1-rgtj5-tj1
IMPALA_STATE_STORE_PORT=24000
IMPALA_BACKEND_PORT=22000
IMPALA_LOG_DIR=/data/log/impala

IMPALA_CATALOG_ARGS=" -log_dir=${IMPALA_LOG_DIR} -sentry_config=/etc/impala/conf/sentry-site.xml"
IMPALA_STATE_STORE_ARGS=" -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT}"
IMPALA_SERVER_ARGS=" \
    -log_dir=${IMPALA_LOG_DIR} \
    -catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
    -state_store_port=${IMPALA_STATE_STORE_PORT} \
    -use_statestore \
    -state_store_host=${IMPALA_STATE_STORE_HOST} \
    -be_port=${IMPALA_BACKEND_PORT} \
    -kudu_master_hosts=hadoop21-test1-rgtj5-tj1:7051,hadoop20-test1-rgtj5-tj1:7051,hadoop22-test1-rgtj5-tj1:7051,hadoop-bi06-test1-rgtj5-tj1:7051,hadoop-bi07-test1-rgtj5-tj1:7051 \
    -sentry_config=/etc/impala/conf/sentry-site.xml \
    -server_name=sentryserver"

ENABLE_CORE_DUMPS=true

# LIBHDFS_OPTS=-Djava.library.path=/usr/lib/impala/lib
# MYSQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
# IMPALA_BIN=/usr/lib/impala/sbin
# IMPALA_HOME=/usr/lib/impala
# HIVE_HOME=/usr/lib/hive
# HBASE_HOME=/usr/lib/hbase
# IMPALA_CONF_DIR=/etc/impala/conf
# HADOOP_CONF_DIR=/etc/impala/conf
# HIVE_CONF_DIR=/etc/impala/conf
# HBASE_CONF_DIR=/etc/impala/conf
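A missing flag in this file is easy to overlook, so a small check of the argument string can be useful before restarting. The sketch below assumes it receives the already expanded value of IMPALA_SERVER_ARGS (that is, after the shell has resolved the variables and line continuations); the helper names are hypothetical.

```python
import shlex


def parse_flags(args):
    """Split an impalad-style argument string into a {flag: value} dict."""
    flags = {}
    for token in shlex.split(args):
        # "-name=value" becomes ("name", "value"); bare flags get an empty value.
        name, _, value = token.lstrip("-").partition("=")
        flags[name] = value
    return flags


def missing_flags(args, required=("sentry_config", "server_name")):
    """Return the required flags that are absent or have no value."""
    flags = parse_flags(args)
    return [name for name in required if not flags.get(name)]


server_args = """
    -log_dir=/data/log/impala
    -use_statestore
    -sentry_config=/etc/impala/conf/sentry-site.xml
    -server_name=sentryserver
"""

print(missing_flags(server_args))                  # []
print(missing_flags("-log_dir=/data/log/impala"))  # ['sentry_config', 'server_name']
```

For IMPALA_CATALOG_ARGS, the same function can be called with `required=("sentry_config",)`.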

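4. Restart the Impala services and verify authorization

Restart the Impala services:

sudo service impala-state-store restart
sudo service impala-catalog restart
sudo service impala-server restart

Open impala-shell and verify that the authorization configuration works, as follows:

(1) Switch to the hadoop user, open impala-shell, and create an admin role: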

[hadoop21-test1-rgtj5-tj1:21000] > create role admin_role;
Query: create role admin_role
Fetched 0 row(s) in 0.35s

(2) Grant superuser privileges to the admin role:

[hadoop21-test1-rgtj5-tj1:21000] > GRANT ALL ON SERVER sentryserver TO ROLE admin_role;
Query: GRANT ALL ON SERVER sentryserver TO ROLE admin_role
Query submitted at: 2019-07-06 10:40:11 (Coordinator: http://hadoop21-test1-rgtj5-tj1:25000)
Query progress can be monitored at: http://hadoop21-test1-rgtj5-tj1:25000/query_plan?query_id=15475b39691bd167:66c1403300000000
Fetched 0 row(s) in 0.13s

(3) Grant the admin role to the hadoop user group:

[hadoop21-test1-rgtj5-tj1:21000] > GRANT ROLE admin_role TO GROUP hadoop;
Query: GRANT ROLE admin_role TO GROUP hadoop
Query submitted at: 2019-07-06 10:41:53 (Coordinator: http://hadoop21-test1-rgtj5-tj1:25000)
Query progress can be monitored at: http://hadoop21-test1-rgtj5-tj1:25000/query_plan?query_id=434bb908587eaf31:65887a5a00000000
Fetched 0 row(s) in 0.11s

(4) Create a test database and a test table, and insert some test data:

[hadoop21-test1-rgtj5-tj1:21000] > create database test;    
Query: create database test
Fetched 0 row(s) in 0.29s
[hadoop21-test1-rgtj5-tj1:21000] > use test;
Query: use test
[hadoop21-test1-rgtj5-tj1:21000] > CREATE TABLE test(x INT, y STRING) STORED AS PARQUET; 
Query: CREATE TABLE test(x INT, y STRING) STORED AS PARQUET
Fetched 0 row(s) in 0.16s
[hadoop21-test1-rgtj5-tj1:21000] > INSERT INTO test VALUES (1, 'one'), (2, 'two'), (3, 'three'); 
Query: INSERT INTO test VALUES (1, 'one'), (2, 'two'), (3, 'three')
Query submitted at: 2019-07-06 11:18:33 (Coordinator: http://hadoop21-test1-rgtj5-tj1:25000)
Query progress can be monitored at: http://hadoop21-test1-rgtj5-tj1:25000/query_plan?query_id=ce4e7f66f1209531:641f39a900000000
Modified 3 row(s) in 5.47s

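Because the hadoop user belongs to the hadoop group, which now holds admin_role with ALL privileges, the following SELECT statement immediately returns the rows we just inserted: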

[hadoop21-test1-rgtj5-tj1:21000] > select * from test;
Query: select * from test
Query submitted at: 2019-07-06 11:19:50 (Coordinator: http://hadoop21-test1-rgtj5-tj1:25000)
Query progress can be monitored at: http://hadoop21-test1-rgtj5-tj1:25000/query_plan?query_id=34e4b5594e3d0c6:8cfb1acb00000000
+---+-------+
| x | y     |
+---+-------+
| 1 | one   |
| 2 | two   |
| 3 | three |
+---+-------+
Fetched 3 row(s) in 1.87s

Next, switch to the root user, run impala-shell, and try to use the test database we just created:

[hadoop21-test1-rgtj5-tj1:21000] > use test;
Query: use test
ERROR: AuthorizationException: User 'root' does not have privileges to access: test.*.*

The error says that the root user has no privileges on the test database, which shows that Sentry authorization is now in effect.

The syntax of the various authorization statements is as follows:

Create a role: CREATE ROLE <role name>
Grant a role to a group: GRANT ROLE <role name> TO GROUP <group name>
Server-level grant: GRANT <ALL|SELECT|INSERT> ON SERVER <server name> TO ROLE <role name>
Database-level grant: GRANT <ALL|SELECT|INSERT> ON DATABASE <database name> TO ROLE <role name>
Table-level grant: GRANT <ALL|SELECT|INSERT> ON TABLE <database name>.<table name> TO ROLE <role name>
Column-level grant: GRANT SELECT(<column name>) ON TABLE <table name> TO ROLE <role name>;
Revoke a role from a group: REVOKE ROLE <role name> FROM GROUP <group name>
Revoke a column privilege: REVOKE SELECT(<column name>) ON TABLE <table name> FROM ROLE <role name>;
Revoke a database privilege: REVOKE <ALL|SELECT|INSERT> ON DATABASE <database name> FROM ROLE <role name>
Show the privileges of a role: SHOW GRANT ROLE <role name>
Other inspection commands:
SHOW ROLES;
SHOW CURRENT ROLES;
SHOW ROLE GRANT GROUP <group name>;
SHOW GRANT ROLE <role name>;
SHOW GRANT ROLE <role name> on OBJECT <object name>;

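Summary

1. For securing the Impala service, authentication (Kerberos/LDAP) is the first step and authorization (Sentry) is the second. Authorization should only be enabled on top of authentication. The tests in this article enable Sentry authorization without authentication. Doing so in production is strongly discouraged: without user authentication, authorization is meaningless, because a user can log in to Impala as any superuser and no password is ever checked.

2. Impala does not distinguish users at the underlying storage layer. Sentry only controls operation privileges at the Impala application layer; HDFS is still accessed as the impala user, that is, the user that starts impalad. The main reason is that libhdfs, which Impala's C++ code uses to access HDFS, did not yet support doAs in Hadoop 2.

3. The authorization flow in Impala is similar to the one in Hive. The main difference lies in how privilege information is cached. Impala's Catalog service manages and caches both the database schema metadata and the Sentry privilege metadata, and propagates them to all Impala Server nodes. Authorization checks in Impala therefore happen locally and are faster. The figure below summarizes this:

[Figure: Impala authorization flow with Sentry privileges cached by the Catalog service]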

