[Spark 81] Hive in the Spark assembly

Programming / posted by houtizong 3 years ago

Spark SQL supports most commonly used HiveQL features. However, different kinds of HiveQL statements are executed in different ways:

  1. DDL statements (e.g. CREATE TABLE, DROP TABLE) and commands (e.g. SET <key> = <value>, ADD FILE, ADD JAR)

    In most cases, Spark SQL simply delegates these statements to Hive, as they don’t need to issue any distributed jobs and don’t rely on the computation engine (Spark, MR, or Tez).

  2. SELECT queries, CREATE TABLE ... AS SELECT ... statements and insertions

    These statements are executed using Spark as the execution engine.
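The split described above can be modeled with a toy classifier. This is only a sketch for illustration: the `classify` helper and its prefix lists are hypothetical, derived from the examples in the text, and do not reflect how Spark SQL actually parses or dispatches statements.

```python
import re

# Statement prefixes that, per the text above, Spark SQL usually hands to Hive.
# This list is an illustrative assumption, not Spark's real rule set.
DELEGATED_PREFIXES = ("CREATE TABLE", "DROP TABLE", "SET", "ADD FILE", "ADD JAR")

# Statement prefixes that run with Spark as the execution engine.
SPARK_PREFIXES = ("SELECT", "INSERT")


def classify(statement: str) -> str:
    """Return 'hive' or 'spark' for a HiveQL statement (hypothetical helper)."""
    # Collapse whitespace and upper-case so prefix matching is reliable.
    normalized = re.sub(r"\s+", " ", statement.strip()).upper()

    # CREATE TABLE ... AS SELECT runs a query, so check it before plain DDL.
    if normalized.startswith("CREATE TABLE") and re.search(r"\bAS SELECT\b", normalized):
        return "spark"
    if normalized.startswith(SPARK_PREFIXES):
        return "spark"
    if normalized.startswith(DELEGATED_PREFIXES):
        return "hive"
    raise ValueError(f"unclassified statement: {statement!r}")


if __name__ == "__main__":
    for sql in (
        "CREATE TABLE src (key INT, value STRING)",
        "ADD JAR /tmp/my_udfs.jar",
        "CREATE TABLE dst AS SELECT key FROM src",
        "SELECT count(*) FROM src",
    ):
        print(f"{classify(sql):5s} <- {sql}")
```

The CTAS check comes first because such a statement begins like DDL but still needs a distributed job, which is why the text groups it with SELECT queries and insertions.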

The Hive classes packaged in the assembly jar are used to provide entry points to Hive features, for example:

  1. HiveQL parser
  2. Talking to the Hive metastore to execute DDL statements
  3. Accessing UDF/UDAF/UDTF

For the differences between Hive on Spark and Spark SQL’s Hive support, see this article by Reynold Xin: https://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html
