The ORDER BY Operation in Hive
Date: 2022-07-25
This article covers Hive's order by operation: its syntax, a worked example, and the points to watch out for.
Common advanced queries in Hive include group by, order by, join, distribute by, sort by, cluster by, and union all. This article looks at order by, which sorts the result by one or more columns. The syntax is:
select col1, col2, ...
from tableName
where condition
order by col1 [asc|desc], col2 [asc|desc]
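Note: (1) order by can sort on several columns. The default direction is ascending, and string columns are compared lexicographically.
(2) order by performs a global sort.
(3) order by requires a reduce phase and always runs with exactly one reducer. This cannot be configured, because multiple reducers cannot produce a globally sorted result.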
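To make note (1) concrete, here is a minimal sketch of a multi-column sort with a direction on each column. The table `t` and its columns `dept` and `salary` are hypothetical names chosen for illustration; they do not appear elsewhere in this article.

```sql
-- Hypothetical table t(dept string, salary double); names are illustrative only.
-- Sort by dept ascending (the default direction),
-- then by salary descending among rows with the same dept.
select dept, salary
from t
order by dept asc, salary desc
limit 100;
```

Each column carries its own asc or desc; a column without an explicit direction is sorted ascending.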
The order by operation is constrained by the following property:
set hive.mapred.mode=nonstrict; (default value in the Hive version used in this article)
set hive.mapred.mode=strict;
Note: in strict mode, an order by statement must include a limit clause. Because order by runs on a single reducer, sorting a very large result set can take an extremely long time.
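For reference, more recent Hive releases expose the individual strict checks as separate properties alongside the single mode switch. The fragment below is a sketch that assumes such a release; whether these properties are available depends on your Hive version, so check the configuration documentation for your installation before relying on them.

```sql
-- Assumes a Hive release that provides the per-check strict properties.
-- Reject ORDER BY queries that have no LIMIT clause.
set hive.strict.checks.orderby.no.limit=true;
-- Reject queries on partitioned tables that have no partition filter.
set hive.strict.checks.no.partition.filter=true;
```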
The following example shows order by in practice.
The database contains an employees table with this data:
hive> select * from employees;
OK
lavimer 15000.0 ["li","lu","wang"] {"k1":1.0,"k2":2.0,"k3":3.0} {"street":"dingnan","city":"ganzhou","num":101} 2015-01-24 love
liao 18000.0 ["liu","li","huang"] {"k4":2.0,"k5":3.0,"k6":6.0} {"street":"dingnan","city":"ganzhou","num":102} 2015-01-24 love
zhang 19000.0 ["xiao","wen","tian"] {"k7":7.0,"k8":8.0} {"street":"dingnan","city":"ganzhou","num":103} 2015-01-24 love
Now sort by the second column (salary) in descending order:
hive> select * from employees order by salary desc;
// MapReduce job runs here (output omitted)
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 2.62 sec HDFS Read: 415 HDFS Write: 245 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 620 msec
OK
zhang 19000.0 ["xiao","wen","tian"] {"k7":7.0,"k8":8.0} {"street":"dingnan","city":"ganzhou","num":103} 2015-01-24 love
liao 18000.0 ["liu","li","huang"] {"k4":2.0,"k5":3.0,"k6":6.0} {"street":"dingnan","city":"ganzhou","num":102} 2015-01-24 love
lavimer 15000.0 ["li","lu","wang"] {"k1":1.0,"k2":2.0,"k3":3.0} {"street":"dingnan","city":"ganzhou","num":101} 2015-01-24 love
Time taken: 20.484 seconds
hive>
At this point the hive.mapred.mode property is:
hive> set hive.mapred.mode;
hive.mapred.mode=nonstrict
hive>
Now change it to strict and run the same order by query again:
hive> set hive.mapred.mode=strict;
hive> select * from employees order by salary desc;
FAILED: Error in semantic analysis: 1:33 In strict mode, if ORDER BY is specified, LIMIT must also be specified. Error encountered near token 'salary'
hive>
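Note: in strict mode the query must include a limit clause. Adding one gets past that check, but the query still fails for a different reason: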
hive> select * from employees order by salary desc limit 3;
FAILED: Error in semantic analysis: No partition predicate found for Alias "employees" Table "employees"
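Note: strict mode also restricts queries on partitioned tables. The query must specify which partitions it reads.
First, look at the table's partitions: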
hive> show partitions employees;
OK
date_time=2015-01-24/type=love
Time taken: 0.096 seconds
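Now run order by in strict mode. The partition(...) form is not valid inside a where clause and fails to parse; filter on the partition columns with ordinary predicates instead: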
hive> select * from employees where partition(date_time='2015-01-24',type='love') order by salary desc limit 3;
FAILED: Parse Error: line 1:30 cannot recognize input near 'partition' '(' 'date_time' in expression specification
hive> select * from employees where date_time='2015-01-24' and type='love' order by salary desc limit 3;
// MapReduce job runs here (output omitted)
Total MapReduce CPU Time Spent: 3 seconds 510 msec
OK
zhang 19000.0 ["xiao","wen","tian"] {"k7":7.0,"k8":8.0} {"street":"dingnan","city":"ganzhou","num":103} 2015-01-24 love
liao 18000.0 ["liu","li","huang"] {"k4":2.0,"k5":3.0,"k6":6.0} {"street":"dingnan","city":"ganzhou","num":102} 2015-01-24 love
lavimer 15000.0 ["li","lu","wang"] {"k1":1.0,"k2":2.0,"k3":3.0} {"street":"dingnan","city":"ganzhou","num":101} 2015-01-24 love
Time taken: 19.861 seconds
hive>
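When the result set is too large for a single reducer and a total order is not strictly required, sort by (mentioned in the introduction) is one alternative. The sketch below reuses the employees table from this example. The reducer count of 4 is an arbitrary value chosen for illustration, and the key assumption is that ordering within each reducer's output is acceptable for your use case.

```sql
-- Allow several reducers (4 is an arbitrary illustrative value).
set mapred.reduce.tasks=4;

-- sort by orders rows within each reducer only.
-- The combined output is NOT globally sorted, unlike order by.
select *
from employees
where date_time='2015-01-24' and type='love'
sort by salary desc;
```

If a globally ordered result is required, order by with a limit remains the straightforward choice.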