Common Hadoop tips
Date: 2020-04-04 00:01:15
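Getting the input file name when a Hadoop job reads several input files, so that records from different inputs can be told apart
import os
# Hadoop Streaming exports the current input path as map_input_file (mapreduce_map_input_file on Hadoop 2+)
filepath = os.environ["map_input_file"]
filename = os.path.split(filepath)[-1]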
- https://blog.csdn.net/catchertherye/article/details/44410645
- https://blog.csdn.net/modiziri/article/details/47661091
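To make the file-name lookup above concrete, here is a minimal sketch of a streaming mapper that prefixes every record with the name of the file it came from. The helper names `input_filename` and `tag_lines` are made up for illustration, and the fallback to an empty string when neither environment variable is set is an assumption so the script can also be run outside Hadoop.

```python
import os
import sys


def input_filename(environ):
    """Return the base name of the file the current map task is reading."""
    # Hadoop 1.x exports map_input_file; Hadoop 2+ exports
    # mapreduce_map_input_file. Fall back to "" when neither is set
    # (assumption: lets the script run outside Hadoop for local tests).
    path = environ.get("mapreduce_map_input_file") or environ.get("map_input_file", "")
    return os.path.split(path)[-1]


def tag_lines(lines, filename):
    """Prefix every record with its source file name, tab-separated."""
    for line in lines:
        yield filename + "\t" + line.rstrip("\n")


if __name__ == "__main__":
    name = input_filename(os.environ)
    for record in tag_lines(sys.stdin, name):
        print(record)
```

A reducer can then branch on the first field to treat records from each input differently.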
Hadoop multiple outputs
Hadoop touchz command: create an empty file
- hadoop fs -touchz xx.done [handy trick: a marker file for checking that a job has finished]
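The marker-file trick could be wrapped in a small Python helper along these lines. This is only a sketch: the function names and the default marker name `xx.done` (taken from the example above) are illustrative, and it assumes the `hadoop` executable is on the PATH. It relies on `hadoop fs -test -e`, which exits with status 0 when the path exists.

```python
import subprocess


def touchz_command(path):
    """Build the command line that creates an empty file at `path`."""
    return ["hadoop", "fs", "-touchz", path]


def exists_command(path):
    """Build the command line that tests whether `path` exists."""
    return ["hadoop", "fs", "-test", "-e", path]


def marker_path(output_dir, marker="xx.done"):
    """Join the output directory and the marker file name."""
    return output_dir.rstrip("/") + "/" + marker


def mark_done(output_dir, marker="xx.done"):
    """Create the marker once the producing job has finished."""
    subprocess.check_call(touchz_command(marker_path(output_dir, marker)))


def is_done(output_dir, marker="xx.done"):
    """Return True when the marker file is present (exit status 0)."""
    return subprocess.call(exists_command(marker_path(output_dir, marker))) == 0
```

A downstream job would poll `is_done("/OUTPUT")` before reading, while the upstream job calls `mark_done("/OUTPUT")` as its last step.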
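Hadoop text command: print a source file in text format
- hadoop fs -text <src>
Hadoop grep: running grep on Hadoop (in the command below, $1 is the pattern passed to the wrapping shell script)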
hadoop jar hadoop-streaming-1.2.1.jar -D mapreduce.job.name="grep" -D stream.non.zero.exit.is.failure=false -mapper "grep $1" -numReduceTasks 0 -input /INPUT -output /OUTPUT
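The command above sets stream.non.zero.exit.is.failure=false because grep exits with status 1 when a split contains no matching line, which Streaming would otherwise treat as a task failure. As an alternative sketch, a Python mapper can do the filtering and always exit with 0. The function name `grep_lines` is hypothetical, and note that Python's `re` syntax is not identical to grep's basic regular expressions.

```python
import re
import sys


def grep_lines(lines, pattern):
    """Yield the lines that contain a match for `pattern`."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # The pattern is taken from the first command-line argument;
    # with no argument the script emits nothing.
    if len(sys.argv) > 1:
        for matched in grep_lines(sys.stdin, sys.argv[1]):
            print(matched)
```

Shipped with the job, such a script would take the place of `grep $1` in the -mapper option.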
Changing a Hadoop job's priority on the fly
- hadoop job -set-priority job_id HIGH
Hadoop data deduplication example
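One common way to deduplicate with Hadoop Streaming is to let the shuffle sort identical records next to each other and have the reducer print each distinct record once. The reducer below is a minimal sketch under two assumptions: every record is a single line without tab characters (so the whole line is the key), and the mapper is an identity mapper such as `cat`. It also strips a trailing tab, since Streaming may append one to keys that have an empty value.

```python
import sys


def unique_sorted(lines):
    """Yield each distinct record once; equal records must be adjacent."""
    previous = None
    for line in lines:
        # Drop the newline and a possible trailing key/value separator.
        record = line.rstrip("\n").rstrip("\t")
        if record != previous:
            yield record
            previous = record


if __name__ == "__main__":
    for record in unique_sorted(sys.stdin):
        print(record)
```

Because the reducer only remembers the previous record, its memory use stays constant no matter how many duplicates a key has.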
Hadoop join methods [to be implemented]
- http://www.crazyant.net/1112.html
- https://blog.csdn.net/levy_cui/article/details/68945542
- https://www.jianshu.com/p/9375de2411db
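As a starting point for the join, here is a sketch of one possible reduce-side inner join for Hadoop Streaming; it is not taken from the linked articles. It assumes the mapper has already rewritten each record as `key<TAB>tag<TAB>value`, where the tag is the hypothetical label "A" for the left input and "B" for the right input (for example derived from the input file name), and that the reducer input is sorted by key.

```python
import sys
from itertools import groupby


def parse(line):
    """Split a reducer input line into (key, tag, value)."""
    key, tag, value = line.rstrip("\n").split("\t", 2)
    return key, tag, value


def join_sorted(lines):
    """Inner-join records tagged "A" with records tagged "B" on the key."""
    rows = (parse(line) for line in lines)
    for key, group in groupby(rows, key=lambda row: row[0]):
        left, right = [], []
        for _, tag, value in group:
            (left if tag == "A" else right).append(value)
        # Emit the cross product of both sides for this key.
        for a in left:
            for b in right:
                yield "\t".join((key, a, b))


if __name__ == "__main__":
    for joined in join_sorted(sys.stdin):
        print(joined)
```

This version buffers all values of one key in memory, so it suits keys with a modest number of records on each side.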
Source: https://www.cnblogs.com/baisheng/p/12629672.html