[Linux]结合awk删除hdfs指定日期前的数据

本文发布时间: 2019-Mar-22
业务背景约定五天前的HDFS数据为过期版本数据,写一个脚本自动删除过期版本数据$ hadoop fs -ls /user/pms/workspace/ouyangyewei/dataFound 9 itemsdrwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-01drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-02drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-03drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-04drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-05drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-08drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-09脚本实现# ---------------------------------------------------------## 删除历史版本(五天前的为过期版本数据)## ---------------------------------------------------------old_version=$(hadoop fs -ls /user/pms/workspace/ouyangyewei/data | awk 'BEGIN{ five_days_ago=strftime('%F', systime()-5*24*3600) }{ split($8,arr,'/'); if(arr[7]<five_days_ago){printf '%s', $8} }')arr=(${old_version// / })for version in ${arr[@]} do hadoop fs -rmr $versiondone执行以后$ hadoop fs -ls /user/pms/workspace/ouyangyewei/dataFound 4 itemsdrwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-08drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-09


(以上内容不代表本站观点。)
---------------------------------
本网站以及域名有仲裁协议。
本網站以及域名有仲裁協議。

2024-Mar-04 02:09pm
栏目列表