需求五:设备平台相关指标
实现思路
因为设备平台
相关指标都是针对公用字段,例如,platform
、net
、brand
、model
和os
的统计,所以用DWD层
的任何一张表都可以完成统计,这里采用dwd_app_open
表。
首先,利用星型模型,基于
dwd_app_open
表,在DWS层
构建对应的六张维度表,对应的表名如下。设备平台活跃用户分布
:dws_user_platform_distrib
。Android系统活跃用户分布
:dws_user_android_os_distrib
。IOS系统活跃用户分布
:dws_user_ios_os_distrib
。设备品牌活跃用户分布
:dws_user_brand_distrib
。设备型号活跃用户分布
:dws_user_model_distrib
。网络类型活跃用户分布
:dws_user_net_distrib
。
然后,在
DWS层
基于以上的六张维度表,创建对应的维度聚合表,按天创建分区。接着,基于
DWS层
的聚合数据,再执行全局聚合计算(不按日期统计),并将最终聚合的结果保存到APP层
对应的表中,对应的表名如下。设备平台活跃用户分布
:app_user_platform_distrib
。Android系统活跃用户分布
:app_user_android_os_distrib
。IOS系统活跃用户分布
:app_user_ios_os_distrib
。设备品牌活跃用户分布
:app_user_brand_distrib
。设备型号活跃用户分布
:app_user_model_distrib
。网络类型活跃用户分布
:app_user_net_distrib
。
最后,每天为
APP层
重新生成全局数据。
DWS层开发
根据实现思路,需要在DWS层
初始化对应的六张维度表。
> cd /home/work/warehouse_user_action
> vi dws_shopmall_requirement05_init.sh
#!/bin/bash
# 需求五:设备平台相关指标
# dws层数据库和表初始化,只需要执行一次
hive -e "
create database if not exists dws_shopmall;
create external table if not exists dws_shopmall.dws_user_platform_distrib (
clazz string,
num int
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/user_platform_distrib';
create external table if not exists dws_shopmall.dws_user_android_os_distrib (
clazz string,
num int
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/user_android_os_distrib';
create external table if not exists dws_shopmall.dws_user_ios_os_distrib (
clazz string,
num int
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/user_ios_os_distrib';
create external table if not exists dws_shopmall.dws_user_brand_distrib (
clazz string,
num int
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/user_brand_distrib';
create external table if not exists dws_shopmall.dws_user_model_distrib (
clazz string,
num int
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/user_model_distrib';
create external table if not exists dws_shopmall.dws_user_net_distrib (
clazz string,
num int
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/user_net_distrib';
"
再来创建各个维度表对应的维度聚合表,并给它们添加分区。
> cd /home/work/warehouse_user_action
> vi dws_shopmall_requirement05_add_partition.sh
#!/bin/bash
# 需求五:设备平台相关指标
# 每天凌晨执行一次
if [ "d$1" = "d" ]
then
dt=`date +%Y%m%d --date="1 days ago"`
else
dt=$1
fi
hive -e "
insert overwrite table dws_shopmall.dws_user_platform_distrib partition(dt='${dt}') select
case platform
when 1 then 'android'
when 2 then 'ios'
end clazz,
count(*) as num
from dws_shopmall.dwd_app_open
where dt = '${dt}' and platform in (1,2)
group by platform;
insert overwrite table dws_shopmall.dws_user_android_os_distrib partition(dt='${dt}') select
osver as clazz,
count(*) as num
from dws_shopmall.dwd_app_open
where dt = '${dt}' and platform = 1
group by os;
insert overwrite table dws_shopmall.dws_user_ios_os_distrib partition(dt='${dt}') select
osver as clazz,
count(*) as num
from dws_shopmall.dwd_app_open
where dt = '${dt}' and platform = 2
group by os;
insert overwrite table dws_shopmall.dws_user_brand_distrib partition(dt='${dt}') select
brand as clazz,
count(*) as num
from dws_shopmall.dwd_app_open
where dt = '${dt}'
group by brand;
insert overwrite table dws_shopmall.dws_user_model_distrib partition(dt='${dt}') select
model as clazz,
count(*) as num
from dws_shopmall.dwd_app_open
where dt = '${dt}'
group by model;
insert overwrite table dws_shopmall.dws_user_net_distrib partition(dt='${dt}') select
case net
when 0 then '未知'
when 1 then 'WIFI'
when 2 then '2G'
when 3 then '3G'
when 4 then '4G'
when 5 then '5G'
end clazz,
count(*) as num
from dws_shopmall.dwd_app_open
where dt = '${dt}'
group by net;
"
可以通过定时任务来执行它。
> crontab -e
# 每天凌晨0点1分执行
1 0 * * * /home/work/warehouse_user_action/dws_shopmall_requirement05_add_partition.sh
APP层开发
在APP层
初始化对应的全局聚合表。
> cd /home/work/warehouse_user_action
> vi app_shopmall_requirement05_init.sh
#!/bin/bash
# 需求五:设备平台相关指标
# app层数据库和表初始化,只需要执行一次
hive -e "
create database if not exists app_shopmall;
create external table if not exists app_shopmall.app_user_platform_distrib (
clazz string,
num int
) row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/user_platform_distrib';
create external table if not exists app_shopmall.app_user_android_os_distrib (
clazz string,
num int
) row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/user_android_os_distrib';
create external table if not exists app_shopmall.app_user_ios_os_distrib (
clazz string,
num int
) row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/user_ios_os_distrib';
create external table if not exists app_shopmall.app_user_brand_distrib (
clazz string,
num int
) row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/user_brand_distrib';
create external table if not exists app_shopmall.app_user_model_distrib (
clazz string,
num int
) row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/user_model_distrib';
create external table if not exists app_shopmall.app_user_net_distrib (
clazz string,
num int
) row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/user_net_distrib';
"
然后,每天为APP层
重新生成全局数据。
> cd /home/work/warehouse_user_action
> vi app_shopmall_requirement05_add_partition.sh
#!/bin/bash
# 需求五:设备平台相关指标
# 每天凌晨执行一次
if [ "d$1" = "d" ]
then
dt=`date +%Y%m%d --date="1 days ago"`
else
dt=$1
fi
hive -e "
insert overwrite table app_shopmall.app_user_platform_distrib select
clazz,
sum(num) as num
from dws_shopmall.dws_user_platform_distrib
group by clazz;
insert overwrite table app_shopmall.app_user_android_os_distrib select
clazz,
sum(num) as num
from dws_shopmall.dws_user_android_os_distrib
group by clazz;
insert overwrite table app_shopmall.app_user_ios_os_distrib select
clazz,
sum(num) as num
from dws_shopmall.dws_user_ios_os_distrib
group by clazz;
insert overwrite table app_shopmall.app_user_brand_distrib select
clazz,
sum(num) as num
from dws_shopmall.dws_user_brand_distrib
group by clazz;
insert overwrite table app_shopmall.app_user_model_distrib select
clazz,
sum(num) as num
from dws_shopmall.dws_user_model_distrib
group by clazz;
insert overwrite table app_shopmall.app_user_net_distrib select
clazz,
sum(num) as num
from dws_shopmall.dws_user_net_distrib
group by clazz;
"
可以通过定时任务来执行它。
> crontab -e
# 每天凌晨0点1分执行
1 0 * * * /home/work/warehouse_user_action/app_shopmall_requirement05_add_partition.sh
感谢支持
更多内容,请移步《超级个体》。