需求三:商品相关指标
原创大约 3 分钟
实现思路
针对
商品销售情况
。可以通过将
dwd_orderitem
、dwd_goods_info
和dwd_category
这三张表进行关联获取到商品名称、一级类目、订单总量、销售额这些属性。可以基于这些表构建一个
DWS层
的商品订单信息的宽表dws_goods_order_all
便于后期其它需求复用。最后基于这个宽表,将统计出来的结果保存到
APP层
的app_goods_sales
表中。
针对
商品品类偏好Top10
。可以在上一个指标的基础上,按一级类目进行分组,对类目下的订单总量进行排序再取
Top10
。最后将结果保存到
app_category_top10
中。
DWS层开发
先通过脚本的方式构建DWS层
商品订单信息的宽表dws_goods_order_all
。
> cd /home/work/warehouse_goods_order
> vi dws_shopmall_requirement03_init.sh
#!/bin/bash
# 需求三:商品相关指标
# dws层数据库和表初始化,只需要执行一次
hive -e "
create database if not exists dws_shopmall;
create external table if not exists dws_shopmall.dws_goods_order_all (
orderid bigint,
orderamount int,
orderprice double,
ordercreatetime string,
goodsid bigint,
goodsname string,
goodsdesc string,
goodsprice double,
goodscreatetime string,
categoryid int,
parentids string,
catename string,
level int,
catecreatetime string
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/dws/goods_order_all/';
"
再按照天来分区,并在每天的凌晨时执行一次。
> cd /home/work/warehouse_goods_order
> vi dws_shopmall_requirement03_add_partition.sh
#!/bin/bash
# 需求三:商品相关指标
# 每天凌晨执行一次
if [ "d$1" = "d" ]
then
dt=`date +%Y%m%d --date="1 days ago"`
else
dt=$1
fi
hive -e "
insert overwrite table dws_shopmall.dws_goods_order_all partition(dt = '${dt}') select
doi.orderid,
doi.amount as orderamount,
doi.price as orderprice,
doi.createtime as ordercreatetime,
doi.goodsid,
dgi.goodsname,
dgi.goodsdesc,
dgi.price as goodsprice,
dgi.createtime as goodscreatetime,
dc.id as categoryid,
dc.parentids,
dc.name as catename,
dc.level,
dc.createtime as catecreatetime
from dwd_shopmall.dwd_orderitem as doi
left join dwd_shopmall.dwd_goods_info as dgi
on doi.goodsid = dgi.id
left join dwd_shopmall.dwd_category as dc
on dgi.categoryid = dc.id
where doi.dt = '${dt}' and dgi.dt = '${dt}' and dc.dt = '${dt}';
"
可以通过定时任务来执行它。
> crontab -e
# 每天凌晨0点1分执行
1 0 * * * /home/work/warehouse_goods_order/dws_shopmall_requirement03_add_partition.sh
APP层开发
初始化APP层
的app_goods_sales
和app_category_top10
表。
> cd /home/work/warehouse_goods_order
> vi app_shopmall_requirement03_init.sh
#!/bin/bash
# 需求三:商品相关指标
# app层数据库和表初始化,只需要执行一次
hive -e "
create database if not exists app_shopmall;
create external table if not exists app_shopmall.app_goods_sales (
goodsname string,
catename string,
ordertotal bigint,
pricetotal decimal(10, 2)
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/goods_sales/';
create external table if not exists app_shopmall.app_category_top10(
catename string,
ordertotal bigint
) partitioned by(dt string)
row format delimited
fields terminated by '\t'
location 'hdfs://server01:9000/data/app/category_top10/';
"
再按照天来分区,并在每天的凌晨时执行一次。
> cd /home/work/warehouse_goods_order
> vi app_shopmall_requirement03_add_partition.sh
#!/bin/bash
# 需求三:商品相关指标
# 每天凌晨执行一次
if [ "d$1" = "d" ]
then
dt=`date +%Y%m%d --date="1 days ago"`
else
dt=$1
fi
hive -e "
insert overwrite table app_shopmall.goods_sales partition(dt = '${dt}') select
goodsname,
catename,
count(order_id) as ordertotal,
sum(orderamount * orderprice) as pricetotal
from dws_shopmall.dws_goods_order_all
where dt = '${dt}'
group by goodsname, catename;
insert overwrite table app_shopmall.app_category_top10 partition(dt = '${dt}') select
catename,
sum(ordertotal) as ordertotal
from app_shopmall.goods_sales
where dt = '${dt}'
group by catename
order by ordertotal desc
limit 10;
"
可以通过定时任务来执行它。
> crontab -e
# 每天凌晨0点1分执行
1 0 * * * /home/work/warehouse_goods_order/app_shopmall_requirement03_add_partition.sh
感谢支持
更多内容,请移步《超级个体》。