A custom UDTF can be created by extending the GenericUDTF abstract class and implementing the initialize, process, and optionally close methods. Hive calls initialize() to notify the UDTF of the argument types to expect. The UDTF must then return an object inspector corresponding to the row objects that it will generate. Once initialize() has been called, Hive passes rows to the UDTF through the process() method. While in process(), the UDTF can produce rows and forward them to other operators by calling forward(). Lastly, Hive calls the close() method once all the rows have been passed to the UDTF.
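The call order described above can be sketched in plain Java without the Hive libraries. This is only an illustration of the lifecycle, not the real API: MiniUdtf is a hypothetical stand-in for GenericUDTF, its initialize() returns column names instead of an object inspector, and SplitUdtf is an assumed guess at what a function like mp_explod does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Pattern;

public class UdtfLifecycleSketch {

    /** Hypothetical stand-in for Hive's GenericUDTF; NOT the real class. */
    abstract static class MiniUdtf {
        private Consumer<Object[]> collector;

        void setCollector(Consumer<Object[]> collector) {
            this.collector = collector;
        }

        /** Receives the argument type names and returns the output column names. */
        abstract List<String> initialize(List<String> argTypes);

        /** Called once per input row. */
        abstract void process(Object[] args);

        /** Called once after the last input row; optional to override. */
        void close() {
        }

        /** Hands one generated row to the downstream consumer. */
        protected final void forward(Object[] row) {
            collector.accept(row);
        }
    }

    /** Assumed behaviour: split args[0] on the literal separator args[1], one row per token. */
    static class SplitUdtf extends MiniUdtf {
        @Override
        List<String> initialize(List<String> argTypes) {
            if (argTypes.size() != 2) {
                throw new IllegalArgumentException("expected (string, separator)");
            }
            return Arrays.asList("linetoword");
        }

        @Override
        void process(Object[] args) {
            String line = (String) args[0];
            String separator = (String) args[1];
            for (String token : line.split(Pattern.quote(separator))) {
                forward(new Object[] { token });
            }
        }
    }

    /** Drives the lifecycle in order: initialize, process for each row, then close. */
    static List<String> run(MiniUdtf udtf, List<Object[]> inputRows) {
        List<String> output = new ArrayList<>();
        udtf.setCollector(row -> output.add(String.valueOf(row[0])));
        udtf.initialize(Arrays.asList("string", "string"));
        for (Object[] row : inputRows) {
            udtf.process(row);
        }
        udtf.close();
        return output;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { "1001#Jack#18#1999-01-02#Male", "#" });
        for (String value : run(new SplitUdtf(), rows)) {
            System.out.println(value);
        }
    }
}
```

A real implementation would extend org.apache.hadoop.hive.ql.udf.generic.GenericUDTF and be compiled against the Hive libraries; the sketch only mirrors the order in which the methods are invoked.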
hive (default)> create function mp_explod as "tech.mapan.hive.udtf.ExplodUDTF";
OK
Time taken: 0.03 seconds
hive (default)> select mp_explod('1001#Jack#18#1999-01-02#Male','#');
OK
linetoword
1001
Jack
18
1999-01-02
Male
Time taken: 12.29 seconds, Fetched: 5 row(s)
create function mp_explod2 as "tech.mapan.hive.udtf.ExplodUDTF2";
OK
Time taken: 0.029 seconds
hive (prac)> select * from test03;
OK
test03.id	test03.name
1001	jack#ma
1002	dong#liu
1003	poney#ma
Time taken: 0.522 seconds, Fetched: 3 row(s)
hive (prac)> select id,first_name,last_name from test03 lateral view mp_explod2(name,"#") temp as first_name,last_name;
OK
id	first_name	last_name
1001	jack	ma
1002	dong	liu
1003	poney	ma
Time taken: 13.393 seconds, Fetched: 3 row(s)
One more note:
The official documentation describes Lateral View as follows:
Lateral View Syntax
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
Description
Lateral view is used in conjunction with user-defined table generating functions such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of base table and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias.
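The "apply, then join" behaviour in this description can be sketched in plain Java using the test03 rows from the query above. This is a simplified model, not Hive code: lateralView and splitName are hypothetical names, and splitName only assumes what mp_explod2(name, "#") does, inferred from the query output (one generated row with two columns per input row).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

public class LateralViewSketch {

    /** The test03 table from the example: columns (id, name). */
    static final List<List<String>> TEST03 = Arrays.asList(
            Arrays.asList("1001", "jack#ma"),
            Arrays.asList("1002", "dong#liu"),
            Arrays.asList("1003", "poney#ma"));

    /**
     * Applies the table-generating function to each base row, then joins every
     * generated row to the base row that produced it. A base row whose function
     * call generates zero rows contributes nothing to the result.
     */
    static List<List<String>> lateralView(
            List<List<String>> baseTable,
            Function<List<String>, List<List<String>>> udtf) {
        List<List<String>> virtualTable = new ArrayList<>();
        for (List<String> baseRow : baseTable) {
            for (List<String> generated : udtf.apply(baseRow)) {
                List<String> joined = new ArrayList<>(baseRow);
                joined.addAll(generated);
                virtualTable.add(joined);
            }
        }
        return virtualTable;
    }

    /** Assumed stand-in for mp_explod2(name, "#"): one (first_name, last_name) row. */
    static List<List<String>> splitName(List<String> baseRow) {
        List<List<String>> generated = new ArrayList<>();
        String[] parts = baseRow.get(1).split(Pattern.quote("#"));
        if (parts.length == 2) {
            generated.add(Arrays.asList(parts[0], parts[1]));
        }
        return generated; // empty when the name does not split into two parts
    }

    public static void main(String[] args) {
        // Joined columns are (id, name, first_name, last_name);
        // print id, first_name, last_name as the query selects them.
        for (List<String> row : lateralView(TEST03, LateralViewSketch::splitName)) {
            System.out.println(row.get(0) + "\t" + row.get(2) + "\t" + row.get(3));
        }
    }
}
```

Run against the three sample rows, the sketch prints the same id, first_name, and last_name values as the lateral view query shown earlier.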