HIVEクエリを書いていてハマったエラーと、その対処法を記載していきます。
WINDOW関数で集計範囲が異なる時のエラー
ROWS BETWEENかの指定が異なる物が混じってるときに発生するエラーです。
他と記述を合わせることで、エラーは解消しました。
FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.
Underlying error: Expecting right window frame boundary for function lag((TOK_TABLE_OR_COL weight), 12)
エラーが発生するクエリの例
SELECT
last_value(weight) over(PARTITION BY pref_name,city_name ORDER BY year_month ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_weight,
lag(weight, 12) over(PARTITION BY pref_name,city_name ORDER BY year_month ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM
...
WINDOW関数のLAG/LEAD関数のハマリどころ
1つ前の値を次のように取ろうとすると、Prestoでは動きますがHiveではエラーとなります。
SELECT
--1つ前の値を取得するLAG関数
LAG(weight) over(partition by pref_name, city_name order by year_month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM
...
解決法は簡単です。次のようなエラーが起きたら、ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
と指定しましょう。
FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.
Underlying error: Expecting left window frame boundary for function LAG((TOK_TABLE_OR_COL weight)) org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@75a39d3e as _wcol0 to be unbounded. Found : 1
ここでもし、1 PRECEDING
をUNBOUNDED PRECEDING
に書き換えても、エラー出力にrightの差はあるが同様にエラーとなります。
SELECT
--1つ前の値を取得するLAG関数
LAG(weight) over(partition by pref_name, city_name order by year_month ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING)
FROM
...
FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.
Underlying error: Expecting right window frame boundary for function LAG((TOK_TABLE_OR_COL weight))