Edited at

Hiveクエリでハマりやすいエラーの処方箋

More than 1 year has passed since last update.

HIVEクエリを書いていてハマったエラーと、その対処法を記載していきます。


WINDOW関数で集計範囲が異なる時のエラー

ROWS BETWEENかの指定が異なる物が混じってるときに発生するエラーです。

他と記述を合わせることで、エラーは解消しました。

FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. 

Underlying error: Expecting right window frame boundary for function lag((TOK_TABLE_OR_COL weight), 12)

エラーが発生するクエリの例

SELECT

last_value(weight) over(PARTITION BY pref_name,city_name ORDER BY year_month ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_weight,
lag(weight, 12) over(PARTITION BY pref_name,city_name ORDER BY year_month ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM
...


WINDOW関数のLAG/LEAD関数のハマリどころ

1つ前の値を次のように取ろうとすると、Prestoでは動きますがHiveではエラーとなります。

SELECT

--1つ前の値を取得するLAG関数
LAG(weight) over(partition by pref_name, city_name order by year_month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM
...

解決法は簡単です。次のようなエラーが起きたら、ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGと指定しましょう。

FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.

Underlying error: Expecting left window frame boundary for function LAG((TOK_TABLE_OR_COL weight)) org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@75a39d3e as _wcol0 to be unbounded. Found : 1

ここでもし、1 PRECEDINGUNBOUNDED PRECEDING に書き換えても、エラー出力にrightの差はあるが同様にエラーとなります。

SELECT

--1つ前の値を取得するLAG関数
LAG(weight) over(partition by pref_name, city_name order by year_month ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING)
FROM
...

FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.

Underlying error: Expecting right window frame boundary for function LAG((TOK_TABLE_OR_COL weight))