Microsoft Fabric workout - Generate numbers

Alongside each of these, I'll also line up what the same thing looks like written in Power Query.

range

Example: start = 0, end = 10, step = 2

PySpark : Runtime 1.2
# end (10) is exclusive: yields 0, 2, 4, 6, 8
spark.range(0, 10, 2).show()

Spark SQL : Runtime 1.2
SELECT
    *
FROM
    range(0, 10, 2)

Power Query : equivalent of range
Table.FromColumns(
    {
        List.Generate(
            ()=> 0,          // initial value
            each _ < 10,     // keep going while true (end is exclusive, like range)
            each _ + 2       // step
        )
    },
    type table [id = Int64.Type]   // "id" matches the column name spark.range returns
)
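
For what it's worth, Power Query also has the built-in List.Numbers(start, count, increment). A minimal sketch of the same range, assuming you convert the exclusive end into a count of values yourself:
Table.FromColumns(
    {
        List.Numbers(0, 5, 2)   // 5 values: 0, 2, 4, 6, 8
    },
    type table [id = Int64.Type]
)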

sequence

Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions.
Supported types are: byte, short, integer, long, date, timestamp.

  • stop is also included (unlike range, whose end is exclusive)

number

PySpark : Runtime 1.2
from pyspark.sql.functions import lit, sequence, explode

(spark.range(1)
    .select(
        explode(
            sequence(lit(1), lit(5))
        )
    )
).show()

Spark SQL : Runtime 1.2
SELECT
    explode(
        sequence(1, 5)
    )

Power Query
Table.ExpandListColumn(
    #table(
        type table [col = {number}],
        {
            {
                {1..5}   // {start..stop} list syntax; stop is inclusive
            }
        }
    ),
    "col"
)

date

PySpark : Runtime 1.2
from pyspark.sql.functions import lit, sequence, explode, to_date, expr

(spark.range(1)
    .select(
        explode(
            sequence(
                to_date(lit("2024-04-01"))
                , to_date(lit("2025-03-31"))
                , expr("interval 1 day")
            )
        ).alias("column_date")
    )
).show()

Spark SQL : Runtime 1.2
SELECT
    explode(
        sequence(
            to_date("2024-04-01")
            , to_date("2025-03-31")
            , interval 1 day
        )
    ) AS column_date

Power Query
Table.FromColumns(
    {
        List.Generate(
            ()=> Date.From("2024-04-01"),
            each _ <= Date.From("2025-03-31"),   // <= keeps the stop date (inclusive)
            each _ + #duration(1, 0, 0, 0)       // step: 1 day
            // each Date.AddDays(_, 1)
        )
    },
    type table [column_date = date]
)
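
List.Dates is the built-in shortcut for the same thing. It also takes a count rather than an end date, so this sketch derives the inclusive count from the two endpoints:
Table.FromColumns(
    {
        List.Dates(
            #date(2024, 4, 1),
            // inclusive number of days between the endpoints
            Duration.Days(#date(2025, 3, 31) - #date(2024, 4, 1)) + 1,
            #duration(1, 0, 0, 0)
        )
    },
    type table [column_date = date]
)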

datetime

PySpark : Runtime 1.2
from pyspark.sql.functions import lit, sequence, explode, to_timestamp, expr

(spark.range(1)
    .select(
        explode(
            sequence(
                to_timestamp(lit("2024-04-01 00:00:00"))
                , to_timestamp(lit("2024-04-01 23:59:59"))
                , expr("INTERVAL '1:30' HOUR TO MINUTE")
            )
        ).alias("column_datetime")
    )
).show()

Spark SQL : Runtime 1.2
SELECT
    explode(
        sequence(
            to_timestamp("2024-04-01 00:00:00")
            , to_timestamp("2024-04-01 23:59:59")
            , INTERVAL '1:30' HOUR TO MINUTE
        )
    ) AS column_datetime
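
Power Query : a minimal sketch, assuming the List.Generate pattern from the date section carries over to datetime values
Table.FromColumns(
    {
        List.Generate(
            ()=> #datetime(2024, 4, 1, 0, 0, 0),
            each _ <= #datetime(2024, 4, 1, 23, 59, 59),   // stop is inclusive
            each _ + #duration(0, 1, 30, 0)                // step: 1 hour 30 minutes
        )
    },
    type table [column_datetime = datetime]
)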

Thoughts 🙄

In a TIMESTAMP column, the time zone information is discarded and the value is treated as UTC.
TIMESTAMP_NTZ columns are not supported by the Lakehouse SQL analytics endpoint.
