LoginSignup
0
0

Microsoft Fabric workout - Lakehouse - Date types

Last updated at Posted at 2024-04-04

扱うデータ型は重要だ。まずは多くつかうであろうデータ型とは?って考えた。

Dataflow Gen2 → Lakehouse

Destination type は Power BI セマンティックのモデルのデータ型と一致するようだ。

Power Query Destination type SQL name pyspark.sql.types
type text, Text.Type Text STRING StringType()
type number, Number.Type Decimal number DOUBLE DoubleType()
type date, Date.Type Date DATE DateType()
type datetime, DateTime.Type Date/Time TIMESTAMP TimestampType()
type datetimezone, DateTimeZone.Type -
type time, Time.Type -
type duration, Duration.Type -
type logical, Logical.Type True/False BOOLEAN BooleanType()
type binary, Binary.Type -
Decimal.Type Decimal number
Int64.Type Whole number BIGINT, LONG LongType()
Int32.Type Whole number INT, INTEGER IntegerType()
Int16.Type Whole number SMALLINT ShortType()
Int8.Type Whole number SMALLINT ShortType()
Currency.Type Currency DECIMAL(19, 4) DecimalType(19, 4)
Percentage.Type Percentage DECIMAL(34, 6)? DecimalType(34, 6)?
type table, Table.Type -
type record, Record.Type -
type list, List.Type -
type any, Any.Type -
type anynonnull -
Power Query
let
    Source = Table.FromRows(
        {{
            "0", 0, #date(2024, 4, 4), #datetime(2024,4,4,10,50,30),
            0, 0, 0, 0, 0, 0, 0, 0
        }},
        {
            "TextType", "NumberType", "DateType", "DateTimeType",
            "LogicalType", "DecimalType", "Int64Type", "Int32Type",
            "Int16Type", "Int8Type", "CurrencyType", "PercentageType"
        }
    ),
    ChangedType = Table.TransformColumnTypes(
        Source,
        {
            {"TextType", Text.Type},
            {"NumberType", Number.Type},
            {"DateType", Date.Type},
            {"DateTimeType", DateTime.Type},
            {"LogicalType", Logical.Type},
            {"DecimalType", Decimal.Type},
            {"Int64Type", Int64.Type},
            {"Int32Type", Int32.Type},
            {"Int16Type", Int16.Type},
            {"Int8Type", Int8.Type},
            {"CurrencyType", Currency.Type},
            {"PercentageType", Percentage.Type}
        }
    )
in
    ChangedType
PySpark : Runtime 1.2
spark.sql("desc DataTypes").show(truncate=False)
+--------------+-------------+-------+
|col_name      |data_type    |comment|
+--------------+-------------+-------+
|TextType      |string       |null   |
|NumberType    |double       |null   |
|DateType      |date         |null   |
|DateTimeType  |timestamp    |null   |
|LogicalType   |boolean      |null   |
|DecimalType   |decimal(34,6)|null   |
|Int64Type     |bigint       |null   |
|Int32Type     |int          |null   |
|Int16Type     |bigint       |null   |
|Int8Type      |smallint     |null   |
|CurrencyType  |smallint     |null   |
|PercentageType|decimal(34,6)|null   |
+--------------+-------------+-------+

思ったこと🙄

Dataflow Gen2 / Direct Lake mode のことを考慮すると、データ型使い分けの詳細を知ろうとするのは後回しで充分。

その他

0
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0