Text and ByteString in Haskell

Posted at 2020-07-05

Haskell has the following five string types. They are hard to keep straight, so I am writing these notes as a reminder.

  • String ([Char])
  • Text (Strict)
  • Text (Lazy)
  • ByteString (Strict)
  • ByteString (Lazy)

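The relationships among these types can be summarized as follows.

  • String ([Char]) and strict Text: pack (String to Text), unpack (Text to String)
  • String ([Char]) and lazy Text: pack (String to Text), unpack (Text to String)
  • strict Text and strict ByteString: encode (Text to ByteString), decode (ByteString to Text)
  • lazy Text and lazy ByteString: encode (Text to ByteString), decode (ByteString to Text)
  • lazy Text to strict Text: toChunks, toStrict; strict Text to lazy Text: fromChunks, fromStrict
  • lazy ByteString to strict ByteString: toChunks, toStrict; strict ByteString to lazy ByteString: fromChunks, fromStrict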
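
The conversions above can be sketched as one round trip. This is a minimal sketch, not the only way to do it: it assumes that "encode" and "decode" stand for encodeUtf8 and decodeUtf8 from Data.Text.Encoding and Data.Text.Lazy.Encoding (the notes do not name an encoding), and the helper name roundTrip is made up for illustration.

```haskell
import qualified Data.ByteString         as S
import qualified Data.ByteString.Lazy    as B
import qualified Data.Text               as T
import qualified Data.Text.Encoding      as TE
import qualified Data.Text.Lazy          as L
import qualified Data.Text.Lazy.Encoding as LE

-- Hypothetical helper: String -> strict Text -> strict ByteString
-- -> lazy ByteString -> lazy Text -> String.
roundTrip :: String -> String
roundTrip str = L.unpack lazyText
  where
    strictText :: T.Text
    strictText = T.pack str                  -- String -> strict Text

    strictBytes :: S.ByteString
    strictBytes = TE.encodeUtf8 strictText   -- assumption: "encode" means UTF-8

    lazyBytes :: B.ByteString
    lazyBytes = B.fromStrict strictBytes     -- strict -> lazy ByteString

    lazyText :: L.Text
    lazyText = LE.decodeUtf8 lazyBytes       -- assumption: "decode" means UTF-8

main :: IO ()
main = putStrLn (roundTrip "Haskell")
```

Note that decodeUtf8 throws an exception on invalid UTF-8 input; the bytes here are valid because they were produced by encodeUtf8.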

1. ByteString

bytestring: Fast, compact, strict and lazy byte strings with a list interface - Hackage

Strict ByteString - Data.ByteString

  • A strict ByteString is an array of bytes.
  • It is not lazy at all.
  • Evaluating any part of it evaluates the whole array.

Lazy ByteString - Data.ByteString.Lazy

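  • A lazy ByteString is evaluated lazily.
  • In that respect it resembles a list.
  • The data is stored in chunks; the bytestring documentation gives 64 KB as the default chunk size.
  • It is evaluated chunk by chunk, so it can be processed with efficient memory use.
  • A lazy ByteString is essentially a list of strict ByteString chunks.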
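
The chunked, lazy behaviour described above can be illustrated with a small sketch. The names endless, firstFive and chunkLengths are made up for this example; B.cycle, B.take, B.fromChunks and B.toChunks are functions from Data.ByteString.Lazy.

```haskell
import           Data.Word            (Word8)
import qualified Data.ByteString      as S
import qualified Data.ByteString.Lazy as B

-- An infinite lazy ByteString: the bytes of "abc" repeated forever.
endless :: B.ByteString
endless = B.cycle (B.pack [97, 98, 99])

-- Taking a prefix only forces as much of the value as is needed,
-- so this terminates even though `endless` has no end.
firstFive :: [Word8]
firstFive = B.unpack (B.take 5 endless)

-- A lazy ByteString built from strict chunks keeps those chunk boundaries.
chunkLengths :: [Int]
chunkLengths = map S.length (B.toChunks (B.fromChunks [S.pack [1, 2], S.pack [3, 4, 5]]))

main :: IO ()
main = do
  print firstFive      -- [97,98,99,97,98]
  print chunkLengths   -- [2,3]
```

An infinite value like endless cannot be expressed as a strict ByteString, because a strict ByteString must be fully in memory.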

Word8
8-bit unsigned integer type, 0 - 255
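
As a small sketch of how Word8 relates to characters, the following converts between Char and Word8 by hand. The helper names charToByte and byteToChar are hypothetical; they are not part of the bytestring API.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: the byte value of a character.
-- Only meaningful for code points 0-255; larger values wrap modulo 256.
charToByte :: Char -> Word8
charToByte = fromIntegral . ord

-- Hypothetical helper: the character whose code point equals the byte value.
byteToChar :: Word8 -> Char
byteToChar = chr . fromIntegral

main :: IO ()
main = do
  print (charToByte 'a')       -- 97
  print (byteToChar 98)        -- 'b'
  print (maxBound :: Word8)    -- 255
  print ((255 :: Word8) + 1)   -- 0, because Word8 arithmetic wraps around
```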

import Data.Word (Word8)
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as B

S.pack :: [Word8] -> S.ByteString
B.pack :: [Word8] -> B.ByteString

S.unpack :: S.ByteString -> [Word8]
B.unpack :: B.ByteString -> [Word8]

B.fromChunks :: [S.ByteString] -> B.ByteString  -- list of strict ByteStrings => lazy ByteString
B.toChunks :: B.ByteString -> [S.ByteString]    -- lazy ByteString => list of strict ByteStrings
B.fromStrict :: S.ByteString -> B.ByteString    -- Convert a strict ByteString into a lazy ByteString.
B.toStrict :: B.ByteString -> S.ByteString      -- Convert a lazy ByteString into a strict ByteString.

Prelude> import qualified Data.ByteString as S
Prelude S> import qualified Data.ByteString.Lazy as B

Prelude S B> let x = S.pack [97,98,99]
Prelude S B> x
"abc"
Prelude S B> S.unpack x
[97,98,99]

Prelude S B> let y = B.pack [97,98,99]
Prelude S B> y
"abc"
Prelude S B> B.unpack y
[97,98,99]

Prelude S B> B.toChunks y
["abc"]

Prelude S B> B.fromChunks [S.pack[97,98,99], S.pack[100,101,102], S.pack[103,104,105]]
"abcdefghi"

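The session above does not exercise toStrict and fromStrict, so here is a minimal sketch of them as a standalone program. The top-level names chunks, lazyBytes and strictBytes are made up for illustration.

```haskell
import qualified Data.ByteString      as S
import qualified Data.ByteString.Lazy as B

-- Three strict chunks, the same bytes as in the GHCi session.
chunks :: [S.ByteString]
chunks = [S.pack [97, 98, 99], S.pack [100, 101, 102], S.pack [103, 104, 105]]

-- A lazy ByteString that keeps the three chunks separate.
lazyBytes :: B.ByteString
lazyBytes = B.fromChunks chunks

-- toStrict copies all chunks into one strict ByteString.
strictBytes :: S.ByteString
strictBytes = B.toStrict lazyBytes

main :: IO ()
main = do
  print (length (B.toChunks lazyBytes))                   -- 3
  print strictBytes                                       -- "abcdefghi"
  print (length (B.toChunks (B.fromStrict strictBytes)))  -- 1
  print (B.fromStrict strictBytes == lazyBytes)           -- True
```

Equality on lazy ByteStrings compares the bytes, not the chunk boundaries, which is why the last line prints True.
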
2. Text

text: An efficient packed Unicode text type. - Hackage
Text is an implementation of Unicode text.

import Data.Int (Int64)
import qualified Data.Text as T
import qualified Data.Text.Lazy as L

T.pack :: String -> T.Text
L.pack :: String -> L.Text

T.unpack :: T.Text -> String
L.unpack :: L.Text -> String

T.length :: T.Text -> Int
L.length :: L.Text -> Int64

T.map :: (Char -> Char) -> T.Text -> T.Text
L.map :: (Char -> Char) -> L.Text -> L.Text

T.intercalate :: T.Text -> [T.Text] -> T.Text
L.intercalate :: L.Text -> [L.Text] -> L.Text

L.fromChunks :: [T.Text] -> L.Text  -- list of strict Texts => lazy Text
L.toChunks :: L.Text -> [T.Text]    -- lazy Text => list of strict Texts
L.fromStrict :: T.Text -> L.Text    -- Convert a strict Text into a lazy Text.
L.toStrict :: L.Text -> T.Text      -- Convert a lazy Text into a strict Text.

Prelude> import qualified Data.Text as T

Prelude T> let message = T.pack "I am not angry. Not at all."
Prelude T> T.map (\c -> if c == '.' then '!' else c) message
"I am not angry! Not at all!"

Prelude T> T.length (T.pack "あいうえおabc")
8

Prelude T> T.intercalate (T.pack "-") (map T.pack ["2020","04","08"])
"2020-04-08"

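Because Text counts characters rather than bytes, its length differs from the length of the encoded ByteString. The sketch below shows this, together with the strict/lazy Text conversions. It assumes UTF-8 encoding via encodeUtf8 from Data.Text.Encoding, and the names sample, charCount, utf8ByteCount and joined are made up for illustration.

```haskell
import qualified Data.ByteString    as S
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy     as L

sample :: T.Text
sample = T.pack "あいうえおabc"

-- Number of Unicode characters: 5 hiragana + 3 ASCII letters.
charCount :: Int
charCount = T.length sample

-- Number of bytes after UTF-8 encoding: each hiragana takes 3 bytes,
-- each ASCII letter takes 1 byte.
utf8ByteCount :: Int
utf8ByteCount = S.length (TE.encodeUtf8 sample)

-- A lazy Text assembled from strict chunks.
joined :: L.Text
joined = L.fromChunks (map T.pack ["2020", "-", "04"])

main :: IO ()
main = do
  print charCount                                     -- 8
  print utf8ByteCount                                 -- 18
  print (L.toStrict joined == T.pack "2020-04")       -- True
  print (L.toStrict (L.fromStrict sample) == sample)  -- True
```
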
With the GHC extension OverloadedStrings, string literals can be used directly as Text or ByteString values. For example:

Prelude T> :set -XOverloadedStrings
Prelude T> T.intercalate "-" ["2020","04","08"]
"2020-04-08"

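In a source file, the same extension is enabled with a LANGUAGE pragma instead of :set. The following is a minimal sketch; the names date and greeting are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as S
import qualified Data.Text       as T

-- With OverloadedStrings, each literal takes the type required by its
-- context through the IsString class, so no T.pack is needed here.
date :: T.Text
date = T.intercalate "-" ["2020", "04", "08"]

-- The same literal syntax also works for ByteString. Its IsString instance
-- truncates each character to 8 bits, so it is safest to stick to ASCII.
greeting :: S.ByteString
greeting = "abc"

main :: IO ()
main = do
  print date                 -- "2020-04-08"
  print (S.unpack greeting)  -- [97,98,99]
```
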