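Things You Are Expected to Know, Part 06: Functions That Operate on Characters and Strings
1. Character classes: isxxx()
The examples below use the dot (broadcast) syntax isxxx.() to test several characters at once, but the predicates are often called on a single character, as in isxxx('a'). The trailing ' (adjoint) only lays the result out as a row.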
ch = ['0', 'a', 'A', 'α', 'Γ', '\n', ';', ' ']
isdigit.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
1 0 0 0 0 0 0 0
isletter.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
0 1 1 1 1 0 0 0
iscntrl.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
0 0 0 0 0 1 0 0
isascii.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
1 1 1 0 0 1 1 1
islowercase.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
0 1 0 1 0 0 0 0
isuppercase.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
0 0 1 0 1 0 0 0
isnumeric.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
1 0 0 0 0 0 0 0
isprint.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
1 1 1 1 1 0 1 1
ispunct.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
0 0 0 0 0 0 1 0
isspace.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
0 0 0 0 0 1 0 1
isxdigit.(ch)'
1×8 adjoint(::BitVector) with eltype Bool:
1 1 1 0 0 0 0 0
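As noted at the top of this section, the predicates are often applied to one character at a time. A minimal sketch of that use, passing them to higher-order functions; the string s and the variable names are illustrative assumptions, not part of the original examples:

```julia
# Illustrative input; any string would do.
s = "Julia 1.10, αβγ!"

digits_only = filter(isdigit, s)     # keep only the decimal digits -> "110"
n_letters   = count(isletter, s)     # letters, including the Greek ones -> 8
all_ascii   = all(isascii, s)        # false, because α, β, γ are not ASCII
first_space = findfirst(isspace, s)  # index of the first whitespace -> 6
```

Each call hands the predicate one Char at a time, so no dot is needed.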
2. Equality: isequal(), ==
isequal("a", "A"), isequal("a")("a"), "abc" == "abc"
(false, true, true)
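The one-argument form isequal("a") returns a function, which is mostly useful as a predicate. A small sketch, assuming an illustrative vector v:

```julia
v = ["a", "b", "c", "b"]           # illustrative data

positions = findall(isequal("b"), v)  # indices of the elements equal to "b" -> [2, 4]
n_b       = count(==("b"), v)         # == also has a one-argument form -> 2
without_b = filter(!=("b"), v)        # everything except "b" -> ["a", "c"]
```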
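3. Lexicographic comparison: cmp(), isless()
cmp("abc", "ab") # 1: the first string sorts after the second
cmp("abc", "abc") # 0: the two strings are equal
cmp("abc", "abcd") # -1: the first string sorts before the second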
-1
isless("a", "b"), isless("xyz", "abc") # true when the first string sorts before the second
(true, false)
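sort() orders strings with isless by default, so the comparison is by code point and uppercase ASCII letters come before lowercase ones. A minimal sketch with an illustrative word list:

```julia
words = ["banana", "apple", "Cherry"]        # illustrative data

by_codepoint  = sort(words)                  # 'C' < 'a', so "Cherry" comes first
ignoring_case = sort(words, by = lowercase)  # compare lowercased copies instead
order         = cmp("Cherry", "apple")       # -1, consistent with the default sort
```

Passing by = lowercase is one simple way to get a case-insensitive order; it is a sketch, not a full locale-aware collation.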
4. Converting a number (code point) to a character: Char()
Char(65), Char(0x672c), Char(26412)
('A', '本', '本')
5. Converting a character to a number (code point): Int()
Int('A'), Int('本'), string(Int('本'), base=16), Float64('A')
(65, 26412, "672c", 65.0)
'A' + 5, 'A' + 32, 'a' - 'A'
('F', 'a', 32)
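Character arithmetic like the above is handy for small conversions. A brief sketch; the inputs are chosen only for illustration:

```julia
digit_value = '7' - '0'               # distance between code points -> 7
shifted     = map(c -> c + 1, "HAL")  # shift every character by one -> "IBM"
alphabet    = join('a':'e')           # Char ranges work too -> "abcde"
```

map over a string returns a String as long as the function returns a character.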
6. Converting a number to a string: string(), String()
string(123), string(123.456)
("123", "123.456")
String(0x41:0x5A) # String() builds a string from UInt8 code units (here 'A' to 'Z')
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
7. Converting a string to a number: parse()
parse(Int64, "12345") # parse the string as a decimal integer
12345
parse(Int64, "12345", base=16) # parse the string as a hexadecimal integer
74565
parse(Float64, "123.45") # convert the string to a floating-point number
123.45
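parse() throws an error when the string is not a valid number. When the input may be malformed, tryparse() returns nothing instead. A minimal sketch with made-up inputs:

```julia
ok       = tryparse(Int, "42")        # valid input -> 42
bad      = tryparse(Int, "12a")       # invalid input -> nothing, no exception
x        = tryparse(Float64, "1e3")   # scientific notation is accepted -> 1000.0
fallback = something(tryparse(Int, "n/a"), 0)  # substitute 0 when parsing fails
```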
8. Splitting an integer into its digits as Char: Char[...]
Char[string(123)...]
3-element Vector{Char}:
'1': ASCII/Unicode U+0031 (category Nd: Number, decimal digit)
'2': ASCII/Unicode U+0032 (category Nd: Number, decimal digit)
'3': ASCII/Unicode U+0033 (category Nd: Number, decimal digit)
(Int, [string(123)...]) # just a tuple of the type Int and a Vector{Char}; no conversion takes place
(Int64, ['1', '2', '3'])
9. Splitting an integer into its digits as String: split()
split(string(123), "")
3-element Vector{SubString{String}}:
"1"
"2"
"3"
10. Converting each digit of a string to an integer: parse()
s = "0123456789";
parse.(Int,[s...])'
1×10 adjoint(::Vector{Int64}) with eltype Int64:
0 1 2 3 4 5 6 7 8 9
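Two alternatives to parse.() that give the same digits are sketched below: character arithmetic, and digits() when starting from an integer. Variable names are illustrative:

```julia
s = "0123456789"

by_arith  = [c - '0' for c in s]   # code point distance from '0' -> 0, 1, ..., 9
digit_sum = sum(by_arith)          # 45
from_int  = reverse(digits(123))   # digits() is least significant first, so reverse -> [1, 2, 3]
```

The arithmetic form assumes every character is an ASCII digit; parse() is the safer choice when that is not guaranteed.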
11. Reversing a string: reverse()
reverse("abxdef"), reverse("able was i ere i saw elba")
("fedxba", "able was i ere i saw elba")
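The second example is a palindrome, which suggests a one-line check. ispalindrome is a hypothetical helper name, not a Base function:

```julia
# Hypothetical helper: a string is a palindrome if it equals its own reverse.
ispalindrome(s) = s == reverse(s)

yes = ispalindrome("able was i ere i saw elba")  # true
no  = ispalindrome("abxdef")                     # false
```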
12. Reversing a vector of strings: reverse()
b = ["1", "2", "3", "4", "5", "6", "7", "8"]
reverse(b, 3)
8-element Vector{String}:
"1"
"2"
"8"
"7"
"6"
"5"
"4"
"3"
reverse(b, 3, 6)
8-element Vector{String}:
"1"
"2"
"6"
"5"
"4"
"3"
"7"
"8"
13. Splitting a string into individual characters
c = Char["abc日本語123"...]
9-element Vector{Char}:
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
'日': Unicode U+65E5 (category Lo: Letter, other)
'本': Unicode U+672C (category Lo: Letter, other)
'語': Unicode U+8A9E (category Lo: Letter, other)
'1': ASCII/Unicode U+0031 (category Nd: Number, decimal digit)
'2': ASCII/Unicode U+0032 (category Nd: Number, decimal digit)
'3': ASCII/Unicode U+0033 (category Nd: Number, decimal digit)
14. Joining characters into a string: join()
join(c), join(c, "-")
("abc日本語123", "a-b-c-日-本-語-1-2-3")
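join() also accepts a third argument that is used as the separator before the last item. A brief sketch with illustrative values:

```julia
chars     = collect("abc")                       # ['a', 'b', 'c']
roundtrip = join(chars)                          # back to "abc"
listing   = join(["a", "b", "c"], ", ", " and ") # "a, b and c"
```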
15. Extracting part of a string (slicing) with :
15.1. A single string
str = "abcdefg";
str[2]
'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
str[2:2]
"b"
str[2:3]
"bc"
str[1:2:end]
"aceg"
str[begin+2:end-1]
"cdef"
str[end-2:end]
"efg"
str[end:-1:begin]
"gfedcba"
reverse(str)
"gfedcba"
str[end:-2:begin]
"geca"
15.2. A vector of strings
a = ["boo", "bar", "baz", "xxx", "yyy", "zzz"]
6-element Vector{String}:
"boo"
"bar"
"baz"
"xxx"
"yyy"
"zzz"
a[begin]
"boo"
a[end]
"zzz"
a[begin+1]
"bar"
a[begin:2:end]
3-element Vector{String}:
"boo"
"baz"
"yyy"
a[end:-2:begin]
3-element Vector{String}:
"zzz"
"xxx"
"bar"
a[end:begin]
String[]
reverse(a)
6-element Vector{String}:
"zzz"
"yyy"
"xxx"
"baz"
"bar"
"boo"
SubString("abcdefg", 2, 3)
"bc"
SubString("abcdefg", 2:4)
"bcd"
SubString("abcdefg", 2)
"bcdefg"
16. Changing case: lowercase(), uppercase(), titlecase()
lowercase("ABC 123")
"abc 123"
uppercase("qwerty 678")
"QWERTY 678"
titlecase("the JULIA programming language")
"The Julia Programming Language"
17. Concatenating strings: *, string(), join(), "...$var..."
"abcde" * "12345"
"abcde12345"
["a", "b"] .* ["1", "2"]
2-element Vector{String}:
"a1"
"b2"
*("123", "ABC", " ", "XYZ")
"123ABC XYZ"
"X" .* string.(1:5)
5-element Vector{String}:
"X1"
"X2"
"X3"
"X4"
"X5"
greet = "Hello";
whom = "world";
string(greet, ", ", whom, ".\n")
"Hello, world.\n"
join([greet, ", ", whom, ".\n"])
"Hello, world.\n"
greet * ", " * whom * ".\n"
"Hello, world.\n"
*(greet, ", ", whom, ".\n")
"Hello, world.\n"
"$greet, $whom.\n"
"Hello, world.\n"
18. Repeating a string: ^
"abc." ^ 5
"abc.abc.abc.abc.abc."
19. Checking whether a substring exists
findnext("Lang", "JuliaLang", 1) # 6:9
6:9
isnothing(findnext("Lang", "JuliaLang", 1)) # false
false
findnext("Python", "JuliaLang", 1) # nothing
isnothing(findnext("Python", "JuliaLang", 1)) # true
true
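When only a yes/no answer is needed, occursin() and its relatives return a Bool directly, without the isnothing() step. A minimal sketch on the same string:

```julia
has_lang   = occursin("Lang", "JuliaLang")      # true
has_python = occursin("Python", "JuliaLang")    # false
at_start   = startswith("JuliaLang", "Julia")   # true
at_end     = endswith("JuliaLang", "Lang")      # true
where_     = findfirst("Lang", "JuliaLang")     # same as findnext(..., 1) -> 6:9
```

Note the argument order: occursin takes the needle first, like findnext.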
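20. Appending strings: append!()
Appending "12345" and appending ["12345"] give completely different results, because append!() adds every element of the collection it is given and iterating over a string yields its characters.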
b = []
append!(b, "12345") # appends 5 Chars
b
5-element Vector{Any}:
'1': ASCII/Unicode U+0031 (category Nd: Number, decimal digit)
'2': ASCII/Unicode U+0032 (category Nd: Number, decimal digit)
'3': ASCII/Unicode U+0033 (category Nd: Number, decimal digit)
'4': ASCII/Unicode U+0034 (category Nd: Number, decimal digit)
'5': ASCII/Unicode U+0035 (category Nd: Number, decimal digit)
b = []
append!(b, ["12345"]) # appends 1 String
b
1-element Vector{Any}:
"12345"
b = []
append!(b, ["abc", "xyz"]) # appends 2 Strings
b
2-element Vector{Any}:
"abc"
"xyz"