
Trying Out Julia 2 (Using Python, and Strings)

Posted at 2018-08-16, last updated at 2018-08-16

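Using Python from Julia

This is a continuation of "Trying Out Julia 1 (From Installation to Strings, Integers, Floating Point, Imaginary Numbers, etc.)".

Setting up PyCall

Let's try PyCall, a package that makes Python usable from Julia.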

julia> using Pkg

julia> Pkg.add("PyCall")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
 Installed Conda ────────── v1.0.1
 Installed JSON ─────────── v0.19.0
 Installed MacroTools ───── v0.4.4
 Installed VersionParsing ─ v1.1.2
 Installed Compat ───────── v1.0.1
 Installed PyCall ───────── v1.17.1
  Updating `~/.julia/environments/v1.0/Project.toml`
  [438e738f] + PyCall v1.17.1
  Updating `~/.julia/environments/v1.0/Manifest.toml`
  [34da2185] + Compat v1.0.1
  [8f4d0f93] + Conda v1.0.1
  [682c06a0] + JSON v0.19.0
  [1914dd2f] + MacroTools v0.4.4
  [438e738f] + PyCall v1.17.1
  [81def892] + VersionParsing v1.1.2
  [2a0f44e3] + Base64 
  [ade2ca70] + Dates 
  [8bb1440f] + DelimitedFiles 
  [8ba89e20] + Distributed 
  [b77e0a4c] + InteractiveUtils 
  [76f85450] + LibGit2 
  [8f399da3] + Libdl 
  [37e2e46d] + LinearAlgebra 
  [56ddb016] + Logging 
  [d6f4376e] + Markdown 
  [a63ad114] + Mmap 
  [44cfe95a] + Pkg 
  [de0858da] + Printf 
  [3fa0cd96] + REPL 
  [9a3f8284] + Random 
  [ea8e919c] + SHA 
  [9e88b42a] + Serialization 
  [1a1011a3] + SharedArrays 
  [6462fe0b] + Sockets 
  [2f01184e] + SparseArrays 
  [10745b16] + Statistics 
  [8dfed614] + Test 
  [cf7118a7] + UUIDs 
  [4ec0a83e] + Unicode 
  Building Conda ─→ `~/.julia/packages/Conda/m7vem/deps/build.log`
  Building PyCall → `~/.julia/packages/PyCall/fiJ3o/deps/build.log`

The build step sets up a Python environment using conda.

Trying PyCall

Check the version of the Python that is running.

julia> using PyCall

julia> py"""
       import sys
       print(sys.version)"""
3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:44:39) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

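I installed PyCall in order to compare against Python, so from here on, wherever possible, I use py to show how Julia and Python 3 handle the same task.

Julia strings, revisited

Declare a string and check its type.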

julia> a = "ABCDEF"
"ABCDEF"

julia> typeof(a)
String

It serves no particular purpose, but you can also state the type explicitly (this is a type assertion on the value):

julia> a = "ABCDEF"::String
"ABCDEF"

Working with substrings

You can get the character at a given position in a string, but indexing is 1-based, so indices start at 1.

julia> a[0]
ERROR: BoundsError: attempt to access "ABCDEF"
  at index [0]
Stacktrace:
 [1] checkbounds at ./strings/basic.jl:193 [inlined]
 [2] codeunit at ./strings/string.jl:87 [inlined]
 [3] getindex(::String, ::Int64) at ./strings/string.jl:206
 [4] top-level scope at none:0

julia> a[1]
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> py""" # In Python
       a = "ABCDEF"
       print(a[0])
       """
A

You can also get a substring using a slice.
The range is written [start:end], and both ends are inclusive.

julia> a[1:2]
"AB"

julia> py""" # In Python
       a = "ABCDEF"
       print(a[0:2])
       """
AB
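
The two conventions above differ in both the starting index and whether the end is included. As a minimal sketch of the relationship, the helper below (julia_range_to_slice is a hypothetical name, not part of PyCall or either language) maps a 1-based, end-inclusive range onto a Python slice. It assumes an ASCII-only string, where one index corresponds to one character on both sides.

```python
def julia_range_to_slice(start, stop):
    """Map a 1-based, end-inclusive range (Julia style) to a Python slice."""
    if start < 1:
        raise IndexError("1-based ranges start at 1 or later")
    # Shift the start down by one; the inclusive stop equals Python's exclusive stop.
    return slice(start - 1, stop)


a = "ABCDEF"
print(a[julia_range_to_slice(1, 2)])  # same characters as Julia's a[1:2]
print(a[julia_range_to_slice(4, 6)])  # same characters as Julia's a[4:6]
```

Only the start shifts: an inclusive end in 1-based indexing and an exclusive end in 0-based indexing happen to be the same number.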

Unlike Python, you cannot omit the start position to mean the beginning of the string, nor omit the end position.

julia> py""" # In Python
       a = "ABCDEF"
       print(a[:2])
       print(a[3:])
       print(a[:])
       """
AB
DEF
ABCDEF

julia> a[:2]
'B': ASCII/Unicode U+0042 (category Lu: Letter, uppercase)

julia> a[1:2]
"AB"

julia> a[4:]
ERROR: syntax: missing last argument in "4:" range expression 

julia> a[4:length(a)]
"DEF"

julia> a[:]
"ABCDEF"

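Omitting both the start and the end position does work; a[:] is simply a.
Note that a[:2] above is not an error either, but it is not a slice: :2 is parsed as the literal 2, so a[:2] is the same as a[2].

Reversing a string with a slice
In this case the range is written [start:step:end].
Note that this differs from Python, where the order is [start:stop:step].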

julia> a[length(a):-1:1]
"FEDCBA"

julia> reverse(a)
"FEDCBA"

julia> py""" # In Python
       a = "ABCDEF"
       print(a[::-1])
       """
FEDCBA

(The reverse function also returns the string in reverse order.)
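
To make the start:step:end ordering concrete, here is a small sketch that lists the 1-based indices such a range visits and uses them to reverse the string. The helper julia_step_range is a hypothetical name introduced for illustration, and the example again assumes an ASCII-only string.

```python
def julia_step_range(start, step, stop):
    """Return the 1-based indices visited by start:step:stop (stop inclusive)."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Python's range() excludes its end, so move the end one past stop
    # in the direction of travel.
    end = stop + (1 if step > 0 else -1)
    return list(range(start, end, step))


a = "ABCDEF"
indices = julia_step_range(len(a), -1, 1)
reversed_a = "".join(a[i - 1] for i in indices)  # i - 1 converts to 0-based
print(indices)
print(reversed_a)
```

The step sits in the middle in the Julia form and at the end in the Python form, which is easy to mix up when moving between the two.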

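The for block

This is abrupt, but the string processing below needs it, so let's look at the for block here.
Blocks in Julia must always be closed with end.
A bit like Pascal, perhaps?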

julia> for i in 1:10
           println(i)
       end
1
2
... # (middle omitted)
9
10

julia> py""" # In Python
       for i in range(1, 11):
           print(i)
       """
1
2
... # (middle omitted)
9
10

Using a string in a for loop

Try looping over a string.

julia> for c in a
           println(c)
       end
A
B
C
D
E
F

julia> py""" # In Python
       for c in a:
           print(c)
       """
A
B
C
D
E
F

Here, just as in Python, the string appears to be treated as a sequence of characters and processed one character at a time.

julia> for c in a
           println(typeof(c))
       end
Char
... # (middle omitted)
Char

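This shows that c is handled as a Char here.

Another use of `end`

end closes a block, but as a keyword it can also be used inside an index range.
You can write end instead of something like length(a). Strictly speaking, end stands for the last valid index, lastindex(a), which matches length(a) only for ASCII-only strings like this one.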

julia> a[3:end]
"CDEF"
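
Julia's String is indexed by UTF-8 code units (bytes), so the last valid index and the character count can differ once non-ASCII characters are involved. The sketch below illustrates the idea from the Python side by computing the 1-based byte offset at which each character starts. utf8_char_starts is a hypothetical helper written for this illustration, not an API of either language.

```python
def utf8_char_starts(s):
    """Return the 1-based UTF-8 byte offset at which each character of s starts."""
    starts = []
    offset = 1
    for ch in s:
        starts.append(offset)
        offset += len(ch.encode("utf-8"))  # width of this character in bytes
    return starts


for s in ["ABCDEF", "あいう"]:
    starts = utf8_char_starts(s)
    print(s, "characters:", len(s), "last start offset:", starts[-1])
```

For the ASCII string the last offset equals the character count, while for the three-character Japanese string it does not, which is why end is the safer way to refer to the last position.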

Handling line endings

When a line ends with newline characters or other trailing whitespace, the rstrip function can remove them.

julia> b = "ABCDEF\r\n"
"ABCDEF\r\n"

julia> rstrip(b)
"ABCDEF"

julia> length(b)
8

julia> length(rstrip(b))
6

julia> py""" # In Python
       b = "ABCDEF\r\n"
       print(b.rstrip())
       print(len(b.rstrip()))"""
ABCDEF
6
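
One thing worth keeping in mind on the Python side: rstrip() with no argument removes all trailing whitespace, not only the line ending. The following sketch uses a hypothetical input with a space before the line ending to show the difference from passing the characters explicitly.

```python
b = "ABCDEF \r\n"  # hypothetical input: a trailing space before the line ending

stripped_all = b.rstrip()        # removes every trailing whitespace character
stripped_eol = b.rstrip("\r\n")  # removes only trailing \r and \n characters

print(repr(stripped_all))
print(repr(stripped_eol))
```

If trailing spaces are meaningful in the data, the second form is the one that preserves them.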

Splitting a string on a delimiter

Use the split function to split a string.

julia> c = "This is a smaple sentence."
"This is a smaple sentence."

julia> split(c)
5-element Array{SubString{String},1}:
 "This"     
 "is"       
 "a"        
 "smaple"   
 "sentence."

julia> py""" # In Python
       c = "This is a smaple sentence."
       print(c.split())
       """
['This', 'is', 'a', 'smaple', 'sentence.']

Joining an array of strings

Use the function join(array_of_strings, delimiter).

julia> d = split(c)
5-element Array{SubString{String},1}:
 "This"     
 "is"       
 "a"        
 "smaple"   
 "sentence."

julia> join(d, " ")
"This is a smaple sentence."

julia> py""" # In Python
       c = "This is a smaple sentence."
       d = c.split()
       print(d)
       print(" ".join(d))
       """
['This', 'is', 'a', 'smaple', 'sentence.']
This is a smaple sentence.
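
The round trip above reproduces the original sentence because the words are separated by single spaces. As a sketch of where that stops being true in Python, the example below uses a hypothetical input containing a double space and a tab, and compares split() with split(" ").

```python
text = "This  is a\tsample"  # hypothetical input with a double space and a tab

words = text.split()      # splits on runs of whitespace, drops empty pieces
pieces = text.split(" ")  # splits on each single space, keeps empty pieces

print(words)
print(pieces)
print(" ".join(words) == text)   # whitespace was normalized, so not equal
print(" ".join(pieces) == text)  # same delimiter both ways, so equal
```

Joining with the same delimiter that was used for splitting is what guarantees the original string comes back.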

Variable interpolation

You can build a string by embedding variables in it.
The Python version uses an f-string.

julia> println("a: $a, b:$b, c:$c")
a: ABCDEF, b:ABCDEF
, c:This is a smaple sentence.

julia> py""" # In Python
       a = "ABCDEF"
       b = "ABCDEF\r\n"
       c = "This is a smaple sentence."
       print(f"a: {a}, b:{b}, c:{c}")
       """
a: ABCDEF, b:ABCDEF
, c:This is a smaple sentence.
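
The output breaks onto a second line because b still ends with \r\n. When debugging interpolated values like this in Python, one option is the !r conversion, which applies repr() and makes such characters visible. A minimal sketch:

```python
a = "ABCDEF"
b = "ABCDEF\r\n"

plain = f"a: {a}, b:{b}"        # the embedded \r\n ends up in the result
escaped = f"a: {a!r}, b:{b!r}"  # !r applies repr(), so \r\n is shown escaped

print(escaped)
```

The escaped form stays on one line and shows exactly which control characters each variable carries.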

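Checking whether a string contains a given substring

Use whichever of findfirst, findnext, and findlast fits the need.
When there is no match, the result is nothing (the only value of type Nothing).
Comparing the result against nothing gives true or false.
if only accepts a Bool as its condition, so being able to get true or false will matter later.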

julia> findfirst("AB", a)  # when the substring exists
1:2

julia> typeof(findfirst("AB", a))
UnitRange{Int64}

julia> findfirst("AB", a) !== nothing
true

julia> findfirst("Z", a)  # when the substring does not exist

julia> typeof(findfirst("Z", a))
Nothing

julia> findfirst("Z", a) !== nothing
false

julia> py""" # In Python
       a = "ABCDEF"
       print("AB" in a)
       print("Z" in a)
       """
True
False
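
Python's in operator only answers yes or no, whereas findfirst also reports where the match is. As a rough sketch of the same "position or nothing" pattern in Python, the hypothetical helper find_first below wraps str.find and returns a 1-based inclusive (start, end) pair, or None when there is no match. It assumes ASCII-only strings so that the positions line up with the Julia ones.

```python
def find_first(needle, haystack):
    """Return the 1-based inclusive (start, end) of the first match, or None."""
    i = haystack.find(needle)  # 0-based position, or -1 when absent
    if i == -1:
        return None
    return (i + 1, i + len(needle))


a = "ABCDEF"
print(find_first("AB", a))
print(find_first("Z", a))
print(find_first("AB", a) is not None)  # a plain bool, usable in an if
print(find_first("Z", a) is not None)
```

As in the Julia version, the comparison against the "no match" value is what turns the result into a boolean.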

Generating repeated strings

There are two ways: the repeat function and the ^ operator.
Why are there two?
It turns out ^ simply calls repeat internally.

julia> repeat("Wow!", 10)
"Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!"

julia> "Wow!"^10
"Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!"

julia> ^("Wow!", 10)
"Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!"

julia> py""" # In Python
       print("Wow!"*10)
       """
Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!Wow!

Many other functions are listed in the Strings section of the Julia manual.

That's all for this time.
