
Handling commonly seen strings

Programming

Posted at 2020-01-06

Strings you often run into while programming, and how to deal with them.

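A string that ends with = is probably Base64-encoded (the = is padding), so try decoding it. With GNU base64 the decode flag is -d; the macOS version also accepts -D.

$ echo aGVsbG8sIHdvcmxkCg== | base64 -d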
hello, world
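The "ends with =" rule can be tightened a little before decoding. The following is a minimal sketch, not a definitive check: looks_like_base64 and decode_if_base64 are hypothetical helper names, and the test only covers the standard alphabet, a length divisible by 4, and at most two trailing = characters.

```shell
# looks_like_base64: hypothetical helper. Succeeds if $1 uses only the
# standard Base64 alphabet, has a length divisible by 4, and carries at
# most two trailing "=" padding characters.
looks_like_base64() {
  s=$1
  [ -n "$s" ] || return 1
  [ $(( ${#s} % 4 )) -eq 0 ] || return 1
  body=${s%=}                  # strip one padding character, if present
  body=${body%=}               # strip a second one, if present
  case $body in
    ''|*[!A-Za-z0-9+/]*) return 1 ;;   # empty, or a character outside the alphabet
  esac
  return 0
}

# decode_if_base64: hypothetical helper. Decodes $1 only when it passes
# the check above.
decode_if_base64() {
  if looks_like_base64 "$1"; then
    printf '%s' "$1" | base64 -d       # on older macOS, use -D instead
  else
    echo "does not look like Base64" >&2
    return 1
  fi
}
```

Calling decode_if_base64 aGVsbG8sIHdvcmxkCg== prints hello, world. The check is only a heuristic: an ordinary four-letter word also passes it, so treat a positive result as a hint rather than proof.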

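A string that starts with ey is probably Base64-encoded JSON, because the opening {" of a JSON object encodes to ey. This shows up a lot in JWTs and the like.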

$ echo eyJuYW1lIjogImdyb2hpcm8hIn0K | base64 -d | jq .
{
  "name": "grohiro!"
}
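JWT segments use the URL-safe Base64 variant (base64url), which swaps + and / for - and _ and usually drops the = padding, so feeding them straight to base64 -d can fail. A rough sketch of a workaround, where decode_base64url is a hypothetical helper name:

```shell
# decode_base64url: hypothetical helper. Decodes one base64url segment
# (for example the payload part of a JWT) passed as $1.
decode_base64url() {
  # Map the URL-safe alphabet back to the standard one: _ -> / and - -> +
  seg=$(printf '%s' "$1" | tr '_-' '/+')
  # Restore the padding that base64url normally omits.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
    1) echo "invalid base64url length" >&2; return 1 ;;
  esac
  printf '%s' "$seg" | base64 -d       # on older macOS, use -D instead
}
```

For a token held in a (hypothetical) variable $token, the payload is the second dot-separated field, so something like decode_base64url "$(printf '%s' "$token" | cut -d. -f2)" | jq . should show it. This only decodes the segment; it does not verify the signature.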

When a string looks like hexadecimal ([0-9a-f]), try converting it to binary with xxd.

$ echo 68656c6c6f2c20776f726c64210a | xxd -r -p > hatena.txt
$ cat hatena.txt
hello, world!
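On a machine without xxd, the same conversion can be approximated in plain shell. This is only a sketch under that assumption: hex_decode is a hypothetical helper name, and it rejects anything that is not an even-length run of hex digits.

```shell
# hex_decode: hypothetical helper. Writes the bytes described by the hex
# string $1 to standard output.
hex_decode() {
  hex=$1
  # Reject empty input and anything containing a non-hex character.
  case $hex in
    ''|*[!0-9a-fA-F]*) echo "not a hex string" >&2; return 1 ;;
  esac
  # Each byte needs exactly two hex digits.
  if [ $(( ${#hex} % 2 )) -ne 0 ]; then
    echo "odd number of hex digits" >&2
    return 1
  fi
  while [ -n "$hex" ]; do
    rest=${hex#??}           # everything after the first two digits
    pair=${hex%"$rest"}      # the first two digits
    # Convert the pair to a 3-digit octal escape and let printf emit the byte.
    printf "\\$(printf '%03o' "0x$pair")"
    hex=$rest
  done
}
```

hex_decode 68656c6c6f2c20776f726c64210a > hatena.txt should produce the same file as the xxd -r -p command above. For large inputs xxd will be much faster, since this loop starts a subshell per byte.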

When someone sends you a mystery file, look at its leading bytes.

$ hexdump -C hatena.dat
00000000  50 4b 03 04 0a 00 00 00  00 00 17 7f 65 4f c0 df  |PK..........eO..|
00000010  31 b6 0e 00 00 00 0e 00  00 00 09 00 1c 00 68 65  |1.............he|
00000020  6c 6c 6f 2e 74 78 74 55  54 09 00 03 2e 1d c1 5d  |llo.txtUT......]|
....

The leading PK tells you it is a ZIP file, so try unzipping it.

$ unzip hatena.dat
Archive:  hatena.dat
 extracting: hello.txt
$ cat hello.txt
hello, world!

Examples of leading strings (not necessarily at the very start; they can show up anywhere in the first line or two of the dump)

PK ZIP file
PNG PNG file
JFIF JPEG file
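The lookup above can be sketched as a small shell function. guess_type is a hypothetical helper name, and the byte patterns are assumptions layered on the list: it matches the ZIP local file header (PK followed by 03 04), the 8-byte PNG signature, and the JPEG start marker ff d8 ff rather than the JFIF text, which sits a few bytes into the file.

```shell
# guess_type: hypothetical helper. Prints a guess for the type of file $1
# based on its first 8 bytes.
guess_type() {
  # First 8 bytes as one lowercase hex string, e.g. "504b03040a000000".
  magic=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  case $magic in
    504b0304*)         echo "zip"     ;;   # "PK" followed by 03 04
    89504e470d0a1a0a*) echo "png"     ;;   # 89 "PNG" CR LF 1a LF
    ffd8ff*)           echo "jpeg"    ;;   # JPEG start-of-image marker
    *)                 echo "unknown" ;;
  esac
}
```

Running guess_type hatena.dat on the file from the example should print zip. The file command does the same kind of lookup against a far larger signature database, so it is usually the first thing to try.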

In hex, 0d is a carriage return (CR) and 0a is a line feed (LF); 0d 0a together is a CRLF line break.

$ echo -n -e "\r"  | hexdump
0000000 0d
0000001

$ echo -n -e "\n"  | hexdump
0000000 0a
0000001

$ echo -n -e "\r\n"  | hexdump
0000000 0d 0a
0000002
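Counting those two bytes is a quick way to guess which line endings a file uses. A minimal sketch, with line_ending as a hypothetical helper name; it only compares byte counts, so it is a heuristic rather than a real parser.

```shell
# line_ending: hypothetical helper. Reports the line-ending style of file $1
# by counting CR (0d) and LF (0a) bytes.
line_ending() {
  # tr -cd keeps only the named byte; wc -c counts what is left.
  cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
  lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
  if [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then
    echo "none"
  elif [ "$cr" -eq 0 ]; then
    echo "LF"
  elif [ "$lf" -eq 0 ]; then
    echo "CR"
  elif [ "$cr" -eq "$lf" ]; then
    echo "CRLF"      # equal counts: assume every LF is preceded by a CR
  else
    echo "mixed"
  fi
}
```

A file that reports CRLF or mixed is a likely culprit when a script fails with odd errors at the end of each line.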

(I would also like to add notes on the 0x-something values for WAVE DASH and FULLWIDTH TILDE.)
