AtCoder

ウェザーニューズコンテストのメモ

問題 https://wn2017_1.contest.atcoder.jp/tasks/wn2017_1_a
メモ表 https://qiita.com/cielavenir/items/0277f723da1b5ab7da17
メモ裏 https://qiita.com/cielavenir/items/31edc282a43ca4305fcb

データセット

入力ファイル(パス名は適宜変更して下さい)

encode
/opt/wn.arc
/opt/bin_4obs
64
OR_ABI-L1b-RadC-M3C01_G16_s20172101802185_e20172101804558_c20172101805000.bin 30000000
OR_ABI-L1b-RadC-M3C01_G16_s20172101807185_e20172101809558_c20172101810002.bin 30000000
OR_ABI-L1b-RadC-M3C01_G16_s20172101812185_e20172101814558_c20172101815003.bin 30000000
OR_ABI-L1b-RadC-M3C01_G16_s20172101817185_e20172101819558_c20172101820003.bin 30000000
OR_ABI-L1b-RadC-M3C02_G16_s20172101802185_e20172101804558_c20172101804596.bin 120000000
OR_ABI-L1b-RadC-M3C02_G16_s20172101807185_e20172101809558_c20172101809595.bin 120000000
OR_ABI-L1b-RadC-M3C02_G16_s20172101812185_e20172101814558_c20172101814592.bin 120000000
OR_ABI-L1b-RadC-M3C02_G16_s20172101817185_e20172101819558_c20172101819593.bin 120000000
OR_ABI-L1b-RadC-M3C03_G16_s20172101802185_e20172101804558_c20172101805000.bin 30000000
OR_ABI-L1b-RadC-M3C03_G16_s20172101807185_e20172101809558_c20172101810000.bin 30000000
OR_ABI-L1b-RadC-M3C03_G16_s20172101812185_e20172101814558_c20172101815002.bin 30000000
OR_ABI-L1b-RadC-M3C03_G16_s20172101817185_e20172101819558_c20172101820000.bin 30000000
OR_ABI-L1b-RadC-M3C04_G16_s20172101802185_e20172101804558_c20172101804599.bin 7500000
OR_ABI-L1b-RadC-M3C04_G16_s20172101807185_e20172101809558_c20172101810000.bin 7500000
OR_ABI-L1b-RadC-M3C04_G16_s20172101812185_e20172101814558_c20172101814599.bin 7500000
OR_ABI-L1b-RadC-M3C04_G16_s20172101817185_e20172101819558_c20172101819599.bin 7500000
OR_ABI-L1b-RadC-M3C05_G16_s20172101802185_e20172101804558_c20172101805007.bin 30000000
OR_ABI-L1b-RadC-M3C05_G16_s20172101807185_e20172101809558_c20172101810007.bin 30000000
OR_ABI-L1b-RadC-M3C05_G16_s20172101812185_e20172101814558_c20172101815003.bin 30000000
OR_ABI-L1b-RadC-M3C05_G16_s20172101817185_e20172101819558_c20172101820006.bin 30000000
OR_ABI-L1b-RadC-M3C06_G16_s20172101802185_e20172101804564_c20172101805009.bin 7500000
OR_ABI-L1b-RadC-M3C06_G16_s20172101807185_e20172101809564_c20172101810008.bin 7500000
OR_ABI-L1b-RadC-M3C06_G16_s20172101812185_e20172101814564_c20172101814596.bin 7500000
OR_ABI-L1b-RadC-M3C06_G16_s20172101817185_e20172101819564_c20172101819596.bin 7500000
OR_ABI-L1b-RadC-M3C07_G16_s20172101802185_e20172101804569_c20172101805005.bin 7500000
OR_ABI-L1b-RadC-M3C07_G16_s20172101807185_e20172101809569_c20172101810007.bin 7500000
OR_ABI-L1b-RadC-M3C07_G16_s20172101812185_e20172101814569_c20172101814596.bin 7500000
OR_ABI-L1b-RadC-M3C07_G16_s20172101817185_e20172101819569_c20172101820006.bin 7500000
OR_ABI-L1b-RadC-M3C08_G16_s20172101802185_e20172101804558_c20172101804596.bin 7500000
OR_ABI-L1b-RadC-M3C08_G16_s20172101807185_e20172101809558_c20172101809598.bin 7500000
OR_ABI-L1b-RadC-M3C08_G16_s20172101812185_e20172101814558_c20172101814596.bin 7500000
OR_ABI-L1b-RadC-M3C08_G16_s20172101817185_e20172101819558_c20172101819597.bin 7500000
OR_ABI-L1b-RadC-M3C09_G16_s20172101802185_e20172101804564_c20172101804596.bin 7500000
OR_ABI-L1b-RadC-M3C09_G16_s20172101807185_e20172101809564_c20172101809599.bin 7500000
OR_ABI-L1b-RadC-M3C09_G16_s20172101812185_e20172101814564_c20172101814596.bin 7500000
OR_ABI-L1b-RadC-M3C09_G16_s20172101817185_e20172101819564_c20172101819597.bin 7500000
OR_ABI-L1b-RadC-M3C10_G16_s20172101802185_e20172101804569_c20172101805006.bin 7500000
OR_ABI-L1b-RadC-M3C10_G16_s20172101807185_e20172101809569_c20172101810008.bin 7500000
OR_ABI-L1b-RadC-M3C10_G16_s20172101812185_e20172101814569_c20172101815005.bin 7500000
OR_ABI-L1b-RadC-M3C10_G16_s20172101817185_e20172101819569_c20172101820005.bin 7500000
OR_ABI-L1b-RadC-M3C11_G16_s20172101802185_e20172101804558_c20172101805006.bin 7500000
OR_ABI-L1b-RadC-M3C11_G16_s20172101807185_e20172101809558_c20172101810008.bin 7500000
OR_ABI-L1b-RadC-M3C11_G16_s20172101812185_e20172101814558_c20172101814596.bin 7500000
OR_ABI-L1b-RadC-M3C11_G16_s20172101817185_e20172101819558_c20172101820005.bin 7500000
OR_ABI-L1b-RadC-M3C12_G16_s20172101802185_e20172101804564_c20172101805005.bin 7500000
OR_ABI-L1b-RadC-M3C12_G16_s20172101807185_e20172101809564_c20172101810007.bin 7500000
OR_ABI-L1b-RadC-M3C12_G16_s20172101812185_e20172101814564_c20172101814597.bin 7500000
OR_ABI-L1b-RadC-M3C12_G16_s20172101817185_e20172101819564_c20172101820006.bin 7500000
OR_ABI-L1b-RadC-M3C13_G16_s20172101802185_e20172101804569_c20172101804596.bin 7500000
OR_ABI-L1b-RadC-M3C13_G16_s20172101807185_e20172101809569_c20172101809598.bin 7500000
OR_ABI-L1b-RadC-M3C13_G16_s20172101812185_e20172101814569_c20172101814598.bin 7500000
OR_ABI-L1b-RadC-M3C13_G16_s20172101817185_e20172101819569_c20172101819597.bin 7500000
OR_ABI-L1b-RadC-M3C14_G16_s20172101802185_e20172101804558_c20172101804596.bin 7500000
OR_ABI-L1b-RadC-M3C14_G16_s20172101807185_e20172101809558_c20172101809597.bin 7500000
OR_ABI-L1b-RadC-M3C14_G16_s20172101812185_e20172101814558_c20172101814598.bin 7500000
OR_ABI-L1b-RadC-M3C14_G16_s20172101817185_e20172101819558_c20172101819598.bin 7500000
OR_ABI-L1b-RadC-M3C15_G16_s20172101802185_e20172101804564_c20172101805001.bin 7500000
OR_ABI-L1b-RadC-M3C15_G16_s20172101807185_e20172101809564_c20172101810000.bin 7500000
OR_ABI-L1b-RadC-M3C15_G16_s20172101812185_e20172101814564_c20172101814599.bin 7500000
OR_ABI-L1b-RadC-M3C15_G16_s20172101817185_e20172101819564_c20172101820001.bin 7500000
OR_ABI-L1b-RadC-M3C16_G16_s20172101802185_e20172101804569_c20172101805007.bin 7500000
OR_ABI-L1b-RadC-M3C16_G16_s20172101807185_e20172101809569_c20172101809599.bin 7500000
OR_ABI-L1b-RadC-M3C16_G16_s20172101812185_e20172101814569_c20172101814597.bin 7500000
OR_ABI-L1b-RadC-M3C16_G16_s20172101817185_e20172101819569_c20172101819597.bin 7500000

圧縮率等

実験中に前提条件いじったりしてるので概算でお願いします。あと、展開時間は測ってません(レギュレーションとして、圧縮展開採点合わせて300秒制限です。なので圧縮だけで180秒超えていると厳しい)。

method size time
uncompressed 1200000000 ---
baseline (bz2) 532905020 ---
bzip2 -c1 559792262 85.447
bzip2 -c6 535835878 85.755
bzip2 -c9 532887894 88.136
7z -tBZip2 -mmt=off -mx=5 532917510 159.40
7z -tBZip2 -mmt=off -mx=7 532883322 420.02
7z -mmt=off -m0=BZip2:x=5 532875514 158.21
7z -mmt=off -m0=LZMA:x=3 630291168 152.76
7z -mmt=off -m0=Deflate:x=7 709499946 250.12
7z -mmt=off -m0=PPMd:mem=31:o=3 496017916 106.06
7z -mmt=off -m0=PPMd:mem=31:o=4 488523113 222.53
lzma -c1 632722797 160.36
lzma -c2 626055488 287.42
lzma -c3 621227786 530.83
bcm -b100k 511132297 109.80
ppmd -m256 -o3 498540833 131.09
ppmd -m256 -o3 (native/gcc) 498540833 132.23
ppmd -m256 -o3 (native/clang) 498540833 133.30
ppmd -m256 -o3 -s1 (native/clang) 495008829 115.33
ppmd -m256 -o3 -s1 (native/clang/sort) 494669618 118.40
ppmd -m256 -o4 -s1 (native/clang) 490291685 295.62
ppmd -v7 -m256 -o4 -s1 491369463 396.54 / 1.5
ppmd -v8 -m256 -o4 -s1 491411842 391.94 / 1.5
yz1 745193835 72.21
rar -m3 (3.60) 550522701 137.52
rar -m5 -s (3.60) 549115452 195.22
DGCA (wine) 495266480 255

bzip2、意外と強いんですね。lzmaがあまり奮闘していないのが個人的に残念。
あと、ppmdが群を抜いて強い。bcmも健闘している。
それと、(D)GCAはぜひAtCoder上で試してみたかった。技術的にも不可能だけど。。

  • 解答サンプル(tar.bz2)
#!/usr/bin/ruby
mode=gets.chomp
arcname=gets.chomp
dir=gets.chomp
Dir.chdir(dir)

ARC='tar -c --blocking-factor=1 --files-from=/dev/stdin --format=ustar | bzip2 -c9 > %s'
if mode=='encode'
    IO.popen(ARC%arcname,'w'){|io|
        io.puts gets.to_i.times.map{
            a,b=gets.split
            [a,b.to_i]
        }.map{|x,y|
            x
        }.sort
    }
elsif mode=='decode'
    system('bzip2 -cd < %s | tar -x'%arcname)
end

圧縮率等(2)

AtCoderでの制限時間が600秒になったので多少むちゃできるようになった。
てか、bcmの圧縮レベルを上げるだけで12MBの差が出た。

method size time (/1.5)
bcm -b128 (sort) 474959308 371.17
bcm -b128 474911292 362.80
bcm -b256 (sortr) 474107627 365.24
bcm -b256 (sort) 474497071 363.00
bcm -b256 474833664 368.10
mcm -m9 476349526 899.69
F=23 ppmd -m256 -o4 -s1 (native/clang/sort) 486940310 407.33
F=23 ppmd -m256 -o4 -s1 (native/clang/sort_namer) 486663763 401.27
F=23 ppmd -m256 -o5 -s1 (native/clang/sort) 487432470 579.34
F=23 ppmd -m256 -o4 -s1 -v7 488634495 410.59
F=23 ppmd -m256 -o4 -s1 -v8 488880636 425.90
littlearc delta 2 / bcm -b256 486536728 413.77
littlearc / bcm -b256 474823956 392.78
littlearc / bcm -b256 (sortr) 474098088 390.45
littlearc / bcm -b256 (sort) 474487554 365.42