重複単語をスマートに処理！Setで処理を速く！

Last updated at 2025-04-05Posted at 2025-04-04

paizaの重複単語の問題を解いた！直感でincludes()を使って書いたけど、データ量が大きくなったらSetの方が確実に計算が速くて良さそう！

問題概要

入力されたスペース区切りの単語を、1行ずつ出力。ただし、同じ単語が2回目以降に出てきたら already_been に置き換える。

入力:

red green blue blue green blue

出力:

red  
green  
blue  
already_been  
already_been  
already_been

例: includes() を使うと O(N²) の悲劇

なんとなくで初めに書いたやつ

const words = input.split(' ');
const seen = []; // 既出単語を保存する配列

words.forEach(word => {
    if (!seen.includes(word)) {  // O(N) の検索
        console.log(word);
        seen.push(word);  // 配列に追加
    } else {
        console.log("already_been");
    }
});

🔴 問題点:
• includes() は O(N) で検索 → ループ内で使うと O(N²) に！
• N が 10万のとき、最悪 10億回の比較…

例: Set で O(1) 検索 & O(N) ループ

const words = input.split(' ');
const seen = new Set(); // 既出単語を保存する Set

words.forEach(word => {
    if (seen.has(word)) {  // O(1) の検索
        console.log("already_been");
    } else {
        console.log(word);
        seen.add(word);  // Set に追加
    }
});

✅ 改善点:
• Set の has() は O(1) だから、ループ全体で O(N) で完了！
• 大量データ（N ≒ 10万）でも爆速処理！

Set が便利な理由まとめ

✔ Set の .has() は O(1) で超高速
　　→（配列の includes() は O(N)）
✔ 重複なしのデータ管理に最適（余計なデータを増やさない）
✔ パフォーマンス重視なら Set 一択！（特に大規模データ）

→ includes() は小規模ならOK、大量データなら Set で最適化しよう！

🔗僕の失敗談と解決話！

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up