
Drowsiness prevention! I built an app where Zundamon wakes you up when you doze off


1. I want to shake off drowsiness!

When I'm sleepy while studying, I often wish I could shake off the drowsiness.

Back in school, when someone called out to me while I was dozing in class, my body jolted awake, and nothing woke me up faster. So I built an app that recreates that experience.

2. What I built

The functionality is simple: if you keep your eyes closed for a configured number of seconds, Zundamon calls out to you.

While your eyes are open, the indicator in the top left reads 監視中 (monitoring).

When you close your eyes, this is detected and the elapsed seconds are shown in the top left.
Once that exceeds the detection time set below (3 seconds in the screenshot), Zundamon calls out to you.
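The decision described above can be sketched as a pure function. This is a simplified model for illustration (the function name `shouldAlert` is hypothetical; in the full code below, the same check lives inside the `onResults` callback):

```javascript
// Simplified model of the alert decision: given the timestamp (ms) when the
// eyes were first detected closed, the current time, and the configured
// threshold, decide whether Zundamon should speak up.
function shouldAlert(eyeClosedStartTime, now, thresholdMs) {
  if (eyeClosedStartTime === null) return false; // eyes are currently open
  return now - eyeClosedStartTime >= thresholdMs;
}
```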

3. Prerequisites

3.1. Download the VOICEVOX engine

Download it from the URL below.

Since I'm working on a Mac, I selected

  • OS: Mac
  • Mode: CPU (Apple)
  • Package: Installer

and downloaded that.

Then open the installer and drag and drop the app to install it.


3.2. Install Docker Desktop

Download it from the URL below.

4. Implementation

4.1. Full code

Here is the full code:
<!DOCTYPE html>
<html lang="ja">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>居眠り防止ずんだもんアラート</title>
    <script src="https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/face_mesh.js" crossorigin="anonymous"></script>
    <script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
    <style>
        @import url('https://fonts.googleapis.com/css2?family=Zen+Maru+Gothic:wght@400;500;700&display=swap');
        
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }
        
        body {
            font-family: 'Zen Maru Gothic', sans-serif;
            background: linear-gradient(180deg, #e8f5e3 0%, #d4edda 100%);
            min-height: 100vh;
            color: #2d4a2d;
        }
        
        .container {
            max-width: 420px;
            margin: 0 auto;
            padding: 24px 16px;
            min-height: 100vh;
        }
        
        /* Header */
        header {
            text-align: center;
            margin-bottom: 24px;
        }
        
        h1 {
            font-size: 1.5rem;
            font-weight: 700;
            color: #3d6b3d;
            margin-bottom: 4px;
        }
        
        .subtitle {
            font-size: 0.875rem;
            color: #5a8a5a;
        }
        
        /* Card */
        .card {
            background: #fff;
            border-radius: 20px;
            padding: 20px;
            margin-bottom: 16px;
            box-shadow: 0 2px 12px rgba(61, 107, 61, 0.1);
        }
        
        /* Video area */
        .video-area {
            position: relative;
            border-radius: 16px;
            overflow: hidden;
            background: #f0f7ef;
            aspect-ratio: 4/3;
        }
        
        #webcam {
            width: 100%;
            height: 100%;
            object-fit: cover;
            transform: scaleX(-1);
        }
        
        #canvas {
            position: absolute;
            top: 0;
            left: 0;
            width: 100%;
            height: 100%;
            pointer-events: none;
        }
        
        /* Status chip */
        .status-chip {
            position: absolute;
            top: 12px;
            left: 12px;
            display: flex;
            align-items: center;
            gap: 8px;
            background: #fff;
            padding: 8px 14px;
            border-radius: 20px;
            font-size: 0.813rem;
            font-weight: 500;
            box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
        }
        
        .status-dot {
            width: 10px;
            height: 10px;
            border-radius: 50%;
            background: #7ccd62;
        }
        
        .status-dot.danger {
            background: #e85d5d;
            animation: pulse 0.8s infinite;
        }
        
        @keyframes pulse {
            0%, 100% { transform: scale(1); opacity: 1; }
            50% { transform: scale(1.2); opacity: 0.7; }
        }
        
        /* EAR display */
        .ear-chip {
            position: absolute;
            bottom: 12px;
            right: 12px;
            background: rgba(255, 255, 255, 0.9);
            padding: 6px 12px;
            border-radius: 12px;
            font-size: 0.75rem;
            color: #5a8a5a;
            font-weight: 500;
        }
        
        /* Alert overlay */
        .alert-overlay {
            position: absolute;
            inset: 0;
            background: rgba(232, 93, 93, 0.2);
            border-radius: 16px;
            opacity: 0;
            transition: opacity 0.3s;
            pointer-events: none;
        }
        
        .alert-overlay.active {
            opacity: 1;
        }
        
        /* Permission screen */
        .permission-screen {
            display: flex;
            flex-direction: column;
            align-items: center;
            justify-content: center;
            height: 100%;
            text-align: center;
            padding: 32px;
        }
        
        .permission-icon {
            width: 72px;
            height: 72px;
            background: linear-gradient(135deg, #7ccd62 0%, #a8e063 100%);
            border-radius: 50%;
            display: flex;
            align-items: center;
            justify-content: center;
            margin-bottom: 16px;
        }
        
        .permission-icon svg {
            width: 32px;
            height: 32px;
            stroke: #fff;
        }
        
        .permission-screen p {
            color: #5a8a5a;
            font-size: 0.875rem;
            margin-bottom: 20px;
            line-height: 1.6;
        }
        
        /* Buttons */
        .btn {
            font-family: 'Zen Maru Gothic', sans-serif;
            font-size: 0.938rem;
            font-weight: 700;
            padding: 14px 28px;
            border: none;
            border-radius: 14px;
            cursor: pointer;
            transition: all 0.2s;
        }
        
        .btn-primary {
            background: linear-gradient(135deg, #7ccd62 0%, #6abb52 100%);
            color: #fff;
            box-shadow: 0 4px 12px rgba(124, 205, 98, 0.3);
        }
        
        .btn-primary:hover {
            transform: translateY(-1px);
            box-shadow: 0 6px 16px rgba(124, 205, 98, 0.4);
        }
        
        .btn-primary:active {
            transform: translateY(0);
        }
        
        .btn-secondary {
            background: #f0f7ef;
            color: #3d6b3d;
        }
        
        .btn-secondary:hover {
            background: #e5f0e3;
        }
        
        /* Controls */
        .controls {
            display: grid;
            grid-template-columns: 1fr 1fr;
            gap: 12px;
            margin-top: 16px;
        }
        
        /* Settings */
        .setting-item {
            margin-bottom: 20px;
        }
        
        .setting-item:last-child {
            margin-bottom: 0;
        }
        
        .setting-header {
            display: flex;
            justify-content: space-between;
            align-items: center;
            margin-bottom: 10px;
        }
        
        .setting-label {
            font-size: 0.875rem;
            font-weight: 500;
            color: #3d6b3d;
        }
        
        .setting-value {
            font-size: 0.875rem;
            font-weight: 700;
            color: #7ccd62;
        }
        
        /* Range slider */
        input[type="range"] {
            width: 100%;
            height: 8px;
            -webkit-appearance: none;
            appearance: none;
            background: #e5f0e3;
            border-radius: 4px;
            outline: none;
        }
        
        input[type="range"]::-webkit-slider-thumb {
            -webkit-appearance: none;
            appearance: none;
            width: 24px;
            height: 24px;
            background: linear-gradient(135deg, #7ccd62 0%, #6abb52 100%);
            border-radius: 50%;
            cursor: pointer;
            box-shadow: 0 2px 8px rgba(124, 205, 98, 0.4);
            transition: transform 0.2s;
        }
        
        input[type="range"]::-webkit-slider-thumb:hover {
            transform: scale(1.1);
        }
        
        /* Connection status */
        .connection {
            display: flex;
            align-items: center;
            gap: 10px;
            padding: 12px 16px;
            background: #f8fdf7;
            border-radius: 12px;
            font-size: 0.813rem;
            border: 1px solid #e5f0e3;
        }
        
        .connection-dot {
            width: 8px;
            height: 8px;
            border-radius: 50%;
            background: #ccc;
        }
        
        .connection.connected .connection-dot {
            background: #7ccd62;
        }
        
        .connection.connected {
            background: #f0fdf0;
            border-color: #c8e6c9;
        }
        
        .connection.disconnected .connection-dot {
            background: #e85d5d;
        }
        
        .connection.disconnected {
            background: #fef5f5;
            border-color: #fdd;
        }
        
        .connection-text {
            color: #5a8a5a;
            font-weight: 500;
        }
        
        /* Utility */
        .hidden {
            display: none !important;
        }
        
        .main-content {
            display: none;
        }
        
        .main-content.active {
            display: block;
        }
    </style>
</head>
<body>
    <div class="container">
        <header>
            <h1>居眠り防止ずんだもんアラート</h1>
            <p class="subtitle">居眠りを検出して起こします</p>
        </header>
        
        <div class="card">
            <div class="video-area" id="videoArea">
                <div class="alert-overlay" id="alertOverlay"></div>
                
                <div class="permission-screen" id="permissionScreen">
                    <div class="permission-icon">
                        <svg viewBox="0 0 24 24" fill="none" stroke-width="2">
                            <path d="M23 19a2 2 0 0 1-2 2H3a2 2 0 0 1-2-2V8a2 2 0 0 1 2-2h4l2-3h6l2 3h4a2 2 0 0 1 2 2z"/>
                            <circle cx="12" cy="13" r="4"/>
                        </svg>
                    </div>
                    <p>カメラを使用して<br>目の状態を監視します</p>
                    <button class="btn btn-primary" id="startBtn">開始する</button>
                </div>
                
                <video id="webcam" class="hidden" autoplay playsinline></video>
                <canvas id="canvas" class="hidden"></canvas>
                
                <div class="status-chip hidden" id="statusChip">
                    <div class="status-dot" id="statusDot"></div>
                    <span id="statusText">監視中</span>
                </div>
                
                <div class="ear-chip hidden" id="earChip">EAR: 0.000</div>
            </div>
            
            <div class="controls hidden" id="controls">
                <button class="btn btn-secondary" id="stopBtn">停止</button>
                <button class="btn btn-primary" id="testBtn">テスト再生</button>
            </div>
        </div>
        
        <div class="card main-content" id="mainContent">
            <div class="setting-item">
                <div class="setting-header">
                    <span class="setting-label">検出感度</span>
                    <span class="setting-value" id="sensitivityValue">0.22</span>
                </div>
                <input type="range" id="sensitivitySlider" min="0.1" max="0.4" step="0.02" value="0.22">
            </div>
            
            <div class="setting-item">
                <div class="setting-header">
                    <span class="setting-label">判定時間</span>
                    <span class="setting-value" id="timeValue">3.0秒</span>
                </div>
                <input type="range" id="timeSlider" min="1" max="10" step="0.5" value="3">
            </div>
            
            <div class="setting-item">
                <div class="connection" id="connectionStatus">
                    <div class="connection-dot"></div>
                    <span class="connection-text">確認中...</span>
                </div>
            </div>
        </div>
    </div>
    
    <script>
        const zundamonPhrases = [
            "起きるのだ!",
            "寝ちゃダメなのだ!",
            "おーい!起きるのだ!",
            "ずんだもちパワーで起こすのだ!",
            "目を開けるのだ!",
            "寝るのは夜だけなのだ!",
            "シャキッとするのだ!"
        ];
        
        let isMonitoring = false;
        let eyeClosedStartTime = null;
        let isAlarmPlaying = false;
        let faceMesh = null;
        let camera = null;
        let lastEAR = 1;
        
        let EAR_THRESHOLD = 0.22;
        let CLOSED_DURATION_MS = 3000;
        
        const VOICEVOX_URL = 'http://localhost:50021';
        const ZUNDAMON_SPEAKER_ID = 3;
        let currentAudio = null;
        
        const videoElement = document.getElementById('webcam');
        const canvasElement = document.getElementById('canvas');
        const canvasCtx = canvasElement.getContext('2d');
        const permissionScreen = document.getElementById('permissionScreen');
        const mainContent = document.getElementById('mainContent');
        const controls = document.getElementById('controls');
        const startBtn = document.getElementById('startBtn');
        const stopBtn = document.getElementById('stopBtn');
        const testBtn = document.getElementById('testBtn');
        const statusChip = document.getElementById('statusChip');
        const statusDot = document.getElementById('statusDot');
        const statusText = document.getElementById('statusText');
        const earChip = document.getElementById('earChip');
        const alertOverlay = document.getElementById('alertOverlay');
        const sensitivitySlider = document.getElementById('sensitivitySlider');
        const sensitivityValue = document.getElementById('sensitivityValue');
        const timeSlider = document.getElementById('timeSlider');
        const timeValue = document.getElementById('timeValue');
        const connectionStatus = document.getElementById('connectionStatus');
        
        async function speakZundamon(text) {
            try {
                const queryResponse = await fetch(
                    `${VOICEVOX_URL}/audio_query?text=${encodeURIComponent(text)}&speaker=${ZUNDAMON_SPEAKER_ID}`,
                    { method: 'POST' }
                );
                
                if (!queryResponse.ok) throw new Error('VOICEVOX error');
                
                const query = await queryResponse.json();
                query.speedScale = 1.1;
                query.volumeScale = 1.5;
                
                const synthesisResponse = await fetch(
                    `${VOICEVOX_URL}/synthesis?speaker=${ZUNDAMON_SPEAKER_ID}`,
                    {
                        method: 'POST',
                        headers: { 'Content-Type': 'application/json' },
                        body: JSON.stringify(query)
                    }
                );
                
                if (!synthesisResponse.ok) throw new Error('Synthesis error');
                
                const audioBlob = await synthesisResponse.blob();
                const audioUrl = URL.createObjectURL(audioBlob);
                
                return new Promise((resolve) => {
                    if (currentAudio) {
                        currentAudio.pause();
                        currentAudio = null;
                    }
                    
                    currentAudio = new Audio(audioUrl);
                    currentAudio.onended = () => {
                        URL.revokeObjectURL(audioUrl);
                        resolve();
                    };
                    currentAudio.onerror = () => resolve();
                    currentAudio.play();
                });
                
            } catch (error) {
                console.error('VOICEVOX error:', error);
                return speakFallback(text);
            }
        }
        
        function speakFallback(text) {
            return new Promise((resolve) => {
                const utterance = new SpeechSynthesisUtterance(text);
                utterance.lang = 'ja-JP';
                utterance.rate = 1.1;
                utterance.pitch = 1.8;
                utterance.onend = () => resolve();
                speechSynthesis.speak(utterance);
            });
        }
        
        async function checkVoicevoxConnection() {
            try {
                const response = await fetch(`${VOICEVOX_URL}/version`);
                return response.ok;
            } catch (error) {
                return false;
            }
        }
        
        async function triggerAlarm() {
            if (isAlarmPlaying) return;
            isAlarmPlaying = true;
            
            alertOverlay.classList.add('active');
            
            const phrase = zundamonPhrases[Math.floor(Math.random() * zundamonPhrases.length)];
            await speakZundamon(phrase);
            
            if (lastEAR > EAR_THRESHOLD) {
                alertOverlay.classList.remove('active');
                isAlarmPlaying = false;
                speakZundamon("起きたのだ!えらいのだ!");
            } else {
                isAlarmPlaying = false;
                setTimeout(() => {
                    if (lastEAR <= EAR_THRESHOLD && isMonitoring) {
                        triggerAlarm();
                    } else {
                        alertOverlay.classList.remove('active');
                    }
                }, 1000);
            }
        }
        
        function calculateEAR(landmarks) {
            const leftEye = {
                top: landmarks[159],
                bottom: landmarks[145],
                left: landmarks[33],
                right: landmarks[133]
            };
            
            const rightEye = {
                top: landmarks[386],
                bottom: landmarks[374],
                left: landmarks[362],
                right: landmarks[263]
            };
            
            const calcSingleEAR = (eye) => {
                const verticalDist = Math.sqrt(
                    Math.pow(eye.top.x - eye.bottom.x, 2) + 
                    Math.pow(eye.top.y - eye.bottom.y, 2)
                );
                const horizontalDist = Math.sqrt(
                    Math.pow(eye.left.x - eye.right.x, 2) + 
                    Math.pow(eye.left.y - eye.right.y, 2)
                );
                return verticalDist / (horizontalDist + 0.0001);
            };
            
            return (calcSingleEAR(leftEye) + calcSingleEAR(rightEye)) / 2;
        }
        
        function onResults(results) {
            canvasElement.width = videoElement.videoWidth;
            canvasElement.height = videoElement.videoHeight;
            canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
            
            if (results.multiFaceLandmarks && results.multiFaceLandmarks.length > 0) {
                const ear = calculateEAR(results.multiFaceLandmarks[0]);
                lastEAR = ear;
                earChip.textContent = `EAR: ${ear.toFixed(3)}`;
                
                if (ear < EAR_THRESHOLD) {
                    statusDot.classList.add('danger');
                    
                    if (eyeClosedStartTime === null) {
                        eyeClosedStartTime = Date.now();
                    } else {
                        const closedDuration = Date.now() - eyeClosedStartTime;
                        statusText.textContent = `${(closedDuration / 1000).toFixed(1)}秒`;
                        
                        if (closedDuration >= CLOSED_DURATION_MS) {
                            triggerAlarm();
                        }
                    }
                } else {
                    statusDot.classList.remove('danger');
                    statusText.textContent = '監視中';
                    eyeClosedStartTime = null;
                }
            } else {
                statusText.textContent = '顔未検出';
                earChip.textContent = 'EAR: ---';
                eyeClosedStartTime = null;
            }
        }
        
        async function startCamera() {
            try {
                faceMesh = new FaceMesh({
                    locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${file}`
                });
                
                faceMesh.setOptions({
                    maxNumFaces: 1,
                    refineLandmarks: true,
                    minDetectionConfidence: 0.5,
                    minTrackingConfidence: 0.5
                });
                
                faceMesh.onResults(onResults);
                
                camera = new Camera(videoElement, {
                    onFrame: async () => {
                        if (isMonitoring) {
                            await faceMesh.send({ image: videoElement });
                        }
                    },
                    width: 640,
                    height: 480
                });
                
                await camera.start();
                
                permissionScreen.classList.add('hidden');
                videoElement.classList.remove('hidden');
                canvasElement.classList.remove('hidden');
                statusChip.classList.remove('hidden');
                earChip.classList.remove('hidden');
                controls.classList.remove('hidden');
                mainContent.classList.add('active');
                isMonitoring = true;
                
                speakZundamon("監視開始なのだ!");
                
            } catch (error) {
                console.error('Camera error:', error);
                alert('カメラの起動に失敗しました');
            }
        }
        
        function stopMonitoring() {
            isMonitoring = false;
            alertOverlay.classList.remove('active');
            
            if (currentAudio) {
                currentAudio.pause();
                currentAudio = null;
            }
            
            if (camera) camera.stop();
            
            videoElement.classList.add('hidden');
            canvasElement.classList.add('hidden');
            statusChip.classList.add('hidden');
            earChip.classList.add('hidden');
            controls.classList.add('hidden');
            mainContent.classList.remove('active');
            permissionScreen.classList.remove('hidden');
        }
        
        startBtn.addEventListener('click', startCamera);
        stopBtn.addEventListener('click', stopMonitoring);
        
        testBtn.addEventListener('click', () => {
            const phrase = zundamonPhrases[Math.floor(Math.random() * zundamonPhrases.length)];
            speakZundamon(phrase);
        });
        
        sensitivitySlider.addEventListener('input', (e) => {
            EAR_THRESHOLD = parseFloat(e.target.value);
            sensitivityValue.textContent = e.target.value;
        });
        
        timeSlider.addEventListener('input', (e) => {
            const seconds = parseFloat(e.target.value);
            CLOSED_DURATION_MS = seconds * 1000;
            timeValue.textContent = `${seconds.toFixed(1)}秒`;
        });
        
        async function init() {
            const isConnected = await checkVoicevoxConnection();
            const text = connectionStatus.querySelector('.connection-text');
            
            if (isConnected) {
                connectionStatus.classList.add('connected');
                connectionStatus.classList.remove('disconnected');
                text.textContent = 'VOICEVOX 接続済み';
            } else {
                connectionStatus.classList.add('disconnected');
                connectionStatus.classList.remove('connected');
                text.textContent = 'VOICEVOX 未接続';
            }
        }
        
        init();
    </script>
</body>
</html>

4.2. Face and eye detection with MediaPipe FaceMesh

MediaPipe FaceMesh estimates 468 3D facial landmarks in real time.

Load MediaPipe FaceMesh via CDN:

faceMesh = new FaceMesh({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${file}`
});

4.3. Drowsiness detection with EAR (Eye Aspect Ratio)

  • Compute the EAR (Eye Aspect Ratio) from the eye's vertical distance / horizontal distance
  • Average both eyes to mitigate single-eye detection misses
  • Alert only when the value stays below the threshold for a set duration
return verticalDist / (horizontalDist + 0.0001);
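As a self-contained sketch, the simplified EAR used here looks like the following. Note that this is a two-point ratio per axis with a small epsilon to avoid division by zero, not the classic six-landmark EAR formula; the helper names are for illustration only:

```javascript
// Simplified EAR: ratio of vertical eyelid distance to horizontal eye width.
// Each argument is a landmark with normalized {x, y} coordinates.
function simpleEAR(top, bottom, left, right) {
  const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);
  // Epsilon guards against division by zero if the eye width degenerates.
  return dist(top, bottom) / (dist(left, right) + 0.0001);
}

// Averaging both eyes, as in the article, smooths out one-sided misdetections.
function averageEAR(leftEye, rightEye) {
  const one = (e) => simpleEAR(e.top, e.bottom, e.left, e.right);
  return (one(leftEye) + one(rightEye)) / 2;
}
```

A closed eye collapses the vertical distance toward zero, so the EAR drops sharply below the threshold (0.22 by default, tunable via the sensitivity slider).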

4.4. Local speech synthesis with VOICEVOX (Zundamon)

Communicate over HTTP with the locally running VOICEVOX Engine:

const queryResponse = await fetch(
  `${VOICEVOX_URL}/audio_query?text=...&speaker=3`,
  { method: 'POST' }
);
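The flow is two requests: POST /audio_query returns the synthesis parameters as JSON, and POST /synthesis with that JSON returns WAV audio. The query text must be URL-encoded so Japanese characters survive the round trip; a small helper (hypothetical name, sketching what the fetch call above builds inline) makes that explicit:

```javascript
// Build the audio_query URL for the locally running VOICEVOX engine.
// encodeURIComponent keeps Japanese text and special characters URL-safe.
// Speaker ID 3 is Zundamon, as used throughout the article.
function buildAudioQueryUrl(baseUrl, text, speakerId) {
  return `${baseUrl}/audio_query?text=${encodeURIComponent(text)}&speaker=${speakerId}`;
}
```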

5. Running it

Start the VOICEVOX engine in a terminal:

docker run -it -p 50021:50021 voicevox/voicevox_engine:cpu-ubuntu20.04-latest

You can verify it in your browser; if the page below renders, you're good:

http://127.0.0.1:50021/docs

Start a local server:

cd (folder containing the HTML file)
python3 -m http.server 8000

Open the page below in your browser; if the app appears, everything is working:

http://localhost:8000/your-file-name.html

6. Summary

I built a drowsiness-prevention app this time, but Zundamon's voice is so soft and fluffy that it felt a bit too gentle to really shake off sleepiness...

With a sharper, more forceful voice, the results might be different.
