1. I want to shake off drowsiness!
When I'm studying and getting sleepy, I often wish I could shake off the drowsiness.
Back in school, when someone called out to me while I was dozing in class, my body jolted and nothing woke me up faster. So I built an app that recreates that experience.
2. What I built
The functionality is simple: if you keep your eyes closed for a set number of seconds, Zundamon calls out to you.
When you close your eyes, the app detects it and displays an elapsed-seconds counter in the top left.
Once the counter exceeds the detection time set below (3 seconds in the screenshot), Zundamon calls out to you.

3. Prerequisites
3.1. Download the VOICEVOX Engine
Download it from the URL below.
Since I'm working on a Mac, I selected
- OS: Mac
- Mode: CPU (Apple)
- Package: Installer
and downloaded it.
After that, launch the installer and drag-and-drop to complete the installation.
3.2. Install Docker Desktop
Download it from the URL below.
4. Implementation
4.1. Full code
Here is the full code:
<!DOCTYPE html>
<html lang="ja">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>居眠り防止ずんだもんアラート</title>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/face_mesh.js" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
<style>
@import url('https://fonts.googleapis.com/css2?family=Zen+Maru+Gothic:wght@400;500;700&display=swap');
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Zen Maru Gothic', sans-serif;
background: linear-gradient(180deg, #e8f5e3 0%, #d4edda 100%);
min-height: 100vh;
color: #2d4a2d;
}
.container {
max-width: 420px;
margin: 0 auto;
padding: 24px 16px;
min-height: 100vh;
}
/* Header */
header {
text-align: center;
margin-bottom: 24px;
}
h1 {
font-size: 1.5rem;
font-weight: 700;
color: #3d6b3d;
margin-bottom: 4px;
}
.subtitle {
font-size: 0.875rem;
color: #5a8a5a;
}
/* Card */
.card {
background: #fff;
border-radius: 20px;
padding: 20px;
margin-bottom: 16px;
box-shadow: 0 2px 12px rgba(61, 107, 61, 0.1);
}
/* Video area */
.video-area {
position: relative;
border-radius: 16px;
overflow: hidden;
background: #f0f7ef;
aspect-ratio: 4/3;
}
#webcam {
width: 100%;
height: 100%;
object-fit: cover;
transform: scaleX(-1);
}
#canvas {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
pointer-events: none;
}
/* Status chip */
.status-chip {
position: absolute;
top: 12px;
left: 12px;
display: flex;
align-items: center;
gap: 8px;
background: #fff;
padding: 8px 14px;
border-radius: 20px;
font-size: 0.813rem;
font-weight: 500;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}
.status-dot {
width: 10px;
height: 10px;
border-radius: 50%;
background: #7ccd62;
}
.status-dot.danger {
background: #e85d5d;
animation: pulse 0.8s infinite;
}
@keyframes pulse {
0%, 100% { transform: scale(1); opacity: 1; }
50% { transform: scale(1.2); opacity: 0.7; }
}
/* EAR display */
.ear-chip {
position: absolute;
bottom: 12px;
right: 12px;
background: rgba(255, 255, 255, 0.9);
padding: 6px 12px;
border-radius: 12px;
font-size: 0.75rem;
color: #5a8a5a;
font-weight: 500;
}
/* Alert overlay */
.alert-overlay {
position: absolute;
inset: 0;
background: rgba(232, 93, 93, 0.2);
border-radius: 16px;
opacity: 0;
transition: opacity 0.3s;
pointer-events: none;
}
.alert-overlay.active {
opacity: 1;
}
/* Permission screen */
.permission-screen {
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
height: 100%;
text-align: center;
padding: 32px;
}
.permission-icon {
width: 72px;
height: 72px;
background: linear-gradient(135deg, #7ccd62 0%, #a8e063 100%);
border-radius: 50%;
display: flex;
align-items: center;
justify-content: center;
margin-bottom: 16px;
}
.permission-icon svg {
width: 32px;
height: 32px;
stroke: #fff;
}
.permission-screen p {
color: #5a8a5a;
font-size: 0.875rem;
margin-bottom: 20px;
line-height: 1.6;
}
/* Buttons */
.btn {
font-family: 'Zen Maru Gothic', sans-serif;
font-size: 0.938rem;
font-weight: 700;
padding: 14px 28px;
border: none;
border-radius: 14px;
cursor: pointer;
transition: all 0.2s;
}
.btn-primary {
background: linear-gradient(135deg, #7ccd62 0%, #6abb52 100%);
color: #fff;
box-shadow: 0 4px 12px rgba(124, 205, 98, 0.3);
}
.btn-primary:hover {
transform: translateY(-1px);
box-shadow: 0 6px 16px rgba(124, 205, 98, 0.4);
}
.btn-primary:active {
transform: translateY(0);
}
.btn-secondary {
background: #f0f7ef;
color: #3d6b3d;
}
.btn-secondary:hover {
background: #e5f0e3;
}
/* Controls */
.controls {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 12px;
margin-top: 16px;
}
/* Settings */
.setting-item {
margin-bottom: 20px;
}
.setting-item:last-child {
margin-bottom: 0;
}
.setting-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 10px;
}
.setting-label {
font-size: 0.875rem;
font-weight: 500;
color: #3d6b3d;
}
.setting-value {
font-size: 0.875rem;
font-weight: 700;
color: #7ccd62;
}
/* Range slider */
input[type="range"] {
width: 100%;
height: 8px;
-webkit-appearance: none;
appearance: none;
background: #e5f0e3;
border-radius: 4px;
outline: none;
}
input[type="range"]::-webkit-slider-thumb {
-webkit-appearance: none;
appearance: none;
width: 24px;
height: 24px;
background: linear-gradient(135deg, #7ccd62 0%, #6abb52 100%);
border-radius: 50%;
cursor: pointer;
box-shadow: 0 2px 8px rgba(124, 205, 98, 0.4);
transition: transform 0.2s;
}
input[type="range"]::-webkit-slider-thumb:hover {
transform: scale(1.1);
}
/* Connection status */
.connection {
display: flex;
align-items: center;
gap: 10px;
padding: 12px 16px;
background: #f8fdf7;
border-radius: 12px;
font-size: 0.813rem;
border: 1px solid #e5f0e3;
}
.connection-dot {
width: 8px;
height: 8px;
border-radius: 50%;
background: #ccc;
}
.connection.connected .connection-dot {
background: #7ccd62;
}
.connection.connected {
background: #f0fdf0;
border-color: #c8e6c9;
}
.connection.disconnected .connection-dot {
background: #e85d5d;
}
.connection.disconnected {
background: #fef5f5;
border-color: #fdd;
}
.connection-text {
color: #5a8a5a;
font-weight: 500;
}
/* Utility */
.hidden {
display: none !important;
}
.main-content {
display: none;
}
.main-content.active {
display: block;
}
</style>
</head>
<body>
<div class="container">
<header>
<h1>居眠り防止ずんだもんアラート</h1>
<p class="subtitle">居眠りを検出して起こします</p>
</header>
<div class="card">
<div class="video-area" id="videoArea">
<div class="alert-overlay" id="alertOverlay"></div>
<div class="permission-screen" id="permissionScreen">
<div class="permission-icon">
<svg viewBox="0 0 24 24" fill="none" stroke-width="2">
<path d="M23 19a2 2 0 0 1-2 2H3a2 2 0 0 1-2-2V8a2 2 0 0 1 2-2h4l2-3h6l2 3h4a2 2 0 0 1 2 2z"/>
<circle cx="12" cy="13" r="4"/>
</svg>
</div>
<p>カメラを使用して<br>目の状態を監視します</p>
<button class="btn btn-primary" id="startBtn">開始する</button>
</div>
<video id="webcam" class="hidden" autoplay playsinline></video>
<canvas id="canvas" class="hidden"></canvas>
<div class="status-chip hidden" id="statusChip">
<div class="status-dot" id="statusDot"></div>
<span id="statusText">監視中</span>
</div>
<div class="ear-chip hidden" id="earChip">EAR: 0.000</div>
</div>
<div class="controls hidden" id="controls">
<button class="btn btn-secondary" id="stopBtn">停止</button>
<button class="btn btn-primary" id="testBtn">テスト再生</button>
</div>
</div>
<div class="card main-content" id="mainContent">
<div class="setting-item">
<div class="setting-header">
<span class="setting-label">検出感度</span>
<span class="setting-value" id="sensitivityValue">0.22</span>
</div>
<input type="range" id="sensitivitySlider" min="0.1" max="0.4" step="0.02" value="0.22">
</div>
<div class="setting-item">
<div class="setting-header">
<span class="setting-label">判定時間</span>
<span class="setting-value" id="timeValue">3.0秒</span>
</div>
<input type="range" id="timeSlider" min="1" max="10" step="0.5" value="3">
</div>
<div class="setting-item">
<div class="connection" id="connectionStatus">
<div class="connection-dot"></div>
<span class="connection-text">確認中...</span>
</div>
</div>
</div>
</div>
<script>
const zundamonPhrases = [
"起きるのだ!",
"寝ちゃダメなのだ!",
"おーい!起きるのだ!",
"ずんだもちパワーで起こすのだ!",
"目を開けるのだ!",
"寝るのは夜だけなのだ!",
"シャキッとするのだ!"
];
let isMonitoring = false;
let eyeClosedStartTime = null;
let isAlarmPlaying = false;
let faceMesh = null;
let camera = null;
let lastEAR = 1;
let EAR_THRESHOLD = 0.22;
let CLOSED_DURATION_MS = 3000;
const VOICEVOX_URL = 'http://localhost:50021';
const ZUNDAMON_SPEAKER_ID = 3;
let currentAudio = null;
const videoElement = document.getElementById('webcam');
const canvasElement = document.getElementById('canvas');
const canvasCtx = canvasElement.getContext('2d');
const permissionScreen = document.getElementById('permissionScreen');
const mainContent = document.getElementById('mainContent');
const controls = document.getElementById('controls');
const startBtn = document.getElementById('startBtn');
const stopBtn = document.getElementById('stopBtn');
const testBtn = document.getElementById('testBtn');
const statusChip = document.getElementById('statusChip');
const statusDot = document.getElementById('statusDot');
const statusText = document.getElementById('statusText');
const earChip = document.getElementById('earChip');
const alertOverlay = document.getElementById('alertOverlay');
const sensitivitySlider = document.getElementById('sensitivitySlider');
const sensitivityValue = document.getElementById('sensitivityValue');
const timeSlider = document.getElementById('timeSlider');
const timeValue = document.getElementById('timeValue');
const connectionStatus = document.getElementById('connectionStatus');
async function speakZundamon(text) {
try {
const queryResponse = await fetch(
`${VOICEVOX_URL}/audio_query?text=${encodeURIComponent(text)}&speaker=${ZUNDAMON_SPEAKER_ID}`,
{ method: 'POST' }
);
if (!queryResponse.ok) throw new Error('VOICEVOX error');
const query = await queryResponse.json();
query.speedScale = 1.1;
query.volumeScale = 1.5;
const synthesisResponse = await fetch(
`${VOICEVOX_URL}/synthesis?speaker=${ZUNDAMON_SPEAKER_ID}`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(query)
}
);
if (!synthesisResponse.ok) throw new Error('Synthesis error');
const audioBlob = await synthesisResponse.blob();
const audioUrl = URL.createObjectURL(audioBlob);
return new Promise((resolve) => {
if (currentAudio) {
currentAudio.pause();
currentAudio = null;
}
currentAudio = new Audio(audioUrl);
currentAudio.onended = () => {
URL.revokeObjectURL(audioUrl);
resolve();
};
currentAudio.onerror = () => resolve();
currentAudio.play();
});
} catch (error) {
console.error('VOICEVOX error:', error);
return speakFallback(text);
}
}
function speakFallback(text) {
return new Promise((resolve) => {
const utterance = new SpeechSynthesisUtterance(text);
utterance.lang = 'ja-JP';
utterance.rate = 1.1;
utterance.pitch = 1.8;
utterance.onend = () => resolve();
speechSynthesis.speak(utterance);
});
}
async function checkVoicevoxConnection() {
try {
const response = await fetch(`${VOICEVOX_URL}/version`);
return response.ok;
} catch (error) {
return false;
}
}
async function triggerAlarm() {
if (isAlarmPlaying) return;
isAlarmPlaying = true;
alertOverlay.classList.add('active');
const phrase = zundamonPhrases[Math.floor(Math.random() * zundamonPhrases.length)];
await speakZundamon(phrase);
if (lastEAR > EAR_THRESHOLD) {
alertOverlay.classList.remove('active');
isAlarmPlaying = false;
speakZundamon("起きたのだ!えらいのだ!");
} else {
isAlarmPlaying = false;
setTimeout(() => {
if (lastEAR <= EAR_THRESHOLD && isMonitoring) {
triggerAlarm();
} else {
alertOverlay.classList.remove('active');
}
}, 1000);
}
}
function calculateEAR(landmarks) {
const leftEye = {
top: landmarks[159],
bottom: landmarks[145],
left: landmarks[33],
right: landmarks[133]
};
const rightEye = {
top: landmarks[386],
bottom: landmarks[374],
left: landmarks[362],
right: landmarks[263]
};
const calcSingleEAR = (eye) => {
const verticalDist = Math.sqrt(
Math.pow(eye.top.x - eye.bottom.x, 2) +
Math.pow(eye.top.y - eye.bottom.y, 2)
);
const horizontalDist = Math.sqrt(
Math.pow(eye.left.x - eye.right.x, 2) +
Math.pow(eye.left.y - eye.right.y, 2)
);
return verticalDist / (horizontalDist + 0.0001);
};
return (calcSingleEAR(leftEye) + calcSingleEAR(rightEye)) / 2;
}
function onResults(results) {
canvasElement.width = videoElement.videoWidth;
canvasElement.height = videoElement.videoHeight;
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
if (results.multiFaceLandmarks && results.multiFaceLandmarks.length > 0) {
const ear = calculateEAR(results.multiFaceLandmarks[0]);
lastEAR = ear;
earChip.textContent = `EAR: ${ear.toFixed(3)}`;
if (ear < EAR_THRESHOLD) {
statusDot.classList.add('danger');
if (eyeClosedStartTime === null) {
eyeClosedStartTime = Date.now();
} else {
const closedDuration = Date.now() - eyeClosedStartTime;
statusText.textContent = `${(closedDuration / 1000).toFixed(1)}秒`;
if (closedDuration >= CLOSED_DURATION_MS) {
triggerAlarm();
}
}
} else {
statusDot.classList.remove('danger');
statusText.textContent = '監視中';
eyeClosedStartTime = null;
}
} else {
statusText.textContent = '顔未検出';
earChip.textContent = 'EAR: ---';
eyeClosedStartTime = null;
}
}
async function startCamera() {
try {
faceMesh = new FaceMesh({
locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${file}`
});
faceMesh.setOptions({
maxNumFaces: 1,
refineLandmarks: true,
minDetectionConfidence: 0.5,
minTrackingConfidence: 0.5
});
faceMesh.onResults(onResults);
camera = new Camera(videoElement, {
onFrame: async () => {
if (isMonitoring) {
await faceMesh.send({ image: videoElement });
}
},
width: 640,
height: 480
});
await camera.start();
permissionScreen.classList.add('hidden');
videoElement.classList.remove('hidden');
canvasElement.classList.remove('hidden');
statusChip.classList.remove('hidden');
earChip.classList.remove('hidden');
controls.classList.remove('hidden');
mainContent.classList.add('active');
isMonitoring = true;
speakZundamon("監視開始なのだ!");
} catch (error) {
console.error('Camera error:', error);
alert('カメラの起動に失敗しました');
}
}
function stopMonitoring() {
isMonitoring = false;
alertOverlay.classList.remove('active');
if (currentAudio) {
currentAudio.pause();
currentAudio = null;
}
if (camera) camera.stop();
videoElement.classList.add('hidden');
canvasElement.classList.add('hidden');
statusChip.classList.add('hidden');
earChip.classList.add('hidden');
controls.classList.add('hidden');
mainContent.classList.remove('active');
permissionScreen.classList.remove('hidden');
}
startBtn.addEventListener('click', startCamera);
stopBtn.addEventListener('click', stopMonitoring);
testBtn.addEventListener('click', () => {
const phrase = zundamonPhrases[Math.floor(Math.random() * zundamonPhrases.length)];
speakZundamon(phrase);
});
sensitivitySlider.addEventListener('input', (e) => {
EAR_THRESHOLD = parseFloat(e.target.value);
sensitivityValue.textContent = e.target.value;
});
timeSlider.addEventListener('input', (e) => {
const seconds = parseFloat(e.target.value);
CLOSED_DURATION_MS = seconds * 1000;
timeValue.textContent = `${seconds.toFixed(1)}秒`;
});
async function init() {
const isConnected = await checkVoicevoxConnection();
const text = connectionStatus.querySelector('.connection-text');
if (isConnected) {
connectionStatus.classList.add('connected');
connectionStatus.classList.remove('disconnected');
text.textContent = 'VOICEVOX 接続済み';
} else {
connectionStatus.classList.add('disconnected');
connectionStatus.classList.remove('connected');
text.textContent = 'VOICEVOX 未接続';
}
}
init();
</script>
</body>
</html>
4.2. Face and eye detection with MediaPipe FaceMesh
MediaPipe FaceMesh estimates 468 3D facial landmarks in real time.
Load MediaPipe FaceMesh via CDN:
faceMesh = new FaceMesh({
locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${file}`
});
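FaceMesh then delivers its detections to the registered `onResults` callback. As a minimal sketch of the shape it hands over (the mock object below is mine, just imitating that shape for illustration): `results.multiFaceLandmarks` is an array of detected faces, and each face is an array of landmarks with normalized `x`/`y`/`z` coordinates.

```javascript
// Returns the landmark array of the first detected face, or null when no
// face is present — the same guard the app's onResults handler performs.
function firstFaceLandmarks(results) {
  if (!results.multiFaceLandmarks || results.multiFaceLandmarks.length === 0) {
    return null; // no face in this frame
  }
  return results.multiFaceLandmarks[0];
}

// Mock result imitating a single detected face with one landmark:
const mock = { multiFaceLandmarks: [[{ x: 0.5, y: 0.5, z: 0.0 }]] };
console.log(firstFaceLandmarks(mock));                        // → the face's landmark array
console.log(firstFaceLandmarks({ multiFaceLandmarks: [] }));  // → null
```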
4.3. Drowsiness detection with EAR (Eye Aspect Ratio)
- Compute EAR (Eye Aspect Ratio) as the eye's vertical distance divided by its horizontal distance
- Average the left and right eyes to mitigate single-eye detection errors
- Trigger the alert only when EAR stays below the threshold for a set duration
return verticalDist / (horizontalDist + 0.0001);
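Pulled out of the full code above, the EAR calculation can be sketched as a self-contained function (the landmark indices 159/145/33/133 and 386/374/362/263 are the ones the app uses for the left and right eye in the FaceMesh topology):

```javascript
// Simplified EAR: one vertical distance over one horizontal distance per eye,
// averaged across both eyes. Open eyes give a larger ratio than closed eyes.
function calculateEAR(landmarks) {
  const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);
  const singleEAR = (top, bottom, left, right) =>
    dist(landmarks[top], landmarks[bottom]) /
    (dist(landmarks[left], landmarks[right]) + 0.0001); // avoid division by zero
  // Averaging both eyes smooths out a mis-detection of a single eye.
  return (singleEAR(159, 145, 33, 133) + singleEAR(386, 374, 362, 263)) / 2;
}
```

With the app's default threshold of 0.22, an EAR below that value for longer than the configured duration is treated as dozing.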
4.4. Local speech synthesis with VOICEVOX (Zundamon)
Communicate over HTTP with a locally running VOICEVOX Engine:
const queryResponse = await fetch(
`${VOICEVOX_URL}/audio_query?text=...&speaker=3`,
{ method: 'POST' }
);
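The exchange is two requests: `POST /audio_query` returns a synthesis query as JSON, which is then sent to `POST /synthesis` to receive WAV audio. The URL construction used by the app can be sketched as small helpers (the helper names are mine for illustration, not part of the VOICEVOX API; the base URL and Zundamon's speaker id 3 match the full code):

```javascript
const VOICEVOX_URL = 'http://localhost:50021';

// Step 1 endpoint: POST here (no body) to get a synthesis query JSON.
function audioQueryUrl(text, speaker) {
  return `${VOICEVOX_URL}/audio_query?text=${encodeURIComponent(text)}&speaker=${speaker}`;
}

// Step 2 endpoint: POST the (optionally tweaked) query JSON here to get WAV audio.
function synthesisUrl(speaker) {
  return `${VOICEVOX_URL}/synthesis?speaker=${speaker}`;
}

console.log(audioQueryUrl('test', 3)); // → http://localhost:50021/audio_query?text=test&speaker=3
```

Encoding the text with `encodeURIComponent` is what lets Japanese phrases like the Zundamon lines travel safely in the query string. Between the two calls, the app also tweaks `speedScale` and `volumeScale` on the query JSON to make the voice faster and louder.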
5. Running it
Start the VOICEVOX Engine in a terminal:
docker run -it -p 50021:50021 voicevox/voicevox_engine:cpu-ubuntu20.04-latest
You can check it in your browser; if the page below loads, the engine is running.
http://127.0.0.1:50021/docs
Start a local web server:
cd (folder containing the HTML file)
python3 -m http.server 8000
Open the app in your browser; if it appears, you're all set.
http://localhost:8000/(filename).html
6. Wrap-up
I built this drowsiness-prevention app, but Zundamon's voice is so soft and gentle that it felt a bit too weak to really wake me up...
A voice with a sharper tone might work better.

