
Future Advent Calendar 2023

@yamat2667 (Yamat) in Future Corporation

[Flutter] Displaying AR models scaled by food calories with AR × the GPT-4V API

Last updated at 2023-12-15, posted at 2023-12-15

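Introduction

This is the day 14 entry of the Future Advent Calendar 2023.
(Incidentally, today is the 15th. How did that happen...)

I'm Yamamoto from the Global Design Group (GDG).

Looking back, I spent much of this year learning Flutter. So for my year-end Advent Calendar entry I played around with technologies that were popular this year: Flutter, AR, and GPT-4V(ision).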

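What I built

Here is what I built. The rough flow of operation is as follows.

1. Tap a feature point on an object on screen
2. Query the GPT-4 Vision API for information about the food on screen (calories, nutrients, and so on)
3. Show the response on screen
4. Display an AR model sized according to the calorie count

When no food is recognized, the fox is displayed small,
and for something with an enormous calorie count, like a Big Mac, the model is displayed huge.

(I ordered through Uber just for this.)

The following sections walk through the implementation with a few examples.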

Environment

I verified this implementation on a physical Android device. I did not touch iOS.

OS: Arch Linux
Device: Pixel 3a
Flutter: 3.16.0
Android Studio: Hedgehog | 2023.1.1

Libraries

The Flutter packages I listed in pubspec.yaml are as follows.

AR rendering:
https://pub.dev/packages/ar_flutter_plugin

Requests to OpenAI:
https://pub.dev/packages/http

Others:
https://pub.dev/packages/image_picker
https://pub.dev/packages/flutter_dotenv
https://pub.dev/packages/path_provider
https://pub.dev/packages/path

AR model

I used an AR model from the Khronos Group's glTF-Sample-Assets repository. I borrowed the fox, but there are many other models as well.

(The Khronos Group. Licensed as CC-BY 4.0)

Displaying AR

This time I used ar_flutter_plugin.

Note: for AR there is also arcore_flutter_plugin, which is described as the basis for this plugin, but I could not get it to render properly in my environment.

Installation

flutter pub add ar_flutter_plugin

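Installation was just a matter of adding the package with the command above. On my Android device it worked without touching AndroidManifest or build.gradle.

Implementation

The plugin repository includes samples, and I based my implementation on them. There are seven examples covering object placement and rotation, upload and download, and so on, which made things easy to follow.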

@override
Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('AR+Foods Camera')),
      body: FutureBuilder<void>(
        future: _initializeControllerFuture,
        builder: (context, snapshot) {
          if (snapshot.connectionState == ConnectionState.done) {
            return Column(
              children: [
                Expanded(
                    child: (isCameraPreviewActive && isCameraInitialized)
                        ? Stack(
                            children: [
                              Positioned.fill(
                                child: ARView(
                                  onARViewCreated: _onArViewCreated,
                                  planeDetectionConfig: PlaneDetectionConfig
                                      .horizontalAndVertical,
                                ),
                              ),
                              if (_resText != null)
                                ResponseCard(content: _resText!),
                              Align(
                                alignment: FractionalOffset.bottomCenter,
                                child: Row(
                                    mainAxisAlignment:
                                        MainAxisAlignment.spaceEvenly,
                                    children: [
                                      ElevatedButton(
                                        onPressed: onRemoveEverything,
                                        child: const Text("Remove Everything"),
                                      ),
                                    ]),
                              ),
                            ],
                          )
                        : _appDescriptionWidget()),
                _cameraButtonWidget(context),
              ],
            );
          } else {
            return const Center(child: CircularProgressIndicator());
          }
        },
      ),
    );
  }

There is some extra stuff in there, but that is how I wrote the rendering part.
The AR rendering part is the following.

ARView(
      onARViewCreated: _onArViewCreated,
      planeDetectionConfig: PlaneDetectionConfig
          .horizontalAndVertical,
),

The onARViewCreated callback is implemented as follows.

void _onArViewCreated(
      ARSessionManager arSessionManager,
      ARObjectManager arObjectManager,
      ARAnchorManager arAnchorManager,
      ARLocationManager arLocationManager) async {
    this.arSessionManager = arSessionManager;
    this.arObjectManager = arObjectManager;
    this.arAnchorManager = arAnchorManager;

    this.arSessionManager!.onInitialize(
          showFeaturePoints: true,
          showPlanes: true,
          showWorldOrigin: false,
          handlePans: true,
          handleRotation: true,
        );
    this.arObjectManager!.onInitialize();

    this.arSessionManager!.onPlaneOrPointTap = onPlaneOrPointTapped;
    this.arObjectManager!.onPanStart = onPanStarted;
    this.arObjectManager!.onPanChange = onPanChanged;
    this.arObjectManager!.onPanEnd = onPanEnded;
    this.arObjectManager!.onRotationStart = onRotationStarted;
    this.arObjectManager!.onRotationChange = onRotationChanged;
    this.arObjectManager!.onRotationEnd = onRotationEnded;
    this.arObjectManager!.onNodeTap = onNodeTapped;

    await copyAssetModelsToDocumentDirectory();
  }

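As for copyAssetModelsToDocumentDirectory: an issue has been filed about local AR model files failing to load on Android, and I took this function from the discussion there.

That covers the initialization. Once this is in place, the feature points and other elements enabled below are shown on screen when the view is rendered.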

this.arSessionManager!.onInitialize(
          showFeaturePoints: true,
          showPlanes: true,
          showWorldOrigin: false,
          handlePans: true,
          handleRotation: true,
        );

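The keyboard area was recognized as feature points!

Next is the part that adds the AR model. My implementation follows the flow below, so I will use onPlaneOrPointTapped as the example.
(Note: the implementation is a bit messy...)

Tap a point on the food on screen -> send a request to GPT-4V -> display the model on screen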

Future<void> onPlaneOrPointTapped(
      List<ARHitTestResult> hitTestResults) async {
    var singleHitTestResult = hitTestResults.firstWhere((hitTestResult) =>
        hitTestResult.type == ARHitTestResultType.plane ||
        hitTestResult.type == ARHitTestResultType.point);
    if (isCameraInitialized) {
      var newAnchor =
          ARPlaneAnchor(transformation: singleHitTestResult.worldTransform);
      bool? didAddAnchor = await arAnchorManager!.addAnchor(newAnchor);
      if (didAddAnchor != null && didAddAnchor) {
        final pickedFile = await _picker.pickImage(source: ImageSource.camera);
        // final pickedFile = await _controller.takePicture();

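        if (pickedFile == null) {
          // The user cancelled the camera, so drop the anchor added above.
          arAnchorManager!.removeAnchor(newAnchor);
          return;
        }
        final imagePath = pickedFile.path;

        final response = await sendImageToGPT4Vision(imagePath);
        // sendImageToGPT4Vision returns null when the request fails; the fox
        // then falls back to its minimum size.
        final String? resText =
            response == null ? null : await processChatGPTResponse(response);
        final resCalories = extractCalories(resText ?? "");
        final caroriesRatio = max(resCalories, 50) / 500;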

        anchors.add(newAnchor);

        var newNode = ARNode(
            type: NodeType.fileSystemAppFolderGLB,
            uri: "Fox.glb",
            scale: vec.Vector3(
                0.1 * caroriesRatio, 0.1 * caroriesRatio, 0.1 * caroriesRatio),
            position: vec.Vector3(0.0, 0.0, 0.0),
            rotation: vec.Vector4(1.0, 0.0, 0.0, 0.0));
        bool? didAddNodeToAnchor =
            await arObjectManager!.addNode(newNode, planeAnchor: newAnchor);

        if (didAddNodeToAnchor != null && didAddNodeToAnchor) {
          nodes.add(newNode);
        } else {
          arSessionManager!.onError("Adding Node to Anchor failed");
        }
      } else {
        arSessionManager!.onError("Adding Anchor failed");
      }
    }
  }

Within the code above, the part that adds the model is the following.

It displays the fox from a local glb file.

anchors.add(newAnchor);

var newNode = ARNode(
    type: NodeType.fileSystemAppFolderGLB,
    uri: "Fox.glb",
    scale: vec.Vector3(
        0.1 * caroriesRatio, 0.1 * caroriesRatio, 0.1 * caroriesRatio),
    position: vec.Vector3(0.0, 0.0, 0.0),
    rotation: vec.Vector4(1.0, 0.0, 0.0, 0.0));
bool? didAddNodeToAnchor =
    await arObjectManager!.addNode(newNode, planeAnchor: newAnchor);
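
The scale arithmetic in the snippet above can be isolated into a small function to see what sizes it produces. This is a minimal sketch in plain Dart; the names `baseScale` and `modelScale` are hypothetical and only the expression `0.1 * max(resCalories, 50) / 500` comes from the article.

```dart
import 'dart:math';

/// Base scale the article applies to Fox.glb.
const double baseScale = 0.1;

/// Uniform scale factor for a calorie estimate, mirroring the article's
/// expression `0.1 * max(resCalories, 50) / 500`.
/// (Hypothetical helper name, not part of the original code.)
double modelScale(int calories) {
  // Anything below 50 kcal, including 0 for "no food recognized",
  // is clamped to 50 so the fox never collapses to zero size.
  final ratio = max(calories, 50) / 500;
  return baseScale * ratio;
}

void main() {
  print(modelScale(0).toStringAsFixed(3)); // 0.010 (no food recognized)
  print(modelScale(500).toStringAsFixed(3)); // 0.100 (base scale)
  print(modelScale(760).toStringAsFixed(3)); // 0.152
}
```

With these constants, 500 kcal renders the fox at the base scale of 0.1 and the size grows linearly from there, so there is no upper bound for very high estimates.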

On Android a problem occurred at this point, and the file placed at assets/Fox.glb could not be loaded.

Following the discussion on that issue, I write the asset out to a file once first, and that made it work.

Future<void> copyAssetModelsToDocumentDirectory() async {
    List<String> filesToCopy = [
      "assets/RiggedFigure.glb",
      "assets/RobotExpressive.glb",
      "assets/Fox.glb"
    ];

    final Directory docDir = await getApplicationDocumentsDirectory();
    final String docDirPath = docDir.path;

    await Future.wait(
      filesToCopy.map((String assetPath) async {
        String assetFilename = assetPath.split('/').last;
        File file = File('$docDirPath/$assetFilename');

        final assetBytes = await rootBundle.load(assetPath);
        final buffer = assetBytes.buffer;

        await file.writeAsBytes(
          buffer.asUint8List(
            assetBytes.offsetInBytes,
            assetBytes.lengthInBytes,
          ),
        );

        debugPrint("Copied $assetPath to ${file.path}");
      }),
    );

    debugPrint("Finished copying files to app's documents directory");
  }
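
The `asUint8List(offsetInBytes, lengthInBytes)` call in the function above matters because a `ByteData` can be a view into a larger buffer. The sketch below shows this in plain Dart with a hand-made buffer; `bytesOf` is a hypothetical helper name, and whether `rootBundle.load` actually returns such a partial view for a given asset is an assumption.

```dart
import 'dart:typed_data';

/// Returns exactly the bytes covered by [data], even when [data] is a view
/// into a larger underlying buffer. (Hypothetical helper name.)
Uint8List bytesOf(ByteData data) =>
    data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);

void main() {
  final backing = Uint8List.fromList([0, 1, 2, 3, 4, 5]);
  // A view that covers only bytes 2, 3 and 4 of the backing buffer.
  final view = ByteData.sublistView(backing, 2, 5);

  print(bytesOf(view)); // [2, 3, 4]
  // Without offset and length, the whole backing buffer is exposed.
  print(view.buffer.asUint8List()); // [0, 1, 2, 3, 4, 5]
}
```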

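Besides loading local files, AR models hosted on the web can also be loaded.

Following the official sample as-is, I was able to display Duck.glb, but there is an open issue reporting that other files fail to load.

As a workaround, appending raw=true as shown below made it load, for some reason.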

var newNode = ARNode(
            type: NodeType.webGLB,
            uri: "https://github.com/KhronosGroup/glTF-Sample-Assets/blob/main/Models/Fox/glTF-Binary/Fox.glb?raw=true",
            scale: Vector3(0.2, 0.2, 0.2),
            position: Vector3(0.0, 0.0, 0.0),
            rotation: Vector4(1.0, 0.0, 0.0, 0.0),
            data: {"onTapText": "Ouch, that hurt!"});
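
A likely explanation for the raw=true behavior is that a GitHub `blob` URL normally serves the HTML file viewer page, while adding `raw=true` makes GitHub redirect to the file contents themselves. The sketch below builds such a URL in plain Dart; `withRawParam` is a hypothetical helper name, not part of ar_flutter_plugin.

```dart
/// Returns [url] with `raw=true` added to its query string, keeping any
/// query parameters that are already present. (Hypothetical helper name.)
String withRawParam(String url) {
  final uri = Uri.parse(url);
  return uri.replace(
    queryParameters: {...uri.queryParameters, 'raw': 'true'},
  ).toString();
}

void main() {
  print(withRawParam(
      'https://github.com/KhronosGroup/glTF-Sample-Assets/blob/main/Models/Fox/glTF-Binary/Fox.glb'));
}
```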

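That completes adding the model!

Based on the actual response from the GPT-4V API, many applications seem possible, such as adjusting the model size by calorie count or switching which model is displayed!

Using the GPT-4 Vision API

GPT-4V

A GPT model that can handle images, announced by OpenAI around autumn of this year (2023).

ChatGPT originally took only text as input and replied to it, but with this model images can be used as input as well, which is impressive.

Using it from Flutter

All I wanted to do this time was send a query with an image as input, so I did not bring in any OpenAI plugin.
I sent it as a plain HTTP request. The implementation looks like this.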

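await dotenv.load(fileName: '.env');
// Read the API key from .env. dotenv.get() throws when the key is missing,
// so look it up in the env map to get null instead.
final String? apiKey = dotenv.env['OPENAI_API_KEY'];
  if (apiKey == null) {
    throw Exception('OPENAI_API_KEY must be set.');
  }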

  try {
    final file = File(imagePath);
    final bytes = await file.readAsBytes();
    final base64Image = base64Encode(bytes);

    final headers = {
      "Content-Type": "application/json",
      "Authorization": "Bearer $apiKey"
    };

    final payload = {
      "model": "gpt-4-vision-preview",
      "messages": [
        {
          "role": "user",
          "content": [
            {"type": "text", "text": prompt},
            {
              "type": "image_url",
              "image_url": {"url": "data:image/jpeg;base64,$base64Image"}
            }
          ]
        }
      ],
      "max_tokens": 1200
    };

    final response = await http.post(
      Uri.parse("https://api.openai.com/v1/chat/completions"),
      headers: headers,
      body: jsonEncode(payload),
    );

    if (response.statusCode == 200) {
      final responseData = jsonDecode(response.body);
      return ChatGPTResponse(
          text: responseData['choices'][0]['message']['content']);
    } else {
      throw Exception('Failed to get a response from the GPT-4 Vision API.');
    }
  } catch (e) {
    debugPrint("sendImage error: ${e.toString()}");
  }
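
The lookup `responseData['choices'][0]['message']['content']` above throws if the body has an unexpected shape. As a minimal sketch in plain Dart, the same field can be read defensively; `extractContent` is a hypothetical helper name, and the sample body in `main` is a hand-written stand-in, not a real API response.

```dart
import 'dart:convert';

/// Returns `choices[0].message.content` from a response body, or null when
/// the body is not valid JSON or does not have that shape.
/// (Hypothetical helper name.)
String? extractContent(String body) {
  try {
    final decoded = jsonDecode(body);
    if (decoded is! Map<String, dynamic>) return null;
    final choices = decoded['choices'];
    if (choices is! List || choices.isEmpty) return null;
    final first = choices.first;
    if (first is! Map) return null;
    final message = first['message'];
    if (message is! Map) return null;
    final content = message['content'];
    return content is String ? content : null;
  } on FormatException {
    return null; // the body was not valid JSON
  }
}

void main() {
  const body =
      r'{"choices":[{"message":{"role":"assistant","content":"hello"}}]}';
  print(extractContent(body)); // hello
  print(extractContent('{"error":{"message":"bad request"}}')); // null
}
```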

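There is nothing especially interesting here, but because the image is a local file it has to be Base64-encoded and sent as a data URL.

This time I read and converted an image that was captured and saved with image_picker.
Note: I wanted to do this with the camera package, but for some reason I could not take pictures on my device and gave up.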
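
The Base64 step can be tried on its own in plain Dart. This sketch builds the same kind of `data:image/jpeg;base64,...` string that the payload uses; `toDataUrl` is a hypothetical helper name, and the two bytes in `main` merely stand in for real image data.

```dart
import 'dart:convert';

/// Builds a data URL from raw bytes. [mimeType] defaults to JPEG to match
/// the payload in the article. (Hypothetical helper name.)
String toDataUrl(List<int> bytes, {String mimeType = 'image/jpeg'}) =>
    'data:$mimeType;base64,${base64Encode(bytes)}';

void main() {
  // Two bytes standing in for image data ("hi" in ASCII).
  print(toDataUrl([104, 105])); // data:image/jpeg;base64,aGk=
}
```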

The prompt is as follows. I thought I had written it clearly, but the response sometimes does not come back in the expected format, so this needs improvement.

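 final String prompt = '''
  # Question
  I would like to know the nutritional content of the food in this image.
  Please estimate the food name, calories, and nutrients. If the image contains no food, reply with the text given in the exception output example.
  If an estimate spans a range, respond with the maximum value.
  Also, make sure the output text always begins with the JSON.

  # Output fields
  - Food name
  The name of the food as a string. If unknown, use null

  - Calories
  The calories in the food. If unknown, use null

  - Protein
  The amount of protein in the food. If unknown, use null

  - Fat
  The amount of fat in the food. If unknown, use null

  - Carbohydrates
  The amount of carbohydrates in the food. If unknown, use null

  # Output format
  Return the data as a JSON string in the following format.

  {
    "food_name": string,
    "calories": string,
    "protein": string,
    "fat": string,
    "carbohydrates": string
  }

  # Output example

  {
    "food_name": "Ramen",
    "calories": "760 kcal",
    "protein": "10 g",
    "fat": "16 g",
    "carbohydrates": "63 g"
  }

  (Explanatory text, if any)

  # Exception output example (when the image contains no food)

  The image does not appear to contain any food

  ''';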

Sending the request above takes a few seconds, but a response comes back from the GPT-4V API, and I display it on screen.
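
Because the prompt asks for JSON at the start of the reply, optionally followed by explanatory text, one way to make parsing less fragile is to cut out the leading JSON object and decode it. The following is a minimal sketch in plain Dart, not the article's implementation: the helper names are hypothetical, the sample reply and its key names are illustrative, and braces inside string values are not handled.

```dart
import 'dart:convert';

/// Returns the first balanced `{...}` block in [text] decoded as a JSON
/// object, or null if there is none or it is not valid JSON.
/// Simplification: braces inside string values are not handled.
Map<String, dynamic>? leadingJsonObject(String text) {
  final start = text.indexOf('{');
  if (start < 0) return null;
  var depth = 0;
  for (var i = start; i < text.length; i++) {
    final ch = text[i];
    if (ch == '{') depth++;
    if (ch == '}') depth--;
    if (depth == 0) {
      try {
        final decoded = jsonDecode(text.substring(start, i + 1));
        return decoded is Map<String, dynamic> ? decoded : null;
      } on FormatException {
        return null;
      }
    }
  }
  return null; // unbalanced braces
}

/// Parses a value such as "760 kcal" into a rounded integer, or null.
int? parseKcal(Object? value) {
  if (value is! String) return null;
  final match = RegExp(r'(\d+(?:\.\d+)?)\s*kcal').firstMatch(value);
  if (match == null) return null;
  return double.parse(match.group(1)!).round();
}

/// Returns the first value in [json] that parses as a kcal amount, so the
/// lookup does not depend on the exact key name the model used.
int? firstKcal(Map<String, dynamic> json) {
  for (final value in json.values) {
    final kcal = parseKcal(value);
    if (kcal != null) return kcal;
  }
  return null;
}

void main() {
  const reply =
      '{"calories": "760 kcal", "protein": "10 g"}\nA typical bowl of ramen.';
  final json = leadingJsonObject(reply);
  print(json == null ? null : firstKcal(json)); // 760
}
```

A reply with no JSON at all, such as the exception text for images without food, makes `leadingJsonObject` return null, which the caller can treat as "no food recognized".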

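Summary

The implementation has rough edges, but in one rushed day I was able to build a Flutter app that combines AR and the GPT-4V API.
The plugin README and other documentation are thorough, so it was very easy to implement in Flutter!

This time I displayed an AR model enlarged according to the food's calories, but there seem to be many other ways to play with this, such as coloring the model by nutrient or displaying a model that reflects your eating habits! (I ran out of time for those.)

Surprisingly, I could not find articles that do all of this end to end in Flutter, so I hope this is useful to someone.

Tomorrow's entry is "AIによる語の揺らぎの同定について" (On identifying word variation with AI) by @ShimizuJimmy.

Note: I will leave the full implementation at the end.

Implementation sample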

main.dart

import 'package:flutter/material.dart';
import 'package:camera/camera.dart';
import 'package:flutter_ar_food/cameraApp.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();

  final cameras = await availableCameras();
  final firstCamera = cameras.first;

  runApp(
    MaterialApp(
      title: 'AR+Food Demo',
      theme: ThemeData(
        colorScheme: ColorScheme.fromSeed(seedColor: Colors.deepPurple),
        useMaterial3: true,
      ),
      home: CameraApp(camera: firstCamera),
    ),
  );
}

cameraApp.dart

import 'dart:io';
import 'dart:math';
import 'package:ar_flutter_plugin/datatypes/config_planedetection.dart';
import 'package:ar_flutter_plugin/datatypes/hittest_result_types.dart';
import 'package:ar_flutter_plugin/datatypes/node_types.dart';
import 'package:ar_flutter_plugin/managers/ar_anchor_manager.dart';
import 'package:ar_flutter_plugin/managers/ar_location_manager.dart';
import 'package:ar_flutter_plugin/managers/ar_object_manager.dart';
import 'package:ar_flutter_plugin/managers/ar_session_manager.dart';
import 'package:ar_flutter_plugin/models/ar_anchor.dart';
import 'package:ar_flutter_plugin/models/ar_hittest_result.dart';
import 'package:ar_flutter_plugin/models/ar_node.dart';
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:camera/camera.dart';
import 'package:flutter_ar_food/openai.dart';
import 'package:ar_flutter_plugin/ar_flutter_plugin.dart';
import 'package:flutter_ar_food/responseCard.dart';
import 'package:image_picker/image_picker.dart';
import 'package:path_provider/path_provider.dart';
import 'package:vector_math/vector_math_64.dart' as vec;

class CameraApp extends StatefulWidget {
  final CameraDescription camera;

  const CameraApp({Key? key, required this.camera}) : super(key: key);

  @override
  CameraAppState createState() => CameraAppState();
}

class CameraAppState extends State<CameraApp> {
  late CameraController _controller;
  late Future<void> _initializeControllerFuture;
  bool isCameraInitialized = false;
  bool isCameraPreviewActive = false;

  // Nullable without `late`: these stay null until the AR view is created.
  ARSessionManager? arSessionManager;
  ARObjectManager? arObjectManager;
  ARAnchorManager? arAnchorManager;

  List<ARNode> nodes = [];
  List<ARAnchor> anchors = [];
  HttpClient? httpClient;

  String? _resText;

  final _picker = ImagePicker();

  @override
  void initState() {
    super.initState();
    _controller = CameraController(
      widget.camera,
      ResolutionPreset.medium,
      enableAudio: false,
    );
    _initializeControllerFuture = _controller.initialize().then((_) {
      setState(() {
        isCameraInitialized = true;
      });
    });
  }

  @override
  void dispose() {
    _controller.dispose();
    // The AR view may never have been opened, so avoid a forced unwrap.
    arSessionManager?.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('AR+Foods Camera')),
      body: FutureBuilder<void>(
        future: _initializeControllerFuture,
        builder: (context, snapshot) {
          if (snapshot.connectionState == ConnectionState.done) {
            return Column(
              children: [
                Expanded(
                    child: (isCameraPreviewActive && isCameraInitialized)
                        ? Stack(
                            children: [
                              Positioned.fill(
                                child: ARView(
                                  onARViewCreated: _onArViewCreated,
                                  planeDetectionConfig: PlaneDetectionConfig
                                      .horizontalAndVertical,
                                ),
                              ),
                              if (_resText != null)
                                ResponseCard(content: _resText!),
                              Align(
                                alignment: FractionalOffset.bottomCenter,
                                child: Row(
                                    mainAxisAlignment:
                                        MainAxisAlignment.spaceEvenly,
                                    children: [
                                      ElevatedButton(
                                        onPressed: onRemoveEverything,
                                        child: const Text("Remove Everything"),
                                      ),
                                    ]),
                              ),
                            ],
                          )
                        : _appDescriptionWidget()),
                _cameraButtonWidget(context),
              ],
            );
          } else {
            return const Center(child: CircularProgressIndicator());
          }
        },
      ),
    );
  }

  void startImageStream() {
    _controller.startImageStream((CameraImage image) {});
  }

  void stopImageStream() {
    _controller.stopImageStream();
  }

  void _onArViewCreated(
      ARSessionManager arSessionManager,
      ARObjectManager arObjectManager,
      ARAnchorManager arAnchorManager,
      ARLocationManager arLocationManager) async {
    this.arSessionManager = arSessionManager;
    this.arObjectManager = arObjectManager;
    this.arAnchorManager = arAnchorManager;

    this.arSessionManager!.onInitialize(
          showFeaturePoints: true,
          showPlanes: true,
          showWorldOrigin: false,
          handlePans: true,
          handleRotation: true,
        );
    this.arObjectManager!.onInitialize();

    this.arSessionManager!.onPlaneOrPointTap = onPlaneOrPointTapped;
    this.arObjectManager!.onPanStart = onPanStarted;
    this.arObjectManager!.onPanChange = onPanChanged;
    this.arObjectManager!.onPanEnd = onPanEnded;
    this.arObjectManager!.onRotationStart = onRotationStarted;
    this.arObjectManager!.onRotationChange = onRotationChanged;
    this.arObjectManager!.onRotationEnd = onRotationEnded;
    this.arObjectManager!.onNodeTap = onNodeTapped;

    await copyAssetModelsToDocumentDirectory();
  }

  Future<void> onRemoveEverything() async {
    anchors.forEach((anchor) {
      this.arAnchorManager!.removeAnchor(anchor);
    });
    anchors = [];
    setState(() {
      _resText = null;
    });
  }

  onPanStarted(String nodeName) {
    debugPrint("Started panning node " + nodeName);
  }

  onPanChanged(String nodeName) {
    debugPrint("Continued panning node " + nodeName);
  }

  onPanEnded(String nodeName, Matrix4 newTransform) {
    debugPrint("Ended panning node " + nodeName);
    final pannedNode =
        this.nodes.firstWhere((element) => element.name == nodeName);
  }

  onRotationStarted(String nodeName) {
    debugPrint("Started rotating node " + nodeName);
  }

  onRotationChanged(String nodeName) {
    debugPrint("Continued rotating node " + nodeName);
  }

  onRotationEnded(String nodeName, Matrix4 newTransform) {
    debugPrint("Ended rotating node " + nodeName);
    final rotatedNode =
        this.nodes.firstWhere((element) => element.name == nodeName);
  }

  Future<void> onPlaneOrPointTapped(
      List<ARHitTestResult> hitTestResults) async {
    var singleHitTestResult = hitTestResults.firstWhere((hitTestResult) =>
        hitTestResult.type == ARHitTestResultType.plane ||
        hitTestResult.type == ARHitTestResultType.point);
    if (isCameraInitialized) {
      var newAnchor =
          ARPlaneAnchor(transformation: singleHitTestResult.worldTransform);
      bool? didAddAnchor = await arAnchorManager!.addAnchor(newAnchor);
      if (didAddAnchor != null && didAddAnchor) {
        final pickedFile = await _picker.pickImage(source: ImageSource.camera);
        // final pickedFile = await _controller.takePicture();

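        if (pickedFile == null) {
          // The user cancelled the camera, so drop the anchor added above.
          arAnchorManager!.removeAnchor(newAnchor);
          return;
        }
        final imagePath = pickedFile.path;

        final response = await sendImageToGPT4Vision(imagePath);
        // sendImageToGPT4Vision returns null when the request fails; the fox
        // then falls back to its minimum size.
        final String? resText =
            response == null ? null : await processChatGPTResponse(response);
        final resCalories = extractCalories(resText ?? "");
        final caroriesRatio = max(resCalories, 50) / 500;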

        anchors.add(newAnchor);

        var newNode = ARNode(
            type: NodeType.fileSystemAppFolderGLB,
            uri: "Fox.glb",
            scale: vec.Vector3(
                0.1 * caroriesRatio, 0.1 * caroriesRatio, 0.1 * caroriesRatio),
            position: vec.Vector3(0.0, 0.0, 0.0),
            rotation: vec.Vector4(1.0, 0.0, 0.0, 0.0));
        bool? didAddNodeToAnchor =
            await arObjectManager!.addNode(newNode, planeAnchor: newAnchor);

        if (didAddNodeToAnchor != null && didAddNodeToAnchor) {
          nodes.add(newNode);
        } else {
          arSessionManager!.onError("Adding Node to Anchor failed");
        }
      } else {
        arSessionManager!.onError("Adding Anchor failed");
      }
    }
  }

  Future<void> copyAssetModelsToDocumentDirectory() async {
    List<String> filesToCopy = [
      "assets/RiggedFigure.glb",
      "assets/RobotExpressive.glb",
      "assets/Fox.glb"
    ];

    final Directory docDir = await getApplicationDocumentsDirectory();
    final String docDirPath = docDir.path;

    await Future.wait(
      filesToCopy.map((String assetPath) async {
        String assetFilename = assetPath.split('/').last;
        File file = File('$docDirPath/$assetFilename');

        final assetBytes = await rootBundle.load(assetPath);
        final buffer = assetBytes.buffer;

        await file.writeAsBytes(
          buffer.asUint8List(
            assetBytes.offsetInBytes,
            assetBytes.lengthInBytes,
          ),
        );

        debugPrint("Copied $assetPath to ${file.path}");
      }),
    );

    debugPrint("Finished copying files to app's documents directory");
  }

  Future<void> onNodeTapped(List<String> nodes) async {
    var number = nodes.length;
    this.arSessionManager!.onError("Tapped $number node(s)");
  }

  Future<String?> processChatGPTResponse(ChatGPTResponse response) async {
    if (response.text != null) {
      setState(() {
        _resText = response.text;
      });
    }
    return response.text;
  }

  int extractCalories(String response) {
    // Regular expression that finds the calorie value, e.g. "760 kcal".
    final RegExp regex = RegExp(r'"(\d+(?:\.\d+)?)\s*kcal"');

    // Search the response string for the calorie value.
    final match = regex.firstMatch(response);

    if (match != null) {
      // Found: parse the number and round it, so that a decimal value such
      // as "760.5 kcal" does not fall through to 0.
      return double.tryParse(match.group(1)!)?.round() ?? 0;
    } else {
      // Not found: return 0.
      return 0;
    }
  }

  Widget _cameraButtonWidget(BuildContext context) {
    return Row(
      mainAxisAlignment: MainAxisAlignment.spaceBetween,
      children: <Widget>[
        isCameraPreviewActive
            ? IconButton(
                icon: Icon(
                    isCameraPreviewActive ? Icons.arrow_back : Icons.start),
                onPressed: () async {
                  if (isCameraPreviewActive) {
                    setState(() {
                      isCameraPreviewActive = false;
                    });
                  } else {
                    setState(() {
                      isCameraPreviewActive = true;
                    });
                  }
                },
              )
            : Container(width: 50),
        IconButton(
          icon: Icon(isCameraPreviewActive ? Icons.camera : Icons.camera_alt),
          onPressed: () async {
            if (isCameraPreviewActive) {
              setState(() {
                isCameraPreviewActive = false;
              });
            } else {
              setState(() {
                isCameraPreviewActive = true;
              });
            }
          },
        ),
        Container(width: 50),
      ],
    );
  }

  Widget _appDescriptionWidget() {
    return const Center(
      child: Padding(
        padding: EdgeInsets.all(16.0),
        child: Text(
          'This app uses the camera to recognize food and display information about it. Tap the camera icon to start the camera.',
          style: TextStyle(fontSize: 16.0),
          textAlign: TextAlign.center,
        ),
      ),
    );
  }
}

responseCard.dart

import 'package:flutter/material.dart';

class ResponseCard extends StatefulWidget {
  final String content; // Response content shown on the card

  ResponseCard({Key? key, required this.content}) : super(key: key);

  @override
  _ResponseCardState createState() => _ResponseCardState();
}

class _ResponseCardState extends State<ResponseCard> {
  @override
  void initState() {
    super.initState();
  }

  @override
  Widget build(BuildContext context) {
    // Get the screen size
    final screenSize = MediaQuery.of(context).size;

    return Stack(
      children: <Widget>[
        Positioned(
          right: 10,
          bottom: 10,
          child: AnimatedOpacity(
            opacity: 1.0,
            duration: const Duration(seconds: 1), // Fade duration
            child: Card(
              child: Container(
                width: screenSize.width / 3, // Card width
                height: screenSize.height / 3, // Card height
                padding: const EdgeInsets.all(8.0),
                child: Text(widget.content),
              ),
            ),
          ),
        ),
      ],
    );
  }
}

openai.dart

import 'dart:convert';
import 'dart:io';
import 'package:flutter/material.dart';
import 'package:image_picker/image_picker.dart';
import 'package:http/http.dart' as http;
import 'package:flutter_dotenv/flutter_dotenv.dart';

Future<String> encodeImageToBase64() async {
  final picker = ImagePicker();
  final pickedFile = await picker.pickImage(source: ImageSource.camera);

  if (pickedFile != null) {
    final bytes = await File(pickedFile.path).readAsBytes();
    return base64Encode(bytes);
  } else {
    throw Exception('No image was selected.');
  }
}

Future<ChatGPTResponse?> sendImageToGPT4Vision(String imagePath) async {
  await dotenv.load(fileName: '.env');
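  final String prompt = '''
  # Question
  I would like to know the nutritional content of the food in this image.
  Please estimate the food name, calories, and nutrients. If the image contains no food, reply with the text given in the exception output example.
  If an estimate spans a range, respond with the maximum value.
  Also, make sure the output text always begins with the JSON.

  # Output fields
  - Food name
  The name of the food as a string. If unknown, use null

  - Calories
  The calories in the food. If unknown, use null

  - Protein
  The amount of protein in the food. If unknown, use null

  - Fat
  The amount of fat in the food. If unknown, use null

  - Carbohydrates
  The amount of carbohydrates in the food. If unknown, use null

  # Output format
  Return the data as a JSON string in the following format.

  {
    "food_name": string,
    "calories": string,
    "protein": string,
    "fat": string,
    "carbohydrates": string
  }

  # Output example

  {
    "food_name": "Ramen",
    "calories": "760 kcal",
    "protein": "10 g",
    "fat": "16 g",
    "carbohydrates": "63 g"
  }

  (Explanatory text, if any)

  # Exception output example (when the image contains no food)

  The image does not appear to contain any food

  ''';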
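  // Read the API key from .env. dotenv.get() throws when the key is missing,
  // so look it up in the env map to get null instead.
  final String? apiKey = dotenv.env['OPENAI_API_KEY'];
  if (apiKey == null) {
    throw Exception('OPENAI_API_KEY must be set.');
  }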

  try {
    final file = File(imagePath);
    final bytes = await file.readAsBytes();
    final base64Image = base64Encode(bytes);

    final headers = {
      "Content-Type": "application/json",
      "Authorization": "Bearer $apiKey"
    };

    final payload = {
      "model": "gpt-4-vision-preview",
      "messages": [
        {
          "role": "user",
          "content": [
            {"type": "text", "text": prompt},
            {
              "type": "image_url",
              "image_url": {"url": "data:image/jpeg;base64,$base64Image"}
            }
          ]
        }
      ],
      "max_tokens": 1200
    };

    final response = await http.post(
      Uri.parse("https://api.openai.com/v1/chat/completions"),
      headers: headers,
      body: jsonEncode(payload),
    );

    if (response.statusCode == 200) {
      final responseData = jsonDecode(response.body);
      return ChatGPTResponse(
          text: responseData['choices'][0]['message']['content']);
    } else {
      throw Exception('Failed to get a response from the GPT-4 Vision API.');
    }
  } catch (e) {
    debugPrint("sendImage error: ${e.toString()}");
  }
  return null;
}

class ChatGPTResponse {
  final String? text;

  ChatGPTResponse({this.text});
}
