ML Kit for Firebase is a machine learning toolkit made by Google for Android and iOS (still in beta at the time of writing). With this kit you can use the following on-device pre-trained APIs:
  • Text recognition
  • Face detection
  • Barcode scanning
  • Image labeling
  • Object detection & tracking
  • Language identification
  • Translation
  • Smart reply generator (English only)
You can also use cloud APIs for more accurate results:
  • Text recognition
  • Image labeling
  • Landmark recognition
You can also use your own custom pre-trained models, or train your own classification model (for image labeling only).

Android dependency

In app/build.gradle you should add:
android {
  aaptOptions {
      noCompress "tflite"
  }
}

dependencies {
  // ml-vision general
  implementation 'com.google.firebase:firebase-ml-vision:24.0.1'
  // Face Detection (contours)
  implementation 'com.google.firebase:firebase-ml-vision-face-model:19.0.0'
  // Barcode Scanning
  implementation 'com.google.firebase:firebase-ml-vision-barcode-model:16.0.1'
  // Image labeling
  implementation 'com.google.firebase:firebase-ml-vision-image-label-model:19.0.0'
  // Object detection
  implementation 'com.google.firebase:firebase-ml-vision-object-detection-model:19.0.3'
  // ml-natural general
  implementation 'com.google.firebase:firebase-ml-natural-language:22.0.0'
  // Language identification
  implementation 'com.google.firebase:firebase-ml-natural-language-language-id-model:20.0.7'
  // Translation
  implementation 'com.google.firebase:firebase-ml-natural-language-translate-model:20.0.7'
  // Smart Replies
  implementation 'com.google.firebase:firebase-ml-natural-language-smart-reply-model:20.0.7'
}

apply plugin: 'com.google.gms.google-services'

If you want to load your own custom pre-trained model (an AutoML-trained model), you will also need to add:
implementation 'com.google.firebase:firebase-ml-vision-automl:18.0.3'

Text recognition

// Create a FirebaseVisionImage object (here from a file Uri)
// Note: fromFilePath can throw an IOException, so call it inside a try/catch
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);

// Create an instance of FirebaseVisionTextRecognizer with the on-device model
FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance().getOnDeviceTextRecognizer();
// Or with the cloud model
// FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance().getCloudTextRecognizer();

// Process the image
Task<FirebaseVisionText> result =
        detector.processImage(image)
                .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                    public void onSuccess(FirebaseVisionText firebaseVisionText) {
                        // Task completed successfully
                        // ...
                    }
                })
                .addOnFailureListener(new OnFailureListener() {
                    public void onFailure(@NonNull Exception e) {
                        // Task failed with an exception
                        // ...
                    }
                });

The resulting FirebaseVisionText contains the recognized text, split into paragraphs, along with bounding boxes, the recognized languages, and confidence scores.

Face detection

Face detection runs on-device only, and you can optionally get facial contours too.

// Create a FirebaseVisionImage object (here from a file Uri)
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);

// Set options (defaults shown; call e.g. setContourMode(FirebaseVisionFaceDetectorOptions.ALL_CONTOURS) to get contours)
FirebaseVisionFaceDetectorOptions options =
          new FirebaseVisionFaceDetectorOptions.Builder()
                  .build();

// Create an instance of FirebaseVisionFaceDetector
FirebaseVisionFaceDetector detector = FirebaseVision.getInstance().getVisionFaceDetector(options);

// Process the image
Task<List<FirebaseVisionFace>> result =
        detector.detectInImage(image)
                .addOnSuccessListener(
                        new OnSuccessListener<List<FirebaseVisionFace>>() {
                            public void onSuccess(List<FirebaseVisionFace> faces) {
                                // Task completed successfully
                                // ...
                            }
                        })
                .addOnFailureListener(
                        new OnFailureListener() {
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                // ...
                            }
                        });

Each resulting FirebaseVisionFace contains: face bounds, head rotation, landmark coordinates (eyes, ears, mouth, nose), and classification probabilities (smiling, eyes open).
You can also get a tracking ID when processing a video stream.

Barcode scanning

Many different formats are supported:
Code 128, Code 39, Code 93, Codabar, EAN-13, EAN-8, ITF, UPC-A, UPC-E, QR Code, PDF417, Aztec, Data Matrix.

// Create a FirebaseVisionImage object (here from a file Uri)
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);

// Set options (all formats by default; setBarcodeFormats(...) can restrict detection to specific formats)
FirebaseVisionBarcodeDetectorOptions options =
        new FirebaseVisionBarcodeDetectorOptions.Builder()
                .build();

// Create an instance of FirebaseVisionBarcodeDetector
FirebaseVisionBarcodeDetector detector = FirebaseVision.getInstance().getVisionBarcodeDetector(options);

// Process the image
Task<List<FirebaseVisionBarcode>> result = detector.detectInImage(image)
        .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionBarcode>>() {
            public void onSuccess(List<FirebaseVisionBarcode> barcodes) {
                // Task completed successfully
                // ...
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            public void onFailure(@NonNull Exception e) {
                // Task failed with an exception
                // ...
            }
        });

The content of each result depends on the barcode type.

Image labeling

Image labeling can run on-device with about 400 labels, or in the cloud with about 10,000 labels.

// Create a FirebaseVisionImage object (here from a file Uri)
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);

// Create an instance of FirebaseVisionImageLabeler with the on-device model
FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance().getOnDeviceImageLabeler();
// Or with the cloud model
// FirebaseVisionCloudImageLabelerOptions options = new FirebaseVisionCloudImageLabelerOptions.Builder().setConfidenceThreshold(0.7f).build();
// FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance().getCloudImageLabeler(options);

// Process the image
labeler.processImage(image)
    .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionImageLabel>>() {
      public void onSuccess(List<FirebaseVisionImageLabel> labels) {
        // Task completed successfully
        // ...
      }
    })
    .addOnFailureListener(new OnFailureListener() {
      public void onFailure(@NonNull Exception e) {
        // Task failed with an exception
        // ...
      }
    });

The result is simply a list of FirebaseVisionImageLabel objects, each containing a label and a confidence score.
You can use AutoML Vision Edge to train and use your own classification model.
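
Since every label comes back with a confidence score, a typical next step is to keep only the labels you trust. Here is a minimal sketch of that filtering in plain Java. SimpleLabel and LabelFilter are hypothetical names made up for this example (SimpleLabel only stands in for FirebaseVisionImageLabel), and the 0.7 threshold used below is just an illustrative value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for FirebaseVisionImageLabel:
// a label text plus a confidence score between 0 and 1.
class SimpleLabel {
    final String text;
    final float confidence;

    SimpleLabel(String text, float confidence) {
        this.text = text;
        this.confidence = confidence;
    }
}

// Hypothetical helper, not part of ML Kit.
class LabelFilter {
    // Keeps only the labels whose confidence is at least minConfidence,
    // preserving their original order.
    static List<SimpleLabel> keepConfident(List<SimpleLabel> labels, float minConfidence) {
        List<SimpleLabel> kept = new ArrayList<>();
        for (SimpleLabel label : labels) {
            if (label.confidence >= minConfidence) {
                kept.add(label);
            }
        }
        return kept;
    }
}
```

In a real app you would probably apply the same loop directly to the List<FirebaseVisionImageLabel> received in onSuccess, reading the text and confidence through the label's getters.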

Object detection & tracking

With this API you can identify the main object in an image and track it (when streaming).

// Create a FirebaseVisionImage object (here from a file Uri)
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);

// Multiple object detection in static images
FirebaseVisionObjectDetectorOptions options =
        new FirebaseVisionObjectDetectorOptions.Builder()
                .setDetectorMode(FirebaseVisionObjectDetectorOptions.SINGLE_IMAGE_MODE)
                .enableMultipleObjects()
                .enableClassification()  // Optional
                .build();

// Create an instance of FirebaseVisionObjectDetector
FirebaseVisionObjectDetector objectDetector = FirebaseVision.getInstance().getOnDeviceObjectDetector(options);

// Process the image
objectDetector.processImage(image)
        .addOnSuccessListener(
                new OnSuccessListener<List<FirebaseVisionObject>>() {
                    public void onSuccess(List<FirebaseVisionObject> detectedObjects) {
                        // Task completed successfully
                        // ...
                    }
                })
        .addOnFailureListener(
                new OnFailureListener() {
                    public void onFailure(@NonNull Exception e) {
                        // Task failed with an exception
                        // ...
                    }
                });

The result is a list of FirebaseVisionObject objects, each containing a tracking ID, bounds, a category, and a confidence score.

Landmark recognition

You can recognize well-known landmarks in an image. This API is only available in the cloud.

// Create a FirebaseVisionImage object (here from a file Uri)
FirebaseVisionImage image = FirebaseVisionImage.fromFilePath(context, uri);

// Set the options (defaults shown)
FirebaseVisionCloudDetectorOptions options =
        new FirebaseVisionCloudDetectorOptions.Builder()
                .build();

// Create an instance of FirebaseVisionCloudLandmarkDetector
FirebaseVisionCloudLandmarkDetector detector = FirebaseVision.getInstance().getVisionCloudLandmarkDetector(options);

// Process the image
Task<List<FirebaseVisionCloudLandmark>> result = detector.detectInImage(image)
        .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionCloudLandmark>>() {
            public void onSuccess(List<FirebaseVisionCloudLandmark> firebaseVisionCloudLandmarks) {
                // Task completed successfully
                // ...
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            public void onFailure(@NonNull Exception e) {
                // Task failed with an exception
                // ...
            }
        });

The result is a list of FirebaseVisionCloudLandmark objects, each containing a name, bounds, latitude, longitude, and a confidence score.

Language identification

This API takes a String as input rather than an image.
FirebaseLanguageIdentification languageIdentifier =
        FirebaseNaturalLanguage.getInstance().getLanguageIdentification();
languageIdentifier.identifyPossibleLanguages(text)  // text: the String to analyze
        .addOnSuccessListener(
          new OnSuccessListener<List<IdentifiedLanguage>>() {
            public void onSuccess(List<IdentifiedLanguage> identifiedLanguages) {
              for (IdentifiedLanguage identifiedLanguage : identifiedLanguages) {
                String language = identifiedLanguage.getLanguageCode();
                float confidence = identifiedLanguage.getConfidence();
                Log.i(TAG, language + " (" + confidence + ")");
              }
            }
          })
        .addOnFailureListener(
          new OnFailureListener() {
            public void onFailure(@NonNull Exception e) {
              // Model couldn't be loaded or other internal error.
              // ...
            }
          });

The result is a list of IdentifiedLanguage objects, each containing a language code and a confidence score.
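
When you get several candidates back, you usually want to settle on one language, or decide that none is reliable enough. Below is a small sketch of that decision in plain Java. LanguageGuess and LanguagePicker are hypothetical names invented for this example (LanguageGuess only mimics the two fields of IdentifiedLanguage), and falling back to "und" (the BCP-47 code for an undetermined language) is a choice made for this sketch.

```java
import java.util.List;

// Hypothetical stand-in for IdentifiedLanguage: a language code and a confidence score.
class LanguageGuess {
    final String languageCode;
    final float confidence;

    LanguageGuess(String languageCode, float confidence) {
        this.languageCode = languageCode;
        this.confidence = confidence;
    }
}

// Hypothetical helper, not part of ML Kit.
class LanguagePicker {
    // Fallback used by this sketch when no guess is confident enough.
    static final String UNDETERMINED = "und";

    // Returns the code of the most confident guess that reaches minConfidence,
    // or UNDETERMINED if there is none.
    static String pickBest(List<LanguageGuess> guesses, float minConfidence) {
        String best = UNDETERMINED;
        float bestConfidence = minConfidence;
        for (LanguageGuess guess : guesses) {
            if (guess.confidence >= bestConfidence) {
                best = guess.languageCode;
                bestConfidence = guess.confidence;
            }
        }
        return best;
    }
}
```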


Translation

Translation can be done with an on-device API, but it is intended only for casual and simple translations across 59 languages (Japanese is supported). The model is trained to translate to and from English, so if you translate between two non-English languages, English is used as an intermediate step, which can affect quality.
// Create an English-Japanese translator:
FirebaseTranslatorOptions options =
        new FirebaseTranslatorOptions.Builder()
                .setSourceLanguage(FirebaseTranslateLanguage.EN)
                .setTargetLanguage(FirebaseTranslateLanguage.JA)
                .build();
final FirebaseTranslator englishJapaneseTranslator =
        FirebaseNaturalLanguage.getInstance().getTranslator(options);

final String text = "Merry Christmas";

// We need to download the model first
// Each model is around 30MB and is stored locally to be reused
FirebaseModelDownloadConditions conditions = new FirebaseModelDownloadConditions.Builder()
        .build();
englishJapaneseTranslator.downloadModelIfNeeded(conditions)
        .addOnSuccessListener(
          new OnSuccessListener<Void>() {
            public void onSuccess(Void v) {
              // Model downloaded successfully. We can start translation
              englishJapaneseTranslator.translate(text)
                  .addOnSuccessListener(
                      new OnSuccessListener<String>() {
                         public void onSuccess(@NonNull String translatedText) {
                            // Translation successful.
                            // translatedText <- "メリークリスマス"
                         }
                      })
                  .addOnFailureListener(
                      new OnFailureListener() {
                          public void onFailure(@NonNull Exception e) {
                             // Error during the translation.
                          }
                      });
            }
          })
        .addOnFailureListener(
          new OnFailureListener() {
            public void onFailure(@NonNull Exception e) {
              // Model couldn't be downloaded or other internal error.
            }
          });
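
To make the "English as an intermediate translation" point concrete, here is a tiny sketch in plain Java that lists the translation hops a language pair would need. PivotPlanner is a hypothetical class written for this post: it only illustrates the idea and makes no claim about how the library handles this internally (ML Kit takes care of the pivot for you).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ML Kit.
class PivotPlanner {
    // English is the pivot language, as described in the text above.
    static final String PIVOT = "en";

    // Returns the {source, target} hops needed to go from source to target.
    // A pair involving English needs one hop; any other pair needs two.
    static List<String[]> plan(String source, String target) {
        List<String[]> hops = new ArrayList<>();
        if (source.equals(target)) {
            return hops; // nothing to translate
        }
        if (source.equals(PIVOT) || target.equals(PIVOT)) {
            hops.add(new String[] {source, target});
        } else {
            hops.add(new String[] {source, PIVOT});
            hops.add(new String[] {PIVOT, target});
        }
        return hops;
    }
}
```

With this sketch, English to Japanese is a single hop, while French to Japanese goes through two (French to English, then English to Japanese), and each extra hop is a chance to lose some meaning.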

Smart reply generator (English only)

The model works with the 10 most recent messages and provides at most 3 suggested responses.

// Define a conversation history
// Local User speaks to Remote User
// Smart reply is about what Local User may answer.
List<FirebaseTextMessage> conversation = new ArrayList<>();
conversation.add(FirebaseTextMessage.createForRemoteUser("It's Christmas time", System.currentTimeMillis(),"userId1"));
conversation.add(FirebaseTextMessage.createForLocalUser("Kids are happy", System.currentTimeMillis()));
conversation.add(FirebaseTextMessage.createForRemoteUser("Will Santa Claus come tonight?", System.currentTimeMillis(),"userId1"));

FirebaseSmartReply smartReply = FirebaseNaturalLanguage.getInstance().getSmartReply();
smartReply.suggestReplies(conversation)
        .addOnSuccessListener(new OnSuccessListener<SmartReplySuggestionResult>() {
            public void onSuccess(SmartReplySuggestionResult result) {
                if (result.getStatus() == SmartReplySuggestionResult.STATUS_NOT_SUPPORTED_LANGUAGE) {
                    // The conversation's language isn't supported, so
                    // the result doesn't contain any suggestions.
                } else if (result.getStatus() == SmartReplySuggestionResult.STATUS_SUCCESS) {
                    // Task completed successfully
                    for (SmartReplySuggestion suggestion : result.getSuggestions()) {
                        String replyText = suggestion.getText();
                    }
                }
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            public void onFailure(@NonNull Exception e) {
                // Task failed with an exception
            }
        });

The result is a list of SmartReplySuggestion objects. Each one contains only a text.
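
Because only the 10 most recent messages matter, you may want to avoid carrying a long history around. Here is a minimal sketch of such trimming in plain Java. ConversationWindow is a hypothetical helper written for this post, and it assumes the conversation list is ordered from oldest to newest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ML Kit.
class ConversationWindow {
    // Number of recent messages the model works with, as stated above.
    static final int MAX_MESSAGES = 10;

    // Returns a copy holding at most the last MAX_MESSAGES entries,
    // assuming the list is ordered from oldest to newest.
    static <T> List<T> mostRecent(List<T> conversation) {
        int from = Math.max(0, conversation.size() - MAX_MESSAGES);
        return new ArrayList<>(conversation.subList(from, conversation.size()));
    }
}
```

The method is generic, so the same call could be applied to a List<FirebaseTextMessage> before handing it to the smart reply API.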


