Azure AI Vision
Although you can train your own machine learning models for computer vision, the model architectures can be complex, and training requires large volumes of images and significant compute power.
Microsoft’s Azure AI Vision service provides prebuilt and customizable computer vision models based on the Florence foundation model. With Azure AI Vision, you can build sophisticated computer vision solutions quickly, using “off-the-shelf” functionality for many common computer vision scenarios while retaining the ability to create custom models from your own images.
Azure resources for Azure AI Vision service
To use Azure AI Vision, you need to create a resource for it in your Azure subscription. You can use either of the following resource types:
- Azure AI Vision: A specific resource for the Azure AI Vision service. Use this resource type if you don’t intend to use any other Azure AI services, or if you want to track utilization and costs for your Azure AI Vision resource separately.
- Azure AI services: A general resource that includes Azure AI Vision along with many other Azure AI services, such as Azure AI Language, Azure AI Custom Vision, and Azure AI Translator. Use this resource type if you plan to use multiple AI services and want to simplify administration and development.
Analyzing images with the Azure AI Vision service
After you’ve created a suitable resource in your subscription, you can submit images to the Azure AI Vision service to perform a wide range of analytical tasks.
Azure AI Vision supports multiple image analysis capabilities, including:
- Optical character recognition (OCR) – extracting text from images.
- Generating captions and descriptions of images.
- Detecting thousands of common objects in images.
- Tagging visual features in images.
These tasks, and more, can be performed in Azure AI Vision Studio.
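Outside Vision Studio, the same capabilities can be requested programmatically. The following is a minimal sketch, not a definitive implementation, of how a request to the Image Analysis REST API might be assembled using only the Python standard library. It assumes the `2023-10-01` API version, the `/computervision/imageanalysis:analyze` route, and the `Ocp-Apim-Subscription-Key` header. The function name `build_analyze_request`, the endpoint, the key, and the image URL are hypothetical placeholders. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a generally available version of the Image Analysis REST API.
API_VERSION = "2023-10-01"


def build_analyze_request(endpoint, key, image_url, features):
    """Build (but do not send) a POST request asking the service to analyze an image.

    endpoint  -- base URL of your Azure AI Vision or Azure AI services resource
    key       -- one of the resource's subscription keys
    image_url -- publicly reachable URL of the image to analyze
    features  -- feature names to request, e.g. ["read", "caption", "tags", "objects"]
    """
    query = urllib.parse.urlencode(
        {"api-version": API_VERSION, "features": ",".join(features)},
        safe=",",  # keep the comma-separated feature list readable
    )
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical placeholder values; substitute those of your own resource.
request = build_analyze_request(
    "https://my-vision-resource.cognitiveservices.azure.com/",
    "<your-key>",
    "https://example.com/nutrition-label.jpg",
    ["read", "caption", "tags", "objects"],
)
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would submit the request. The feature names map onto the capabilities listed above: `read` for OCR, `caption` for captions, `objects` for object detection, and `tags` for tagging.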
Optical character recognition
The Azure AI Vision service can use optical character recognition (OCR) to detect and extract text in images. For example, consider the following image of a nutrition label on a product in a grocery store:
The Azure AI Vision service can analyze this image and extract the following text:
Nutrition Facts Amount Per Serving
Serving size:1 bar (40g)
Serving Per Package: 4
Total Fat 13g
Saturated Fat 1.5g
Amount Per Serving
Trans Fat 0g
calories 190
Cholesterol 0mg
ories from Fat 110
Sodium 20mg
ntDaily Values are based on
Vitamin A 50
calorie diet
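Text like this is returned as structured data rather than as one block of prose. The sketch below shows one way the lines might be pulled out of a JSON response. It assumes the response nests the text under `readResult`, then `blocks`, then `lines`, each line carrying a `text` field, as in the Image Analysis 4.0 REST API. The helper name `extract_lines` is hypothetical, and the `sample` dictionary is hand-written for illustration, not actual service output.

```python
def extract_lines(analysis):
    """Return the recognized text lines, in order, from an analysis result dict.

    Assumes the shape {"readResult": {"blocks": [{"lines": [{"text": ...}]}]}}.
    Returns an empty list if no text was read.
    """
    read_result = analysis.get("readResult") or {}
    return [
        line["text"]
        for block in read_result.get("blocks", [])
        for line in block.get("lines", [])
    ]


# Hand-written sample mimicking the assumed response shape.
sample = {
    "readResult": {
        "blocks": [
            {
                "lines": [
                    {"text": "Nutrition Facts Amount Per Serving"},
                    {"text": "Serving size:1 bar (40g)"},
                    {"text": "Total Fat 13g"},
                ]
            }
        ]
    }
}

for text in extract_lines(sample):
    print(text)
```

Because OCR reads what is visible in the image, partially obscured text can come back truncated, as with some of the lines in the output above. Downstream code should not assume every line is a complete phrase.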