Using the Azure Form Recognizer REST API for scanning IDs
This month at Ignite, Microsoft announced a new prebuilt Form Recognizer model for scanning identification cards. In this article we'll explore the following:
- Demo of the feature
- Creating the Form Recognizer in Azure
- Analyzing IDs using C# and the REST API of the Form Recognizer
1. ID service Demo
You can try this feature by going to this link, uploading a US driver's license or a passport (the only supported documents so far), and choosing "ID" in the "Form type" dropdown. You will also need your own instance of Azure Form Recognizer, because the demo asks for its key and endpoint. Here's what an identification result looks like:
Now that we've seen how it works, let's build it ourselves in a C# console app. But first, we'll need the key and the endpoint of the service, so let's create a Form Recognizer resource in our Azure subscription.
2. Create the Form Recognizer resource in Azure
Go to your Azure account and create a Form Recognizer resource.
Once the resource is created, we should copy the key and the endpoint so we can use them in our C# application.
3. Analyzing IDs using C# and the REST API of the Form Recognizer
Here's what we'll do:
- prepare a photo of an ID (I recommend googling for "US driving license" images and downloading one; there are plenty of samples available)
- create a dotnet console project
- read the image from the disk
- send it to Form Recognizer using the REST API
- extract the details from the results
I'll make the first step even easier by providing you with this link; just grab the image and save it somewhere on your PC (I saved it as D:\license.jpg). Once you've done this and created your console project, let's start writing some code.
We'll first need to read the image from the disk:
var image = System.IO.File.ReadAllBytes(@"D:\license.jpg");
Once we have the image bytes, we can send them to Azure using a POST request like this:
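As a minimal sketch of that POST request, the class below builds the analyze URL, attaches the key, and sends the image bytes. The class and method names are hypothetical, and the API version segment (v2.1-preview.3) and the image/jpeg content type are assumptions; check them against the REST reference for your resource before relying on them.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical helper, not part of any official SDK.
public static class IdAnalysisRequest
{
    // Assumption: the preview API version that exposes the prebuilt ID model.
    public const string ApiVersion = "v2.1-preview.3";

    // Builds the analyze URL from the endpoint copied from the Azure portal.
    public static Uri BuildAnalyzeUri(string endpoint) =>
        new Uri($"{endpoint.TrimEnd('/')}/formrecognizer/{ApiVersion}/prebuilt/idDocument/analyze");

    // Wraps the image bytes in a POST request authenticated with the resource key.
    public static HttpRequestMessage BuildRequest(string endpoint, string key, byte[] image)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, BuildAnalyzeUri(endpoint));
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);
        request.Content = new ByteArrayContent(image);
        // Assumption: the file is a JPEG; use the media type that matches your image.
        request.Content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg");
        return request;
    }

    // Sends the image and returns the Operation-Location URL of the analysis.
    public static async Task<string> SubmitAsync(HttpClient client, string endpoint, string key, byte[] image)
    {
        using var request = BuildRequest(endpoint, key, image);
        using var response = await client.SendAsync(request);
        if (response.StatusCode != HttpStatusCode.Accepted)
        {
            throw new HttpRequestException($"Expected 202 Accepted but got {(int)response.StatusCode}.");
        }
        // GetValues throws if the header is missing, which surfaces the problem early.
        return response.Headers.GetValues("Operation-Location").First();
    }
}
```

With the image bytes in hand, a call could look like `var operationLocation = await IdAnalysisRequest.SubmitAsync(new HttpClient(), endpoint, key, image);`, where endpoint and key are the values copied from the Azure portal.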
The Form Recognizer API returns 202 if everything went well, along with a header that contains the URL we can use to check the status of the request. This header is called "Operation-Location", and we have to make GET requests to it until the request has either finished processing successfully or encountered an error. This is how we do it in our app:
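A polling loop along these lines could look like the sketch below. The class and method names are hypothetical, and the one-second delay and 30-attempt cap are arbitrary choices for illustration. It assumes the response body carries a top-level "status" property whose terminal values are "succeeded" and "failed".

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of any official SDK.
public static class IdAnalysisPolling
{
    // Reads the top-level "status" property from a response body.
    public static string ReadStatus(string json)
    {
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.GetProperty("status").GetString();
    }

    // Assumption: any status other than these two means the service is still working.
    public static bool IsTerminal(string status) =>
        status == "succeeded" || status == "failed";

    // Polls the Operation-Location URL and returns the final JSON once the status is "succeeded".
    public static async Task<string> WaitForResultAsync(
        HttpClient client, string operationLocation, string key, int maxAttempts = 30)
    {
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            using var request = new HttpRequestMessage(HttpMethod.Get, operationLocation);
            request.Headers.Add("Ocp-Apim-Subscription-Key", key);

            using var response = await client.SendAsync(request);
            response.EnsureSuccessStatusCode();

            var json = await response.Content.ReadAsStringAsync();
            var status = ReadStatus(json);
            if (status == "succeeded")
            {
                return json;
            }
            if (status == "failed")
            {
                throw new InvalidOperationException("The analysis failed: " + json);
            }

            // Still in progress: wait a little before asking again.
            await Task.Delay(TimeSpan.FromSeconds(1));
        }

        throw new TimeoutException("The analysis did not finish within the allowed number of attempts.");
    }
}
```

Capping the number of attempts keeps the console app from spinning forever if the operation never reaches a terminal status.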
Once the status is "succeeded", we can exit the loop and see if we have any results. The IdentificationResult object is a C# representation of the response JSON; you'll find it in the repository for this sample. Bear in mind that a successful status doesn't mean a document was detected; it only means that the scan completed. We have to inspect the response to see how many documents were detected and which fields are available. I check that the DocumentResults object is not null and that it has at least one item in it, which means we have fields from at least one document. I then display a few of those fields along with their confidence.
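The check described above can be sketched without the IdentificationResult class by walking the JSON with System.Text.Json instead; this is a swapped-in technique, not the code from the sample repository. The property names used here (analyzeResult, documentResults, fields, text, confidence) are assumptions about the response shape, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper that stands in for the IdentificationResult classes.
public static class IdAnalysisResults
{
    // Returns the fields of the first detected document, or an empty list when
    // the scan succeeded but no document was detected.
    public static List<(string Name, string Text, double Confidence)> ExtractFields(string json)
    {
        var fields = new List<(string Name, string Text, double Confidence)>();
        using var doc = JsonDocument.Parse(json);

        // Same idea as checking that DocumentResults is not null and not empty.
        if (!doc.RootElement.TryGetProperty("analyzeResult", out var analyzeResult) ||
            analyzeResult.ValueKind != JsonValueKind.Object ||
            !analyzeResult.TryGetProperty("documentResults", out var documents) ||
            documents.ValueKind != JsonValueKind.Array ||
            documents.GetArrayLength() == 0)
        {
            return fields;
        }

        var firstDocument = documents[0];
        if (!firstDocument.TryGetProperty("fields", out var fieldMap) ||
            fieldMap.ValueKind != JsonValueKind.Object)
        {
            return fields;
        }

        foreach (var field in fieldMap.EnumerateObject())
        {
            // Skip fields that are null or otherwise not objects.
            if (field.Value.ValueKind != JsonValueKind.Object)
            {
                continue;
            }

            var text = field.Value.TryGetProperty("text", out var t) ? t.GetString() : null;
            var confidence = field.Value.TryGetProperty("confidence", out var c) ? c.GetDouble() : 0.0;
            fields.Add((field.Name, text, confidence));
        }

        return fields;
    }
}
```

Each entry can then be printed with something like `Console.WriteLine($"{f.Name}: {f.Text} ({f.Confidence:P0})");` inside a loop over the returned list.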
I'm sure that the ID service will be included in the official NuGet package for Form Recognizer; in the meantime, you can use the REST API as described above.
You can also find the complete sample in the repository below. Enjoy!