In this blog, I will show you how to build your own custom extraction model using the Document Intelligence Studio. If this is your first time looking at the Document Intelligence Studio, I recommend reading this blog first: First look at Document Intelligence Studio
Blogs
This blog is part of a series of blogs about the capabilities of the Azure Document Intelligence resource.
Requirements
- Azure ‘Document Intelligence’ or ‘Azure AI services multi-service account’ resource
- Azure ‘Storage account’
- A minimum of 6 documents (5 training documents and 1 test document)
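Before uploading anything, it can help to check that you have enough documents and to set one aside for testing. The snippet below is a minimal sketch of that check. The function name `split_documents` and the file names are made up for illustration and are not part of the Studio.

```python
def split_documents(file_names, test_count=1, min_training=5):
    """Split file names into (training, test) lists.

    The last `test_count` files (after sorting) are held back for testing.
    Raises ValueError when fewer than `min_training` training files remain.
    """
    if test_count < 1:
        raise ValueError("test_count must be at least 1")
    files = sorted(file_names)
    training, test = files[:-test_count], files[-test_count:]
    if len(training) < min_training:
        raise ValueError(
            f"Need at least {min_training} training documents, got {len(training)}"
        )
    return training, test


# Hypothetical file names, for illustration only.
documents = [f"invoice_{i}.pdf" for i in range(1, 7)]
training, test = split_documents(documents)
print("Training:", training)
print("Test:", test)
```

Keeping the test document out of the training set from the start makes the test step later in this blog more meaningful.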
Project setup
Open the custom extraction model page in Document Intelligence Studio. Scroll down the page and select Create a project in the My projects list.
This will open a wizard with four steps for setting up your project.
- Provide a name and description for the project.
- Select the Document Intelligence resource and API version you want to use.
- Select the storage account, container, and folder to store the training data.
- Review and create your project.
Analyze documents
Upload the files you want to use to train your model. (1) The first time you upload documents you get the option to run layout or auto label for all documents. Select Run layout on this screen, or use the Run layout button for each document you uploaded. (2) When you use the Free tier (F0) of the Document Intelligence service, the rate limit can prevent all your documents from being analyzed at once. In that case, wait for the indicated time and try again, or upgrade to a paid tier.
In the middle of the screen you can view the selected document. (3) You can browse all the pages, search for text in the document, zoom in and out, and rotate it.
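In the Studio you handle the rate limit by waiting and clicking again. If you later call the service from your own code, the same "wait and retry" idea can be sketched as below. This is only an illustration: `RateLimitError`, `call_with_retry`, and `fake_run_layout` are hypothetical names, and the service call is simulated instead of real.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error that carries the number of seconds to wait."""

    def __init__(self, retry_after_seconds):
        super().__init__(f"Rate limited, retry after {retry_after_seconds}s")
        self.retry_after_seconds = retry_after_seconds


def call_with_retry(action, max_attempts=5, sleep=time.sleep):
    """Run `action`, waiting and retrying whenever it raises RateLimitError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except RateLimitError as error:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(error.retry_after_seconds)


# Simulated service call: rate limited twice, then succeeds.
attempts = {"count": 0}


def fake_run_layout():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError(retry_after_seconds=1)
    return "layout done"


waits = []  # record the waits instead of actually sleeping
result = call_with_retry(fake_run_layout, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the example fast to run. In real use you would leave the default `time.sleep` in place.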
Fields
With all files uploaded and analyzed we can start defining the fields we want to extract. Press the ‘Add a field’ button to create your first field. (4)
There are 4 different types of fields.
- Field: A text field
- Selection mark: A true or false value based on a checkmark.
- Signature: An image from within a bounding box
- Table: A dynamic or fixed table
A Table can be customized by selecting the table field after creation on the right side of the screen. You can rename, insert, and delete a column or row by using the menu under the chevron.
Training
When you are happy with the fields you created, you can assign values to them. Analyzed documents show the detected text highlighted in yellow. Select the text and choose the field it belongs to. Do this for each field of each document.
When you want to train a signature or a selection mark field you can use the Draw region option and draw the bounding box in the correct location on the document.
Tables can be trained in 2 different ways.
- Manually: select the text that you want to store in a specific cell of your table, then select the cell you want to store it in.
- Automatically: if the layout analysis has detected a table in your document, you can use the table icon to open a wizard that helps you auto label your table.
When all fields of all the training documents are assigned, you can train your model by selecting the train button in the top right corner of the screen.
Be patient: training a model can take a little while.
You can see the status of your model by navigating to the Models tab from the menu on the left of the screen.
Testing
When the model is ready you can test it using the Test (1) option in the menu. Upload the file you want to use to test your model. (2) Do not use the same files you used to train the model. Select Run analysis (3) on the file. On the right side of the screen you will see the results of the analysis. (4)

On the first tab, Fields, you will see your fields and the values the model extracted. Each value comes with a percentage, which is the model's confidence that the extracted value is correct. In general, the more documents you use to train a model, the more accurate it will be.

On the Result tab you will see the same values in JSON format. The Code tab gives you examples of how to call the model using Python, C# or JavaScript.
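Once you have the JSON from the Result tab, you can use the confidence values to decide which fields need a manual review. The snippet below is a minimal sketch that assumes the result follows the usual `analyzeResult` → `documents` → `fields` layout. The sample data, the field names, and the 0.8 threshold are made up for illustration.

```python
import json

# Trimmed, made-up example of the JSON shown on the Result tab.
sample_result = json.loads("""
{
  "analyzeResult": {
    "documents": [
      {
        "docType": "my-custom-model",
        "fields": {
          "CustomerName": {"type": "string", "content": "Contoso", "confidence": 0.98},
          "InvoiceTotal": {"type": "string", "content": "1,250.00", "confidence": 0.62}
        }
      }
    ]
  }
}
""")


def low_confidence_fields(result, threshold=0.8):
    """Return (field name, content, confidence) for fields below the threshold."""
    flagged = []
    for document in result["analyzeResult"]["documents"]:
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            if confidence is not None and confidence < threshold:
                flagged.append((name, field.get("content"), confidence))
    return flagged


for name, content, confidence in low_confidence_fields(sample_result):
    print(f"Review {name}: {content!r} ({confidence:.0%})")
```

What counts as "low" depends on your documents and on how costly a wrong value is, so treat the threshold as something to tune.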
Composed models
You can combine multiple custom models behind one endpoint by creating a composed model. This can be helpful if you have versions of the same document that differ too much for a single model, or if you do not know in advance what type of document will be provided. The composed model will first classify the document to determine which model to use before analyzing the document.
To do this, you first need to train all your custom models individually. For easier maintenance, create a separate project per model. From the Models tab, select all the models you want the composed model to contain. Then press the Compose button and give your composed model a name and description.
When you test your composed model you can see what model was used to extract the data by looking at the DocType in the results pane.
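The same information is available in the JSON result, so you can read it in code as well. The snippet below is a minimal sketch that assumes each entry under `documents` carries a `docType` and a `confidence` value. The model names and numbers are made up for illustration.

```python
def used_models(result):
    """Return (docType, confidence) for each document in an analysis result."""
    return [
        (document.get("docType"), document.get("confidence"))
        for document in result["analyzeResult"]["documents"]
    ]


# Made-up result of a composed model, trimmed to the relevant parts.
composed_result = {
    "analyzeResult": {
        "modelId": "my-composed-model",
        "documents": [
            {"docType": "invoice-layout-b", "confidence": 0.93, "fields": {}}
        ],
    }
}

for doc_type, confidence in used_models(composed_result):
    print(f"Extracted with {doc_type} (confidence {confidence})")
```

You could use the `docType` value to route the extracted fields to different follow-up logic per document type.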