Filling Forms from Images in React Admin

Jonathan ARNAULT & Thiery Michel, February 21, 2025

#react #react-admin #ai

Filling forms can be tedious for users, especially when performing error-prone tasks such as copying physical documents. Fortunately, the recent shift from Large Language Models to Vision Language Models has enabled intelligent data extraction from images. This means we can leverage the advances in vision models to improve user productivity and reduce copying errors.

In this article, we'll introduce a new react-admin component called <FormFillerButton />, part of the ra-ai Enterprise Edition package. This component fills its parent form with data extracted from an image or the camera, and it works on any form in a react-admin application.

Using `<FormFillerButton />` To Fill Forms

A good example of using the <FormFillerButton /> is the contact creation page in a CRM. If the user has a business card for that contact, they can take a picture of it, and the <FormFillerButton /> will fill the form with the extracted data.

Use <FormFillerButton /> as a descendant of any react-admin form to render a form filler button.

import { Create, SimpleForm, TextInput } from 'react-admin';
import { FormFillerButton } from '@react-admin/ra-ai';

export const CreateContact = () => (
	<Create>
		<SimpleForm>
			<FormFillerButton />
			<TextInput source="firstName" helperText={false} />
			<TextInput source="lastName" helperText={false} />
			<TextInput source="company" helperText={false} />
			<TextInput source="email" helperText={false} />
		</SimpleForm>
	</Create>
);

That's it! It only requires a single line of code to enable AI-based filling on a form.

The <FormFillerButton /> abstracts away most of the complexities of form filling with vision models: camera recording, image capture, and prompt generation.

Vision Model Integration

<FormFillerButton /> calls the vision model via a new data provider method, dataProvider.generateContent(), passing the prompt for the form, the attachments, and the configuration for the model. This adapter approach allows you to use any model with vision capabilities, including, but not limited to:

OpenAI's GPT-4o and GPT-4o mini;
Anthropic's Claude 3.5 Sonnet and Claude 3 Opus;
Google's Gemini 2.0 Flash and Gemini 2.0 Flash-Lite;
Mistral AI's Pixtral 12B and Pixtral Large;
Meta's Llama 3.2 11B and 90B;
DeepSeek's VL.

The ra-ai package comes with an adapter for OpenAI, called addAIMethodsBasedOnOpenAIAPI:

import { addAIMethodsBasedOnOpenAIAPI } from '@react-admin/ra-ai';

const baseDataProvider = newDataProvider();

export const dataProvider = addAIMethodsBasedOnOpenAIAPI({
    dataProvider: baseDataProvider,
});

Check the Data provider setup in the ra-ai documentation for more details.

Improving the Prompt

The <FormFillerButton /> automatically detects the names and types of the form fields and uses them to craft a prompt for the vision model.

If your form uses exotic input names, or if you want to help the vision model to understand better how to find the correct information for each field, you can provide hints using the fields prop:

<FormFillerButton
    fields={{
        cmpny:
            `The company name does not include any suffixes.
			If not present, try to extract it from the email domain name.
			Example: Acme`,
        email:
            `User email. Must be a valid email address.
			If more than one email is present, find the one @acme.com`,
    }}
/>
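
To make the role of these hints more concrete, here is a minimal sketch of how field names, types, and hints could be combined into a prompt. It is only an illustration: buildPrompt and FieldDescription are hypothetical names that are not part of ra-ai, and the prompt actually generated by <FormFillerButton /> may look quite different.

```typescript
// Hypothetical description of a form field (not part of the ra-ai API).
interface FieldDescription {
    name: string;
    type: string;
    hint?: string;
}

// Build one prompt line per field, appending the hint when there is one.
const buildPrompt = (fields: FieldDescription[]): string => {
    const lines = fields.map(({ name, type, hint }) =>
        hint ? `- ${name} (${type}): ${hint}` : `- ${name} (${type})`
    );
    return [
        'Extract the following fields from the attached image.',
        'Answer with a JSON object using the field names as keys.',
        ...lines,
    ].join('\n');
};

// A field with an unusual name gets a hint, a self-explanatory one does not.
const examplePrompt = buildPrompt([
    { name: 'cmpny', type: 'text', hint: 'The company name, without suffixes' },
    { name: 'email', type: 'text' },
]);
console.log(examplePrompt);
```

The point of the sketch is that a hint travels with its field name, so the model sees the explanation right next to the key it has to fill.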

Using a Custom Prompt

Sometimes, the field hints may not fit your needs, because you require a strict format for your completions or want to write the prompt yourself. In that case, you can use a custom prompt and enforce a response schema using the zod library and the response_format helper of the OpenAI SDK, here with GPT-4o mini completions.

The example below demonstrates how to define your own generateContent method to customize completion with a specific prompt and an enforced schema:

import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

import { type GenerateContentParams } from '@react-admin/ra-ai';

export const openAiClient = new OpenAI({
	apiKey: "your api key",
});

const formSchema = z.object({
	firstName: z.string(),
	lastName: z.string(),
	company: z.string().nullable(),
	email: z.string().nullable(),
});

export const dataProvider = {
	// ... other data provider methods

	generateContent: async ({ attachments = [] }: GenerateContentParams = {}) => {
		// convertToBase64 is your own helper that turns each attachment into a base64 data URL
		const attachmentUrls = await Promise.all(
			attachments.map(file => convertToBase64(file))
		);

		const completion = await openAiClient.beta.chat.completions.parse({
			messages: [
				{ role: 'system', content: "<your system prompt>" },
				{
					role: 'user',
					content: [
						{ type: 'text', text: "<your prompt>" },
						...attachmentUrls.map(url => ({
							// "as const" keeps the literal type expected by the OpenAI SDK typings
							type: 'image_url' as const,
							image_url: { url },
						})),
					],
				},
			],
			response_format: zodResponseFormat(formSchema, "contact"),
			model: "gpt-4o-mini",
		});

		return { data: completion.choices[0]?.message?.content };
	}
};
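
The example above relies on a convertToBase64 helper that is not shown. Below is one possible implementation, written as a sketch: it assumes that each attachment is a File or Blob (the exact attachment type in GenerateContentParams is an assumption here), and bytesToDataUrl is a hypothetical helper name introduced for the occasion.

```typescript
// Encode raw bytes as a base64 data URL, e.g. "data:image/png;base64,...".
const bytesToDataUrl = (bytes: Uint8Array, mimeType: string): string => {
    let binary = '';
    bytes.forEach(byte => {
        binary += String.fromCharCode(byte);
    });
    return `data:${mimeType};base64,${btoa(binary)}`;
};

// Assumption: attachments are File or Blob instances.
const convertToBase64 = async (file: Blob): Promise<string> =>
    bytesToDataUrl(new Uint8Array(await file.arrayBuffer()), file.type);
```

A data URL can be passed as the url of an image_url content part, which is why the helper returns the full "data:" string rather than the bare base64 payload.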

Supported Sources

<FormFillerButton /> currently supports two input types:

image: performs recognition using a local file;
camera: relies on the user device camera to scan the document.

You may want to disable some of these sources depending on your use case. You can use the sources property to customize them. Here is an example restricting the sources to local images only:

<FormFillerButton sources={['image']} />

Perspectives and Limitations

The accuracy of the filled values depends on the performance of the vision model used by the generateContent method. The OpenAI adapter provided with ra-ai is based on the GPT-4o mini model, which offers good completion and vision capabilities overall. Yet it's not perfect: it can hallucinate or make mistakes. In practice, filled values are mostly accurate, but a human should always review the form before saving it.

The <FormFillerButton /> is designed to improve the productivity of your users by automating a time-consuming task: copying documents into a form. Its objective is not to replace your users. The filled values must be reviewed before being saved to avoid mistakes or hallucinations.

Although the basic usage of the <FormFillerButton /> is straightforward, the component is still highly configurable. As stated before, you can define the sources and field hints, but you can also decide to override existing values or specify the maximum size of an image that will be sent to the vision model. You can refer to the <FormFillerButton> documentation for all available options.

The vision model needs to access the image and the form layout to infer the data to be filled. This means that the data needs to be sent to the AI API. This is a security concern, especially if the data is sensitive.

Furthermore, the provided OpenAI adapter forces you to add an API token in the browser's local storage to perform completions. For public websites, this is a serious security flaw, as anyone can steal your token. For such apps, we recommend using a proxy server to call the completion API and keep the token on the server.
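
As an illustration of the proxy approach, here is a minimal sketch of a generateContent-like function that forwards the request to your own backend instead of calling the AI API from the browser. Everything in it is an assumption: the ProxyParams shape, the request and response payloads, and the endpoint URL are hypothetical, and you would adapt them to the real GenerateContentParams type and to your server.

```typescript
// Hypothetical shapes: adapt them to the actual GenerateContentParams type
// and to whatever your own proxy endpoint expects and returns.
interface ProxyParams {
    prompt?: string;
    imageDataUrls?: string[];
}
interface ProxyResponse {
    ok: boolean;
    status: number;
    json(): Promise<any>;
}
type FetchLike = (
    url: string,
    init: { method: string; headers: Record<string, string>; body: string }
) => Promise<ProxyResponse>;

// Returns a generateContent-like function that never sees the API key:
// the key stays on the server behind `endpoint`.
const createProxyGenerateContent =
    (endpoint: string, fetchFn: FetchLike) =>
    async ({ prompt: promptText = '', imageDataUrls = [] }: ProxyParams = {}) => {
        const response = await fetchFn(endpoint, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ prompt: promptText, images: imageDataUrls }),
        });
        if (!response.ok) {
            throw new Error(`Proxy request failed with status ${response.status}`);
        }
        const payload = await response.json();
        // Same { data } return shape as a direct generateContent implementation.
        return { data: payload.data as string };
    };
```

In the browser, you would typically pass the global fetch function (or a thin wrapper that adds your session credentials) as fetchFn. Injecting it also makes the function easy to test with a fake.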

Vision models can be expensive. However, this component only makes one completion request per scanned document. Based on our experience with the GPT-4o mini model, filling out a form costs less than a penny. For a form with 10 fields and high-resolution images, and according to the official OpenAI Vision documentation, we estimate that 350 to 400 forms can be filled for a dollar, but your mileage may vary.
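
To adapt this estimate to your own model and documents, the arithmetic can be sketched as follows. The token counts and prices used in the example are illustrative placeholders, not official figures: check the current pricing of your provider and measure the real token usage of your requests.

```typescript
// All numbers passed below are illustrative placeholders, not official prices.
interface Pricing {
    inputPerMillionTokens: number; // in dollars
    outputPerMillionTokens: number; // in dollars
}

// Cost in dollars of a single completion request (one scanned document).
const costPerForm = (inputTokens: number, outputTokens: number, pricing: Pricing): number =>
    (inputTokens * pricing.inputPerMillionTokens +
        outputTokens * pricing.outputPerMillionTokens) /
    1000000;

// Number of complete forms that can be filled with one dollar.
const formsPerDollar = (cost: number): number => Math.floor(1 / cost);

// Placeholder scenario: 17,000 input tokens (prompt + image), 200 output tokens.
const exampleCost = costPerForm(17000, 200, {
    inputPerMillionTokens: 0.15,
    outputPerMillionTokens: 0.6,
});
console.log(exampleCost.toFixed(5), formsPerDollar(exampleCost));
```

With these placeholder values, the cost per form stays well under a cent, and the image tokens dominate the bill, which is why the image size matters most.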

Note that you can downscale the input image, and therefore reduce the number of input tokens, using the maxDimensions prop of <FormFillerButton />.
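
For intuition, downscaling an image to fit maximum dimensions while keeping its aspect ratio boils down to the computation below. This is a generic sketch, not the component's actual implementation: fitWithin and Dimensions are hypothetical names, and the exact shape of the maxDimensions prop is described in the component documentation.

```typescript
interface Dimensions {
    width: number;
    height: number;
}

// Scale `size` down so it fits inside `max`, keeping the aspect ratio.
// Images that are already small enough are left untouched (scale capped at 1).
const fitWithin = (size: Dimensions, max: Dimensions): Dimensions => {
    const scale = Math.min(max.width / size.width, max.height / size.height, 1);
    return {
        width: Math.round(size.width * scale),
        height: Math.round(size.height * scale),
    };
};

// A 4000x3000 photo constrained to 1000x1000 becomes 1000x750.
console.log(fitWithin({ width: 4000, height: 3000 }, { width: 1000, height: 1000 }));
```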

When you deploy these new features, you should monitor the number of calls to the AI API and add a spending limit to your account to avoid surprises.

Conclusion

The <FormFillerButton /> is part of ra-ai, a react-admin Enterprise Edition package. This component is now available to all React Admin Enterprise Edition customers.

The advances in Vision Language Models enable new use cases for B2B apps and admins. A good starting point is this component that enhances the user experience by automating a time-consuming task: copying documents into a form. Give it a try, and let us know what you think!

We have other sources in mind to help you fill out your forms and increase your productivity, so stay tuned!
