Automating routine office tasks with AI doesn’t have to be complicated. Learn how we built a simple solution for AI invoice processing using Vertex AI and Apps Script.
A year ago, we conducted research with over 1,000 digital leaders and found that many businesses are eager to integrate artificial intelligence into their operations but often don’t know where to start. While generative AI is a powerful technology, that doesn’t mean its use is reserved only for groundbreaking innovation. If you want to find a use case for it, why not start small?
It’s precisely what we did at our Montenegrin office. By using existing tools and a little creative thinking, we successfully leveraged artificial intelligence to streamline a repetitive task every office encounters – invoice processing.
The invoice-processing challenge – a case for automation
Every office faces certain tedious, manual tasks that demand time and attention, and if you’re aiming to optimize processes through automation, these tasks are the ideal place to start.
Invoice processing is a perfect candidate. Most offices receive multiple invoices daily, and handling them can quickly become a distraction. We’d all rather focus on more creative work or learning something new.
Here’s how the process used to work in our Montenegrin office:
- Receive invoices in either paper or digital format
- If the invoice is printed, digitize it
- Rename the file so it’s easily searchable, including the sender’s name, invoice date, and amount
- Upload the invoice to Google Drive, organized by month and year
- Create a payment order
Performing these steps for every single invoice takes time, and to make things even more complicated, invoices come in many shapes and sizes. Different companies use different invoice formats, and the relevant information is often found in completely different places. That’s why AI invoice processing offers a great solution.
To tackle this challenge, we developed a simple script using Vertex AI and Apps Script. Now, we simply forward any invoice we receive to a designated email address, and the process is automated from there.
AI invoice processing with Vertex AI – no need to reinvent the wheel
First of all, you might wonder if Optical Character Recognition (OCR) would be enough for invoice processing automation. OCR is a technology that “reads” scanned paper documents or images taken by a digital camera, converting them into editable and searchable data. However, while OCR can extract the text from a file, it can’t tell which piece of text is the vendor name and which is the invoice amount.
This is where natural language processing comes in. We can pass the data extracted by OCR to a Large Language Model (LLM) and prompt it to find the information we need. You may think that this would require a specially trained model, but unless you receive thousands of invoices daily, there’s a fast and inexpensive out-of-the-box solution that does the trick.
The perfect fit for our use case was Vertex AI – Google Cloud’s unified platform that simplifies access to powerful LLMs, OCR capabilities, and multimodal AI solutions. In combination with Apps Script, it offers a powerful way to automate tasks.
Note: We’ve chosen these tools because we use Google Workspace, but the same principle could easily be applied to a different ecosystem. For example, you could go with Microsoft Power Automate and OpenAI API if you use Microsoft 365.
Creating the script
The first step was to create an Apps Script project with a trigger that runs every hour. If you would like to track your script in Git, we recommend using clasp (Command Line Apps Script Projects), a tool that helps you work on Apps Script projects locally.
For this purpose, we created a special email list, montenegro.invoices.import@infinum.com. All incoming invoices should be sent as attachments to this list. The hourly script we’ve created will search for each email thread sent to it. For increased security, we can also limit our search only to emails sent from specific email addresses. That way, we’ll only get invoices from senders we know and trust. In this example, we’ll use the address approved.sender@infinum.com.
function forwardInvoices() {
const threads = GmailApp.search('from:approved.sender@infinum.com to:montenegro.invoices.import@infinum.com -label:done-✅')
…
}
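If you trust more than one sender, the search string can be assembled from a list instead of being hard-coded. The following is a minimal sketch, not part of our original script: buildInvoiceQuery is a hypothetical helper name, and it relies only on Gmail's standard search operators (parentheses for grouping, OR, and a leading minus to exclude a label).

```javascript
// Hypothetical helper: builds the Gmail search string for the hourly run.
// The addresses and the label name below are the ones used in this article.
function buildInvoiceQuery(approvedSenders, listAddress, doneLabel) {
  if (!approvedSenders.length) {
    throw new Error('At least one approved sender is required');
  }
  // Parentheses group the alternatives: (from:a OR from:b)
  const from = approvedSenders.map((sender) => `from:${sender}`).join(' OR ');
  // A leading minus excludes threads that already carry the "done" label.
  return `(${from}) to:${listAddress} -label:${doneLabel}`;
}

const query = buildInvoiceQuery(
  ['approved.sender@infinum.com'],
  'montenegro.invoices.import@infinum.com',
  'done-✅'
);
// In Apps Script, this string would be passed on: GmailApp.search(query)
```

Keeping the senders in an array makes it easier to add or remove a trusted address without touching the query syntax.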
We should then check for any messages in the thread:
for (var i = 0; i < threads.length; i++) {
var thread = threads[i];
if (thread.getMessages().length == 0) {
Logger.log("No messages in thread")
continue
}
//there is a message
}
In the next step, we check if the message contains an attachment.
var attachments = message.getAttachments()
if (attachments.length == 0) {
Logger.log("There is no attachment ")
replyError(`Please send an email with an attachment`, thread)
} else {
Logger.log("Wohoo! There is an attachment")
..
}
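The check above accepts any attachment. If you only want to pass PDFs on for analysis, a small filter can help. This is a sketch under that assumption: pickPdfAttachments is a hypothetical helper, and it only uses getContentType() and getName(), which Gmail attachment objects expose.

```javascript
// Hypothetical helper: keeps only the attachments that look like PDFs.
// It works on any object exposing getContentType() and getName().
function pickPdfAttachments(attachments) {
  return attachments.filter((attachment) => {
    const type = (attachment.getContentType() || '').toLowerCase();
    const name = (attachment.getName() || '').toLowerCase();
    // Some mail clients send PDFs with a generic content type,
    // so the file extension is used as a fallback.
    return type === 'application/pdf' || name.endsWith('.pdf');
  });
}
```

In the script, you could call pickPdfAttachments(message.getAttachments()) and reply with an error when the result is empty.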
We save the script and run it. If there are emails matching the conditions, the logs will show the message “Wohoo! There is an attachment.”
When the script finds an invoice in the attachment, it sends it to Vertex AI through the API, so we can use a prompt to analyze it.
Using the Vertex AI API
First, you will need to visit Google Cloud Console and create a new project where you will enable the Vertex AI API and Apps Script API. Once the project is created, go to Vertex → Vertex AI → Vertex AI Studio → Multimodal.
The good thing about Vertex AI is that it has a user-friendly interface and provides a great environment for testing your prompts on test files. You can also choose from different models – in our case, we used gemini-1.5-flash-001. Once you are happy with the results, you can click Get code and get an example request that replicates what you did in the UI. Examples are available for Python, Node.js, Java, and cURL. There’s no option for Apps Script, but the Node.js example will be helpful in our scenario.
The URL for the Vertex AI request should look something like this:
const API_ENDPOINT = "us-central1-aiplatform.googleapis.com"
const PROJECT_ID = "someProjectID"
const LOCATION_ID = "us-central1"
const MODEL_ID = "gemini-1.5-flash-001"
var url = `https://${API_ENDPOINT}/v1/projects/${PROJECT_ID}/locations/${LOCATION_ID}/publishers/google/models/${MODEL_ID}:generateContent`;
Here, PROJECT_ID is the ID of your Google Cloud Console project.
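If you expect to switch regions or models, the URL can be built by a small function. This is a sketch, not the code from our script: buildVertexUrl is a hypothetical name, and deriving the host from the location is an assumption based on the regional endpoint pattern visible in the values above.

```javascript
// Hypothetical helper: assembles the generateContent URL from its parts.
function buildVertexUrl({ projectId, locationId, modelId }) {
  const settings = { projectId, locationId, modelId };
  for (const [key, value] of Object.entries(settings)) {
    if (!value) {
      throw new Error(`Missing Vertex AI setting: ${key}`);
    }
  }
  // Assumption: the regional host is <location>-aiplatform.googleapis.com,
  // matching the us-central1 example in this article.
  const host = `${locationId}-aiplatform.googleapis.com`;
  return (
    `https://${host}/v1/projects/${projectId}/locations/${locationId}` +
    `/publishers/google/models/${modelId}:generateContent`
  );
}

const vertexUrl = buildVertexUrl({
  projectId: 'someProjectID',
  locationId: 'us-central1',
  modelId: 'gemini-1.5-flash-001',
});
```

Failing early on a missing setting is easier to debug than a 404 from a malformed URL.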
To be able to pass the file to Vertex AI for analysis, we will need to convert our PDF file to Base64.
// Read the file from attachments
var blob = attachments[0].copyBlob().getAs(MimeType.PDF);
var pdfBase64 = Utilities.base64Encode(blob.getBytes());
Then, we should create an API request:
var jsonRequestBody = getGeminiRequest(pdfBase64)
…
function getGeminiRequest(pdfBase64) {
return {
contents: [
{
role: "user",
parts: [
{
inlineData: {
mimeType: "application/pdf",
data: pdfBase64
}
},
{
text: `You are a document entity extraction specialist. Given a document, your task is to extract the text value of the following entities:
COMPANY_NAME is the name of the company issuing the invoice, and it should not be Infinum d.o.o.
YYYY is the year of the invoice transaction in the format YYYY
MM is the month of the invoice transaction in the format MM
DATE_ISSUED is the date of the invoice issuance in the format DD.MM.YYYY (German format)
TOTAL_AMOUNT_TO_PAY is in the format used in Germany without currency symbol or sign (examples "0.000,00" or "00,00", or "0,00" ) and it should never be in format "00.00"
- The values must only include the text found in the document, but you can format it
- Do not normalize any entity value
- If an entity is not found in the document, set the entity value to null.
- Give just one answer and format it: COMPANY_NAME-YYYY-MM-DATE_ISSUED-TOTAL_AMOUNT_TO_PAY`
}
]
}
],
generationConfig: {
maxOutputTokens: 8192,
temperature: 1,
topP: 0.95
}
};
}
This API request JSON object contains two important things: the file we want to analyze and the prompt that helps the AI model return the right result.
How you word the prompt matters a lot here. To extract the information you need regardless of the invoice format, the prompt must be clear and concise: tell the AI model what role it should play, and be precise about the information you expect and the format it should come in.
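One way to keep the prompt precise as it grows is to store the entity definitions as data and generate the prompt text from them, so the list of entities and the answer format cannot drift apart. This is only a sketch: ENTITIES and buildExtractionPrompt are hypothetical names, and the rules shown are shortened versions of the ones in our prompt.

```javascript
// Hypothetical structure: each entity is a name plus a plain-language rule.
const ENTITIES = [
  { name: 'COMPANY_NAME', rule: 'is the name of the company issuing the invoice' },
  { name: 'DATE_ISSUED', rule: 'is the date of the invoice issuance in the format DD.MM.YYYY' },
  { name: 'TOTAL_AMOUNT_TO_PAY', rule: 'is the total in the German number format without a currency symbol' },
];

// Builds the prompt text; the answer format is derived from the same list.
function buildExtractionPrompt(entities) {
  const definitions = entities.map((entity) => `${entity.name} ${entity.rule}`);
  const answerFormat = entities.map((entity) => entity.name).join('-');
  return [
    'You are a document entity extraction specialist. Given a document, ' +
      'your task is to extract the text value of the following entities:',
    ...definitions,
    '- If an entity is not found in the document, set the entity value to null.',
    `- Give just one answer and format it: ${answerFormat}`,
  ].join('\n');
}

const prompt = buildExtractionPrompt(ENTITIES);
```

The resulting string would go into the text part of the request body.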
The other part of the request is the options object:
var options = getGeminiOptions(jsonRequestBody)
function getGeminiOptions(jsonRequestBody) {
return {
method: 'post',
contentType: 'application/json',
payload: JSON.stringify(jsonRequestBody),
headers: {
Authorization: 'Bearer ' + ScriptApp.getOAuthToken()
},
};
}
As you can see from the code, we authenticate with the OAuth token returned by ScriptApp.getOAuthToken(), so only authorized users can access the Vertex AI API.
Vertex AI API response
Let’s send the request:
var response = UrlFetchApp.fetch(url, options);
var responseData = JSON.parse(response.getContentText());
Logger.log(responseData)
var combinedText = getInvoiceNameFromResponse(responseData)
…
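By default, UrlFetchApp.fetch throws when the server answers with an error status. If you would rather inspect the status yourself, the sketch below shows one option. It assumes you add muteHttpExceptions: true to the request options; parseVertexResponse is a hypothetical helper that only relies on getResponseCode() and getContentText().

```javascript
// Hypothetical helper: checks the HTTP status before parsing the body.
// Assumption: the request was sent with muteHttpExceptions: true, so error
// responses are returned instead of thrown.
function parseVertexResponse(response) {
  const status = response.getResponseCode();
  const body = response.getContentText();
  if (status < 200 || status >= 300) {
    // Only the first 200 characters are kept; error bodies can be long.
    throw new Error(`Vertex AI request failed with HTTP ${status}: ${body.slice(0, 200)}`);
  }
  return JSON.parse(body);
}
```

In the script, this would take the place of the plain JSON.parse call, for example parseVertexResponse(UrlFetchApp.fetch(url, options)).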
The response nests the generated text inside candidates and parts, so we walk through them and return the first text part we find:
function getInvoiceNameFromResponse(responseData) {
if (!responseData.candidates || !Array.isArray(responseData.candidates)) {
throw new Error('Invalid response structure: candidates is missing or not an array');
}
for (let candidate of responseData.candidates) {
if (candidate.content && Array.isArray(candidate.content.parts)) {
for (let part of candidate.content.parts) {
if (part.text) {
return part.text.trim();
}
}
}
}
throw new Error('No text found in the response');
}
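The returned string follows the COMPANY_NAME-YYYY-MM-DATE_ISSUED-TOTAL_AMOUNT_TO_PAY format from the prompt. If you need the individual values, for instance to pick a Drive folder, the string can be split again. This is a sketch with a hypothetical parseInvoiceName helper; it assumes the last four fields never contain a hyphen, which holds for the formats requested in the prompt.

```javascript
// Hypothetical helper: splits the model's answer back into named fields.
function parseInvoiceName(name) {
  const parts = name.trim().split('-');
  if (parts.length < 5) {
    throw new Error(`Expected 5 hyphen-separated fields, got ${parts.length}`);
  }
  // The last four fields have fixed formats without hyphens, so everything
  // before them belongs to the company name, which may itself contain hyphens.
  const [year, month, dateIssued, totalAmount] = parts.slice(-4);
  const companyName = parts.slice(0, -4).join('-');
  return { companyName, year, month, dateIssued, totalAmount };
}

const fields = parseInvoiceName('Acme-Trade d.o.o.-2024-05-17.05.2024-1.234,56');
// fields.companyName is "Acme-Trade d.o.o." and fields.totalAmount is "1.234,56"
```

Splitting from the right keeps hyphenated company names intact.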
Storing the file to Google Drive
With the procedure above, we get the name of the file, and we can save it on Google Drive under the same name.
blob.setName(combinedText)
var googleDriveFolder = DriveApp.getFolderById(DIR_ID_INVOICES)
var file = googleDriveFolder.createFile(blob)
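To keep invoices organized by month and year, as in our manual process, the target folder can be looked up or created on the fly. The sketch below assumes a root/YYYY/MM layout, which may differ from yours. getOrCreateSubfolder and getInvoiceFolder are hypothetical helpers built on the Drive folder methods getFoldersByName() and createFolder().

```javascript
// Hypothetical helper: returns the child folder with the given name,
// creating it when it does not exist yet.
function getOrCreateSubfolder(parent, name) {
  const matches = parent.getFoldersByName(name);
  return matches.hasNext() ? matches.next() : parent.createFolder(name);
}

// Assumed layout: <root>/<YYYY>/<MM>. Adjust to your own Drive structure.
function getInvoiceFolder(root, year, month) {
  return getOrCreateSubfolder(getOrCreateSubfolder(root, year), month);
}
```

In Apps Script, root would be the folder returned by DriveApp.getFolderById, and the file would be created in the folder that getInvoiceFolder returns.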
Letting people know about the result
You can use the script to automatically reply to the email thread:
thread.replyAll(``, {
name: 'Mladen Rakonjac',
htmlBody: `AUTOMATIC RESPONSE: Invoice <b> ${copySubject} </b> is processed successfully.`
})
Finally, you’ll want to mark the emails that have been processed:
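const LABEL_DONE = "done-✅"
// getUserLabelByName returns null if the label doesn't exist yet, so create it on first use
const doneLabel = GmailApp.getUserLabelByName(LABEL_DONE) || GmailApp.createLabel(LABEL_DONE)
thread.addLabel(doneLabel)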
The full script is available on GitHub.
Setting the trigger
You can easily set the interval for your script by configuring the trigger. After saving the script, go to the left-side menu in Apps Script, where you’ll find Triggers. Click the blue Add trigger button, and choose the function you want to run by selecting forwardInvoices. Here, you set your time-based trigger, for example, for running the script every hour. Once you save the trigger, you’re all set.
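The same trigger can also be created from code, which is handy if you deploy the script with clasp. This is a sketch rather than part of our script: installHourlyTrigger is a hypothetical helper, and the optional second parameter exists only so the function can be exercised outside Apps Script.

```javascript
// Hypothetical helper: installs an hourly trigger unless one already exists.
// scriptApp defaults to the Apps Script global; pass a stand-in when testing.
function installHourlyTrigger(functionName, scriptApp = ScriptApp) {
  const alreadyInstalled = scriptApp
    .getProjectTriggers()
    .some((trigger) => trigger.getHandlerFunction() === functionName);
  if (alreadyInstalled) {
    return false; // Nothing to do, which avoids duplicate runs every hour.
  }
  scriptApp.newTrigger(functionName).timeBased().everyHours(1).create();
  return true;
}
```

Running installHourlyTrigger('forwardInvoices') once from the Apps Script editor would have the same effect as the manual steps above.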
AI invoice processing in action
Let’s demonstrate the process in a real-world example.
This is an invoice we received:
This is the email containing the invoice sent by our colleague in HR:
When the script runs based on the trigger we set up earlier, it finds and processes the invoice, and we get an automatic response confirming the action.
This way, the person who forwarded the email to the invoices email list knows that the invoice has been processed successfully.
In addition, we also used the script to forward the invoice to our accountant:
As you can see, the email subject contains the invoice details that the AI model extracted for us. So, if we want to search for an invoice in the future, we can do so easily through Gmail or Google Drive.
And that’s it! All the boring work is done automatically.
So far, the process has worked well for us. There is one condition, though: the PDF files must be of good quality. Sometimes, our AI invoice processing script makes some mistakes, but so do humans. If that happens, we handle the invoice manually. In any case, the success rate is more than satisfactory, and we spend far less time thinking about processing invoices.
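Some of these mistakes can be caught automatically before the file is saved. The sketch below checks extracted values against the formats requested in the prompt and lists the fields that look wrong, so the invoice can be routed to manual handling. It is an illustration, not part of our script: findInvoiceIssues and the shape of its input object are assumptions.

```javascript
// Date in the DD.MM.YYYY format requested in the prompt.
const DATE_PATTERN = /^(0[1-9]|[12]\d|3[01])\.(0[1-9]|1[0-2])\.\d{4}$/;
// German-style amount: a decimal comma with two decimals, and optional
// dots as thousands separators (for example 1.234,56 or 12,00).
const AMOUNT_PATTERN = /^(\d{1,3}(\.\d{3})*|\d+),\d{2}$/;

// Hypothetical helper: returns the names of the fields that need a human look.
function findInvoiceIssues({ companyName, dateIssued, totalAmount }) {
  const issues = [];
  if (!companyName || companyName === 'null') issues.push('companyName');
  if (!DATE_PATTERN.test(dateIssued || '')) issues.push('dateIssued');
  if (!AMOUNT_PATTERN.test(totalAmount || '')) issues.push('totalAmount');
  return issues;
}
```

An empty result means the values match the expected formats. It does not prove they are correct, so occasional spot checks are still worthwhile.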
Want to do more? Use your creativity for advanced AI invoice processing
The process described above is just an example of an everyday office task you can easily automate to relieve your colleagues of repetitive, tedious work. You can also build on it and customize the script to match your needs. For example, you can get it to recognize if the invoice is sent by a colleague who needs a travel expenses refund and store the information in a Google Sheet. With a little creativity, the possibilities are endless.
Most importantly, this is a low-effort solution that requires very little development time yet makes your life a lot easier. Not all AI solutions have to be revolutionary. If we’ve freed up our time to focus on more strategic tasks, it’s an AI-powered win.
If you want to leverage AI to optimize processes within your organization, find out how we can help.