This rantir workflow parses bank statement PDFs with a multimodal LLM instead of traditional OCR. The approach tends to extract data more accurately, especially from tables and complex layouts.
Advantages of Multimodal Parsing over Traditional OCR
- Reduces complexity and overhead because no text pre-processing is needed before the document is sent to the LLM.
- Handles non-standard PDF formats that may produce errors with traditional OCR.
- Can be more cost-effective than premium OCR models, whose output often requires further cleanup.
How it Works
- The bank statement PDF is imported from Google Drive. This example uses a mock statement with complex 5-column tables that OCR struggles with.
- Because the LLM step in this workflow takes images rather than PDFs as input, the PDF is first converted to images using Stirling PDF. This tool is self-hostable, which helps keep sensitive data private.
- Stirling PDF returns the PDF as a series of JPGs (one per page) in a zip file. The rantir workflow unzips the archive and sorts the images into page order.
- Each image is resized with the Edit Image node to balance resolution against processing speed.
- The resized images are passed to the Basic LLM node, which uses the multimodal LLM (e.g., Gemini 1.5 Pro). A "user message" of binary type is added as input to process each image.
- The prompt instructs the LLM to transcribe each page to markdown for clarity. Alternatively, you can prompt for specific data points directly.
- The markdown version of each page can then be analyzed by another LLM node to extract data, such as deposit line items.
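As a rough illustration of the unzip-and-sort step described above, the following Python sketch reads page images from a zip archive and returns them in page order. It is not the workflow's actual node configuration. The function names `natural_key` and `extract_pages` are hypothetical, and the sketch assumes the archive contains JPG files whose names include the page number.

```python
import io
import re
import zipfile


def natural_key(name: str):
    """Split a filename into text and integer chunks so 'page-10' sorts after 'page-2'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def extract_pages(zip_bytes: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, image bytes) pairs from a zip archive, sorted by page number."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Keep only JPG entries; anything else in the archive is ignored.
        names = [n for n in archive.namelist()
                 if n.lower().endswith((".jpg", ".jpeg"))]
        return [(n, archive.read(n)) for n in sorted(names, key=natural_key)]
```

A plain alphabetical sort would place page 10 before page 2, which is why the sort key compares the numeric parts of each filename as integers.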
Requirements
- Google Gemini API for multimodal LLM processing.
- Google Drive for document storage.
- Stirling PDF for PDF-to-image conversion.
Customizing the Workflow
- Gemini 1.5 Pro works well for parsing text documents, but other multimodal LLMs such as OpenAI GPT or Anthropic Claude can also be used.
- For faster results, skip markdown formatting and directly request data extraction from the LLM.
- This template is versatile and can be adapted for invoices, inventory lists, contracts, legal documents, and more.
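The direct-extraction option mentioned above could look something like the following Python sketch: a prompt that asks for deposits as JSON, plus a helper that validates the model's reply. The prompt wording, the key names (`date`, `description`, `amount`), and the function `parse_deposits` are all assumptions made for illustration, not part of the template.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker

# Hypothetical prompt; the key names are assumptions for this example.
DEPOSIT_PROMPT = (
    "Extract every deposit line item from this bank statement page. "
    "Reply with only a JSON array of objects with the keys "
    '"date", "description" and "amount".'
)

REQUIRED_KEYS = {"date", "description", "amount"}


def parse_deposits(reply: str) -> list[dict]:
    """Parse and validate a model reply that should hold a JSON array of deposits."""
    lines = reply.strip().splitlines()
    # Some models wrap JSON in a markdown code fence; drop the fence lines if present.
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    try:
        items = json.loads("\n".join(lines))
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of deposits")
    deposits = []
    for item in items:
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            raise ValueError(f"deposit is missing required keys: {item!r}")
        # Accept amounts such as "1,250.00" as well as plain numbers.
        amount = float(str(item["amount"]).replace(",", ""))
        deposits.append({"date": item["date"],
                         "description": item["description"],
                         "amount": amount})
    return deposits
```

Validating the reply before using it guards against malformed output, which matters more when the markdown transcription step is skipped and there is no intermediate text to inspect.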