
The Holy Grail of AI Model Orchestration

Daan Vermunt
Aug 2024

In our previous blog, we explored how small AI models can outperform larger ones in terms of accuracy and control. Today, we’ll dive deeper into how these smaller models can be orchestrated into a full AI solution, powering document processing infrastructure and model pipelines.

The industry is increasingly shifting away from relying on one large, monolithic AI model to do everything. Instead, companies are turning to smaller, specialized models designed for specific tasks. The future of AI lies in the collaboration of these models—each performing a distinct role, yet working together to accomplish a larger goal. As this trend grows, so too will the need for a robust system to define, manage, and orchestrate these models efficiently.

This vision aligns with current AI development trends, where text-based interactions are increasingly managed by specialized models. Rather than using a single large model to generate a response, companies are parsing messages to identify the nature of the task—whether it’s mathematical, linguistic, or otherwise. For instance, when a message is identified as a mathematical query, it's routed to a specialized mathematical model designed for high accuracy in such tasks. The result is then processed and communicated back through a language model.
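
As a rough illustration of this routing pattern, the sketch below classifies a message and dispatches it to a specialized handler. The `classify_task`, `handle_math`, and `handle_language` names are hypothetical stand-ins for real models, not references to any specific product or API.

```python
# Minimal sketch of task-type routing (hypothetical handlers, not a real API).
import re

def classify_task(message: str) -> str:
    """Very naive classifier: treat messages containing an arithmetic expression as math."""
    return "math" if re.search(r"\d+\s*[-+*/^=]\s*\d+", message) else "language"

def handle_math(message: str) -> str:
    # Placeholder for a call to a specialized mathematical model.
    return f"[math model] solving: {message}"

def handle_language(message: str) -> str:
    # Placeholder for a call to a general language model.
    return f"[language model] answering: {message}"

ROUTES = {"math": handle_math, "language": handle_language}

def respond(message: str) -> str:
    task = classify_task(message)
    return ROUTES[task](message)

print(respond("What is 12 * 7?"))           # routed to the math handler
print(respond("Summarize this contract."))  # routed to the language handler
```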

But what models should we use? How do they work together, and what tasks can they perform? Let’s explore.

The Instruments of Orchestration

To create a robust AI-driven document processing pipeline, we need to understand the types of models and the steps involved. This is called a pipeline because of the sequential steps documents undergo, with the ultimate goal being to keep the documents flowing smoothly and efficiently through each stage.

Types of Steps

  • Transformers: These models modify the properties of a document. They are crucial for tasks such as extracting relevant information and classifying data, and they can also handle visual adjustments like contrast enhancement or image rotation.
  • Splitters: Since models always operate on a specific context, e.g. a single receipt or a single mathematical formula, splitters are used to narrow the input down to that unit of work. They take a single input and produce one or more outputs, making them crucial for separating data into chunks of relevant context.
  • Routers: Routers introduce flexibility by directing documents to different parts of the pipeline. They are vital for dynamically sending each document to the steps that are relevant to it.

By combining these models in a way that suits your process, you can create a tailor-made solution for your document processing needs. These models serve as the building blocks for your infrastructure.
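
To make these building blocks concrete, here is a minimal sketch of how the three step types might look in code, assuming a simple `Document` container; it illustrates the pattern only and is not Send AI's actual interface.

```python
# Illustrative sketch of the three step types (not Send AI's actual interfaces).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Document:
    pages: List[str]
    data: dict = field(default_factory=dict)

# Transformer: Document -> Document (modifies properties or attaches data)
def extract_total(doc: Document) -> Document:
    doc.data["total"] = next((p for p in doc.pages if "Total" in p), None)
    return doc

# Splitter: Document -> list of Documents (moves to the unit of work, e.g. one page)
def split_pages(doc: Document) -> List[Document]:
    return [Document(pages=[page]) for page in doc.pages]

# Router: Document -> route name (directs the document to the next part of the pipeline)
def route_by_keyword(doc: Document) -> str:
    return "invoices" if any("Invoice" in p for p in doc.pages) else "other"

doc = Document(pages=["Invoice 001", "Total: 128.50"])
parts = split_pages(doc)          # splitter: one document per page
route = route_by_keyword(doc)     # router: "invoices"
enriched = extract_total(doc)     # transformer: attaches extracted data
```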

Example Steps in the Pipeline:

Transformers

  • Entity Extraction
  • Contrast Enhancement
  • RAG-Based Entity Extraction
  • Document Summarization
  • Text Vectorization
  • Multi-Modal Entity Extraction
  • Sentiment Analysis
  • Document Anonymization
  • Document Redaction

Splitters

  • Model-Based Segmentation
  • Page Splitting
  • Page Grouping

Routers

  • Classification-Based Routing
  • Rule-Based Routing
  • Vision-Based Classification
  • Sentiment-Based Routing
  • Metadata-Based Filtering

Documents move through these steps, each one contributing to the final output. However, what happens after each step? The models can produce different outcomes or 'side effects.'

Side Effects

  • Data Extraction: As documents flow through the pipeline, you can attach packages of extracted data to them.
  • External Effects: The pipeline can interact with external systems, such as making API calls to retrieve additional data or trigger external actions.
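
As a rough sketch of these side effects, the example below attaches a package of extracted data to a document and defines an outbound API call; the document shape and the ERP endpoint are hypothetical assumptions made for illustration.

```python
# Sketch of side effects: attaching extracted data and triggering an external call.
import json
import urllib.request

def attach_extraction(doc: dict, extracted: dict) -> dict:
    # Side effect 1: package extracted data and attach it to the document.
    doc.setdefault("extractions", []).append(extracted)
    return doc

def notify_erp(doc: dict, url: str = "https://erp.example.com/documents") -> None:
    # Side effect 2: interact with an external system via an API call.
    # The endpoint is fictional, so calling this would fail outside the sketch.
    payload = json.dumps(doc).encode("utf-8")
    request = urllib.request.Request(url, data=payload,
                                     headers={"Content-Type": "application/json"})
    urllib.request.urlopen(request)

doc = attach_extraction({"id": "doc-42"},
                        {"invoice_total": "128.50", "currency": "EUR"})
```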

Managing the Complexity

The complexity of these pipelines can be managed by following the principles of UNIX, which emphasize simplicity, modularity, and interoperability.

Six Key Principles:

  1. Do One Thing and Do It Well: Each block (or model) focuses on a single task and optimizes for it.
  2. Write Programs That Work Together: By defining strict rules on what each step can do, you ensure that the order of steps doesn’t matter, enabling seamless collaboration between steps.
  3. Use a Unified Document Model for Data Interchange: Employing a single model (format/schema) for passing data between steps allows for easy chaining of tasks (see the sketch after this list).
  4. Design for Flexibility and Extensibility: By setting up the system as a pipeline of steps, it becomes flexible and easy to extend with new functionalities in the form of additional steps.
  5. Avoid Unnecessary Complexity: Each step’s model is straightforward, with complexity abstracted away into the blocks themselves. While the block might be complex, connecting them is simple.
  6. Strive for Model Interoperability: It should be easy to upgrade models, change providers, or switch clouds within a single step without disrupting the pipeline.
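
To illustrate principles 2 and 3, here is a minimal sketch of steps chained through one shared document schema; the schema keys and step names are assumptions made for the example, not a fixed specification.

```python
# Sketch of chaining steps through one shared document schema (a plain dict here).
from typing import Callable, List

DocumentModel = dict  # e.g. {"pages": [...], "metadata": {...}, "extractions": {...}}
Step = Callable[[DocumentModel], DocumentModel]

def enhance_contrast(doc: DocumentModel) -> DocumentModel:
    doc.setdefault("metadata", {})["contrast_enhanced"] = True
    return doc

def extract_entities(doc: DocumentModel) -> DocumentModel:
    # Stand-in for a model call; any extraction step writes to the same place.
    doc.setdefault("extractions", {})["sender"] = "ACME B.V."
    return doc

def run_pipeline(doc: DocumentModel, steps: List[Step]) -> DocumentModel:
    for step in steps:
        doc = step(doc)  # every step consumes and produces the same document model
    return doc

print(run_pipeline({"pages": ["Invoice from ACME B.V."]},
                   [enhance_contrast, extract_entities]))
```

Because every step speaks the same document model, steps can be reordered, added, or swapped out without touching the rest of the pipeline.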

By taking all six UNIX principles into account in our approach, processing highly complex, high-volume document streams becomes possible: every step is an expert at its own task, and every expert can always work together as part of a larger solution. The result is a flexible yet robust workflow, tailored to your business operation and able to assure the accuracy you require.

The Conductor: Orchestrating the Models

Understanding the building blocks of a document processing pipeline is just the beginning. The true power lies in effectively orchestrating these models. While current model orchestration is often managed manually, the future points toward AI-driven orchestration, where specialized models work together to accomplish complex tasks with greater efficiency.

At Send AI, we’ve developed a robust document processing infrastructure that empowers customers to orchestrate their own customized pipelines. By creating a network of steps and allowing users to control them, we ensure that models collaborate seamlessly while users retain ultimate control.
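
As a hypothetical illustration of such a network of steps (not Send AI's actual configuration format), a pipeline definition might be expressed declaratively along these lines; the step and model names are placeholders.

```python
# Hypothetical declarative pipeline definition: a network of named steps with routing.
PIPELINE = {
    "start": "split_pages",
    "steps": {
        "split_pages":     {"type": "splitter",    "next": "classify"},
        "classify":        {"type": "router",      "routes": {"invoice": "extract_invoice",
                                                              "receipt": "extract_receipt"}},
        "extract_invoice": {"type": "transformer", "model": "invoice-extractor-v2", "next": None},
        "extract_receipt": {"type": "transformer", "model": "receipt-extractor-v1", "next": None},
    },
}
```

A lightweight runner could walk this graph, executing each step and following the router's decisions, so users adjust the definition rather than the underlying models.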

I envision a future where AI systems are composed of a network of specialized models, each responsible for a specific task. As AI evolves, the emphasis will shift from relying on a single, all-encompassing model to a more modular approach. This approach not only enhances accuracy but will ultimately also simplify the process of achieving the desired outcome.

As this trend continues, I am confident that there will be a growing need for a system that can define and orchestrate these tasks, ensuring that all models work in harmony. The future of AI will not be ‘the bigger the better’ in terms of models. The future of AI lies in the seamless integration and orchestration of these specialized models, with a system in place to manage and coordinate them effectively. This shift will not only improve efficiency but also pave the way for more sophisticated and reliable AI-driven solutions.

Orchestration is the key—and the future of orchestration will be AI.

Ready to start automating your Document Processing flow?

At Send AI, we empower you to fine-tune your own language models. Are you eager to start speeding up your document processing flow while keeping error rates low?