Internal AI Deployment for Seamless Content Translation: A Real-Life Project Story

In today’s multilingual work environments, fast and reliable localization is essential. At SaM Solutions, we recently tackled the challenge of translating a large volume of internal content into English by deploying an AI-driven solution. This case study details how our team leveraged a locally hosted large language model (LLM) to automate translation directly within our content management system (CMS), achieving efficiency, cost savings, and full data control.

The Business Need

As SaM Solutions continues to grow internationally, the need for multilingual communication across departments has become more pressing. Our internal portal, powered by Umbraco CMS, serves as the central hub for news, articles, and corporate updates. With over a thousand pages of content in different languages (English, German, Polish, Lithuanian, etc.), we faced the operational challenge of ensuring this material would be accessible to all employees, regardless of the source or target language. This required a scalable solution for cross-translation between all corporate languages.

Manual translation was evaluated but quickly ruled out due to the volume of material and time constraints. We needed an automated solution that could be integrated directly into our corporate portal, preserve data privacy, and ensure quality outputs with minimal manual intervention.

Why We Chose a Locally Deployed LLM

Cloud-based translation services were not considered due to concerns over data confidentiality and ongoing subscription costs. Instead, we opted for a self-hosted LLM deployment. Here are some key benefits of this approach.

Data security

All processing occurs on internal infrastructure with no third-party exposure, preventing critical data leakage.

Lower long-term costs

After the initial on-premises setup, there are no recurring licensing fees or API subscription payments: once deployed, the model can be used on a long-term basis.

Flexible configuration

You can configure every aspect of the system to match your specific needs. It supports locally deployed AI models as well as integrations with external providers like OpenAI, allowing you to choose, combine, or switch between models based on your tasks and infrastructure.
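
To illustrate, the provider switch can be expressed as a small abstraction. The sketch below is hypothetical (the interface, class, and endpoint names are not from our actual codebase) and assumes the local model exposes an HTTP API on the intranet.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Hypothetical engine abstraction: one interface, interchangeable backends.
public interface ITranslationEngine
{
    Task<string> TranslateAsync(string text, string sourceLang, string targetLang);
}

// Backend for the locally hosted model; assumes an intranet HTTP endpoint.
public class LocalLlmEngine : ITranslationEngine
{
    private readonly HttpClient _http;
    public LocalLlmEngine(HttpClient http) => _http = http;

    public async Task<string> TranslateAsync(string text, string sourceLang, string targetLang)
    {
        var response = await _http.PostAsJsonAsync("/v1/translate",
            new { text, source = sourceLang, target = targetLang });
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

// At startup, a configuration value decides which backend gets registered, e.g.:
// if (config["Translation:Provider"] == "local")
//     services.AddHttpClient<ITranslationEngine, LocalLlmEngine>(c =>
//         c.BaseAddress = new Uri(config["Translation:LocalEndpoint"]));
```

An OpenAI-backed implementation of the same interface can be registered under a different configuration value, which is what allows models to be combined or swapped per task.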

Intranet-based solution

The model runs entirely within the corporate intranet, providing fast, reliable access for internal teams without requiring an internet connection. This setup aligns with strict network policies and supports secure, uninterrupted operation across internal systems.

Full integration

A locally deployed AI model can be connected directly to the company’s internal applications via the Model Context Protocol (MCP), enabling integration across different workflows.

Custom tuning

The system is fully configurable to adapt to evolving content and language needs.

Fine-tuning

With local deployment, we gain full control over the training process, enabling precise fine-tuning of the language model on our domain-specific data. This allows the AI to better understand internal terminology, writing style, and context-specific phrasing.

Selecting the Right Model

We evaluated several open-source LLMs across different versions and quantization levels, including Qwen, DeepSeek, Mistral NeMo, Phi-4, and Gemma. Each was tested with the same prompt on a representative sample of articles. Our evaluation team assessed output quality across several dimensions: linguistic accuracy, tone consistency, handling of markup and abbreviations, and overall usability.

Gemma 3 emerged as the clear winner. It provided the most consistent and coherent translations and required the least post-processing. Based on these findings, we moved forward with Gemma as the foundation of our localization pipeline.

Technical Implementation

Architecture

We designed a modular and resilient architecture to integrate AI-powered translation seamlessly into our existing CMS infrastructure. The system identifies untranslated documents by scanning Umbraco metadata for content types such as News and Articles. Each qualifying document is assigned a discrete translation job, ensuring traceability and isolation of processing.
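
In outline, the scan-and-enqueue step looks like the sketch below. The content-source interface is a hypothetical stand-in for our Umbraco metadata queries; the enqueue call is Hangfire’s standard API, and TranslationJob appears in a later sketch.

```csharp
using System.Collections.Generic;
using Hangfire;

// Hypothetical abstraction over Umbraco metadata queries.
public interface IUntranslatedContentSource
{
    IEnumerable<ContentRef> FindUntranslated(params string[] contentTypes);
}

public record ContentRef(int Id, string ContentType, string SourceLanguage);

public class TranslationScanner
{
    private readonly IUntranslatedContentSource _source;
    public TranslationScanner(IUntranslatedContentSource source) => _source = source;

    public void ScanAndEnqueue()
    {
        // One discrete Hangfire job per document keeps processing traceable and isolated.
        foreach (var doc in _source.FindUntranslated("news", "article"))
        {
            BackgroundJob.Enqueue<TranslationJob>(job => job.Run(doc.Id));
        }
    }
}
```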

How the LLM functions

A conditional publishing flow was established:

  • News items (provided they are already published in the original language) are published automatically upon successful translation.
  • Articles are held in an unpublished state and submitted for manual review by designated editors. The reason is that articles are typically longer and often include specialized terminology (industry-specific or unique to our company), requiring a higher level of translation accuracy. This is especially important for corporate policies, ISO documentation, and other sensitive materials. 

This approach balances automation with quality assurance, allowing rapid content delivery while maintaining editorial oversight where necessary.
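
Condensed to code, the branch looks roughly like this (the types and services are hypothetical, not our actual implementation):

```csharp
public record TranslatedDocument(int Id, string ContentType, bool OriginalIsPublished);

public interface IContentPublisher { void Publish(int id); void SaveDraft(int id); }
public interface IReviewQueue { void Submit(int id, string reason); }

public class PublishingFlow
{
    private readonly IContentPublisher _contentService;  // hypothetical service
    private readonly IReviewQueue _reviewQueue;          // hypothetical service

    public PublishingFlow(IContentPublisher contentService, IReviewQueue reviewQueue)
        => (_contentService, _reviewQueue) = (contentService, reviewQueue);

    public void OnTranslationCompleted(TranslatedDocument doc)
    {
        if (doc.ContentType == "news" && doc.OriginalIsPublished)
        {
            // News goes live immediately after a successful translation.
            _contentService.Publish(doc.Id);
        }
        else
        {
            // Articles remain unpublished drafts and are routed to editors.
            _contentService.SaveDraft(doc.Id);
            _reviewQueue.Submit(doc.Id, "AI translation pending editorial review");
        }
    }
}
```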

Key tools

We selected Hangfire, a robust job scheduling library for .NET, to manage translation workflows. Hangfire provides:

  • Reliable background job execution
  • Retry logic for failed tasks
  • A built-in UI dashboard for monitoring and managing job status

To ensure secure and convenient access, we embedded the Hangfire dashboard directly into the Umbraco CMS interface and configured it with internal authentication controls.
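
Embedding the dashboard and gating it behind authentication is a few lines of standard Hangfire setup; the filter below is a simplified placeholder for our internal controls, and the dashboard path is an example.

```csharp
using Hangfire;
using Hangfire.Dashboard;

// Simplified placeholder; production delegates to internal authentication.
public class InternalOnlyAuthorizationFilter : IDashboardAuthorizationFilter
{
    public bool Authorize(DashboardContext context) =>
        context.GetHttpContext().User.Identity?.IsAuthenticated == true;
}

// In Program.cs (ASP.NET Core):
// builder.Services.AddHangfire(cfg => cfg.UseSqlServerStorage(connectionString));
// builder.Services.AddHangfireServer();
// app.UseHangfireDashboard("/umbraco/backoffice/hangfire", new DashboardOptions
// {
//     Authorization = new[] { new InternalOnlyAuthorizationFilter() }
// });
```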

To tailor Hangfire to our specific needs, we introduced several key customizations:

  • Extended logging capabilities: We integrated a third-party logging library with Hangfire to enable detailed monitoring and easier debugging of background tasks.
  • Task management extension: We developed additional functionality that allows us to manually add or restart specific tasks (such as translation jobs) directly within Hangfire. These controls were seamlessly embedded into the Hangfire dashboard, giving us better control and visibility over our job queue.

Translation jobs can be scheduled to run during off-peak hours to minimize resource contention and avoid disruptions to other internal processes. This allows us to maintain system performance while processing large volumes of content efficiently.
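
Scheduling the scan for off-peak hours uses Hangfire’s recurring-job API; the nightly 02:00 slot below is an example, not our actual schedule.

```csharp
using Hangfire;

// Run the content scan nightly at 02:00 server time (example slot).
RecurringJob.AddOrUpdate<TranslationScanner>(
    "nightly-translation-scan",
    scanner => scanner.ScanAndEnqueue(),
    Cron.Daily(2));
```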

Overcoming Challenges

Throughout development and testing, several practical challenges emerged. We addressed each of them with targeted engineering decisions.

Handling long documents

Large texts occasionally exceeded the model’s optimal input length. To ensure stability, we implemented a segmentation mechanism that breaks content into chunks that do not exceed the selected model’s context window.
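
A minimal chunker in this spirit splits on paragraph boundaries and keeps each piece under a character budget, a rough proxy for the token limit (the budget value below is illustrative):

```csharp
using System.Collections.Generic;
using System.Text;

public static class Chunker
{
    // maxChars approximates the model's context window; tune per model.
    // Note: a single paragraph longer than the budget passes through unsplit here.
    public static IEnumerable<string> Split(string text, int maxChars = 8000)
    {
        var current = new StringBuilder();
        foreach (var paragraph in text.Split("\n\n"))
        {
            if (current.Length + paragraph.Length > maxChars && current.Length > 0)
            {
                yield return current.ToString();
                current.Clear();
            }
            current.AppendLine(paragraph);
        }
        if (current.Length > 0)
            yield return current.ToString();
    }
}
```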

Managing complex formatting

Articles containing intricate markup or embedded HTML tags sometimes led to hallucinated or malformed output. Although such cases were rare, they prompted us to implement a post-translation validation step. This step checks formatting integrity and ensures consistency. If validation fails, the system automatically generates a log entry, flagging the content for manual review.
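
One inexpensive integrity check compares the HTML tag inventory of the source and the translation; a sketch:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MarkupValidator
{
    // True when the translation preserves the source's tag names and counts.
    public static bool TagsMatch(string source, string translation)
    {
        var src = TagCounts(source);
        var dst = TagCounts(translation);
        return src.Count == dst.Count &&
               src.All(kv => dst.TryGetValue(kv.Key, out var n) && n == kv.Value);
    }

    private static Dictionary<string, int> TagCounts(string html) =>
        Regex.Matches(html, @"</?([a-zA-Z][a-zA-Z0-9]*)")
             .Cast<Match>()
             .GroupBy(m => m.Groups[1].Value.ToLowerInvariant())
             .ToDictionary(g => g.Key, g => g.Count());
}
```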

Abbreviation detection

Certain content consists of short strings of characters, such as acronyms or product codes, that do not require translation. A pre-processing filter was added to bypass these cases.
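
The filter itself can be a short predicate over trimmed strings; the length threshold below is an illustrative choice:

```csharp
using System.Text.RegularExpressions;

public static class TranslationFilter
{
    // Skip short, all-caps or code-like strings (acronyms, product codes).
    public static bool ShouldSkip(string text)
    {
        var trimmed = text.Trim();
        return trimmed.Length <= 12 &&
               Regex.IsMatch(trimmed, @"^[A-Z0-9./_-]+$");
    }
}
```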

Prompt tuning

While prompt quality is critical to translation fidelity, we found that even well-optimized prompts could yield unpredictable results. We continue to refine prompts based on observed edge cases.
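
For illustration, a prompt in the spirit of the ones we iterate on might look like this (our exact production prompt differs):

```csharp
public static class Prompts
{
    // Illustrative template; {0} = source language, {1} = target language, {2} = text.
    public const string Translate =
        "You are a professional translator. Translate the text inside " +
        "<source> tags from {0} into {1}.\n" +
        "Rules:\n" +
        "- Preserve all HTML tags and attributes exactly as they appear.\n" +
        "- Do not translate acronyms, product codes, or proper names.\n" +
        "- Return only the translated text, with no commentary.\n" +
        "<source>{2}</source>";
}
```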

Retry logic

If a translation attempt fails or returns incomplete content, the job is automatically retried up to three times. Failures are logged for diagnostics.
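
With Hangfire, capping retries is a one-attribute change on the job method:

```csharp
using Hangfire;

public class TranslationJob
{
    // Hangfire retries failed jobs automatically; cap at three attempts,
    // then mark the job as failed so it surfaces in the dashboard and logs.
    [AutomaticRetry(Attempts = 3, OnAttemptsExceeded = AttemptsExceededAction.Fail)]
    public void Run(int documentId)
    {
        // ... fetch the document, call the translation engine, persist the result ...
    }
}
```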

Post-processing checks

Completed translations are scanned for indicators of failure, such as mixed-language output or untranslated segments. These are flagged for manual review to ensure quality control.
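
A rough heuristic for English targets flags output that still contains letters specific to the source-language alphabets; the character set below is illustrative, not exhaustive:

```csharp
using System.Text.RegularExpressions;

public static class OutputChecks
{
    // Letters specific to the German, Polish, and Lithuanian alphabets;
    // their presence in English output suggests untranslated segments.
    private static readonly Regex NonEnglishLetters =
        new Regex("[äöüßąćęłńśźżžčėįšųū]", RegexOptions.IgnoreCase);

    public static bool LooksUntranslated(string englishOutput) =>
        NonEnglishLetters.IsMatch(englishOutput);
}
```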

Performance Metrics

During the initial rollout, the system demonstrated solid performance and processing consistency:

  • Initial batch processed: more than 1,300 documents successfully translated
  • Average translation time per document (under 1K words): approximately 30 seconds
  • Large documents (over 1K words): typically completed in around 2 minutes
  • Rare outliers: peak durations between 5 and 6 minutes
  • Operational efficiency: Translation time for the full corpus was reduced from an estimated 2.5 weeks of manual work to a few hours.

This level of performance met our expectations and confirmed the feasibility of ongoing automated localization for internal content workflows.

What’s Next?

To further streamline content localization, we are planning to develop a dedicated plugin for the Umbraco CMS. This plugin will introduce a “Translate with AI” button directly into the editorial interface, allowing users to initiate translation tasks with a single click.

The solution will support both locally hosted and external LLMs, giving editors the flexibility to select the most suitable engine for their needs. Once completed, we plan to release the plugin to the broader community via the official Umbraco marketplace.

Summing Up

This project demonstrates how AI can be deployed responsibly and effectively to solve practical business problems. By combining careful model selection, strong system architecture, and thoughtful integration into existing workflows, we delivered a secure and scalable solution that improves our internal operations while preparing us for future localization demands.

Need to tackle a similar challenge?

For organizations facing similar challenges with corporate content translation and localization, locally deployed AI models offer a powerful alternative to traditional translation methods, balancing autonomy, control, and performance in one integrated solution.

Andrey Kopanev, Senior .NET Developer, AI Enthusiast
