About Us

Engineering
the building blocks of LLM inference

We build and operate LLM inference infrastructure that stays stable under real-world load, for teams building AI products.

Built for builders,
by builders.

Entrim is run by engineers who care about predictable behavior under load, cost transparency, and data privacy. We're the infrastructure team so you don't have to be.

What we do

Operate production-grade LLM inference infrastructure

Provide OpenAI-compatible API for open-source models

Run models on our EU datacenter infrastructure

Maintain availability during traffic spikes and long generations
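Because the API is OpenAI-compatible, most OpenAI-style clients need only a new base URL. A minimal sketch of building a chat-completions request; the base URL, model name, and key name below are illustrative placeholders, not documented Entrim values:

```python
import json

# All endpoint details below are placeholders for illustration --
# use the base URL and model names from your own account.
BASE_URL = "https://api.entrim.example/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible /chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

payload = build_chat_request("llama-3.1-8b-instruct", "Classify this ticket: ...")
body = json.dumps(payload).encode("utf-8")

# Sending it is a single HTTP POST (requires an API key):
# POST {BASE_URL}/chat/completions
# Authorization: Bearer <YOUR_API_KEY>
# Content-Type: application/json
```

Since the request shape matches the OpenAI spec, existing SDKs and tooling typically work unchanged once pointed at the new base URL.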

What we optimized for

Stable p95 latency under production load

Pricing that reflects our infrastructure efficiency

High throughput with failover to keep your product running

Zero data retention on prompts and outputs

Helping teams adapt AI to their world

Some teams process millions of short requests. Others run long-context analysis. Some need sub-second responses. All need LLM inference with the best cost-performance. Our infrastructure handles these patterns reliably.

You’re adding summarization, classification, extraction, or generation into an existing product. You need inference that stays fast as usage grows and does not wreck unit economics.

Infrastructure you can trust under real traffic

Inference quality is not only the model. It is routing, isolation, scheduling, capacity planning, and data handling.

Security


Infrastructure

EU-operated, direct control

Inference runs in our Slovenia (EU) data center, operated by our team with no third-party cloud abstractions.

High-throughput GPU clusters

B200, H200, and H100 capacity tuned for throughput and consistent performance under real traffic.

Optimized inference runtime

Scheduling, batching, and caching are engineered for higher GPU utilization, which lowers cost per request in production.

Elastic capacity under fair use

Most teams get high throughput on shared capacity, with a clear path to dedicated capacity when you outgrow it.

Principles
we do not break.

We build with intention. These principles guide every decision we make - from infrastructure and performance to transparency and trust.

Leadership and accountability

Our leadership team brings deep expertise and a strong sense of accountability - guiding how Entrim grows and delivers.

Matjaž Mrgole

CEO

Andraž Pavlič

COO

Matjaž Kavčič

CTO

Benchmark your real workloads.

Run your token counts, latency targets, and traffic assumptions through Entrim. You will know if it fits before you migrate.
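That benchmarking exercise starts with arithmetic you can do before any migration. A back-of-envelope sizing sketch; every number here (traffic, token counts, price) is an assumed placeholder, not an Entrim rate:

```python
# Back-of-envelope workload sizing. All figures are illustrative
# assumptions -- substitute your own traffic and your quoted price.
requests_per_second = 50          # peak traffic assumption
avg_prompt_tokens = 800           # input tokens per request
avg_output_tokens = 200           # generated tokens per request

# Sustained token throughput the provider must absorb at peak
tokens_per_second = requests_per_second * (avg_prompt_tokens + avg_output_tokens)

# Rough monthly cost at a placeholder blended token price
price_per_million_tokens = 0.40   # USD, assumed for illustration
monthly_tokens = tokens_per_second * 3600 * 24 * 30
monthly_cost_usd = monthly_tokens / 1_000_000 * price_per_million_tokens
```

Compare the resulting throughput and monthly figure against your latency targets and current spend; if p95 holds at that load, the migration decision becomes simple arithmetic.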


© 2026. Entrim. All Rights Reserved.

Privacy policy · Terms of service