Focusing on AI Systems - Raúl Ferrer

Deterministic Token Budgeting. Engineering Reliable Enterprise AI Through Dynamic JSON Schema Analysis

2026-03-08

Tokens

max_tokens is not a tuning parameter but a reliability control surface. This approach models LLM output length as a probabilistic upper bound derived from schema, tokenizer, and model priors. By combining dynamic estimation, constrained decoding, and production observability, it reduces latency variance, prevents verbosity failures, and enables auditable, reliable enterprise AI systems.

1956 words

10 minutes

Hybrid Search in Enterprise RAG. Why Alpha Tuning Is an Architectural Decision

2026-03-05

Engineering

Hybrid Search

Semantic Search

BM25

Weaviate

RAG

EU AI Act

LangChain4j

EducationAI

Curriculum

Correctness

Pure vector search fails silently on exact identifiers. BM25 misses semantic paraphrases. In enterprise RAG, combining both signals isn't a performance trick — it's the minimum architecture for reliable retrieval under EU AI Act compliance. Here's what actually breaks in production, and how to fix it.

1035 words

5 minutes

The Embedding Gap. Why Your Vector Database Fails Before You Query It

2026-02-28

Architecture

RAG

Embeddings

Vector Search

Reliable Enterprise AI

LangChain4j

Weaviate

Ollama

EducationAI

Correctness

Many enterprise RAG systems fail at retrieval. The root cause isn't your LLM or your vector DB; it's an embedding decision made in week one that no one revisited.

1295 words

6 minutes

How to Conduct a Fundamental Rights Impact Assessment (FRIA) for an EdTech AI System

2026-02-25

Compliance

EU AI Act

Compliance

High-Risk AI

K12

GDPR

Risk Management

Human-in-the-Loop

Ethics

EU AI Act Article 27 requires deployers of high-risk AI in public institutions to conduct a FRIA. I could not find a worked example for educational AI. So I built one.

3166 words

16 minutes

The Token Tax. Why Your AI Strategy is Leaking Cash and How to Fix It

2026-02-21

Strategy & Leadership

Cost Management

API Limits

Tokens

2025 was the year we realized that AI is incredibly expensive. We’ve been living in an era of mindless model consumption, but the shift toward sustainability has started.

664 words

3 minutes

Reliable Enterprise AI. Why Architecture Matters More Than Prompts

2026-02-18

Architecture

Prompt Engineering

Robustness

Evaluation

EdTech

K12

HighRiskAI

Correctness

In mission-critical environments, an AI that sounds confident but guesses is unacceptable. The difference between a fragile demo and a reliable system lies in engineering, not just clever prompting.

1120 words

6 minutes

AI in Education. Why K12 Platforms Require a Different Reliability Standard

2025-12-06

Compliance

K12

Reliability

Safety

AI in K12 needs different reliability standards than enterprise AI. This EdTech perspective examines the unique risk profiles, the pedagogical requirements, the data protections for minors under GDPR and the EU AI Act, and the human oversight that educational AI platforms require.

1161 words

6 minutes

Human-in-the-Loop Is Not a Feature, It's a Design Principle

2025-11-05

Architecture

Human-in-the-Loop

Guardrails

Enterprise teams keep treating human oversight as a checkbox. It isn't. It's an architectural commitment that changes how you design retrieval, evaluation, and failure handling in Reliable Enterprise AI systems — from the ground up.

1240 words

6 minutes

Tech Lead | Software Architecture & Production Systems