
Unified AI System

Unified AI is the central system for managing all AI services in Vista: an abstraction over multiple providers with automatic failover and load balancing.

graph TB
    subgraph "Frontend"
        UI[React UI]
        UAC[UnifiedAIClient]
    end
    subgraph "Backend (Tauri/Rust)"
        CMD[AI Commands]
        SR[Service Resolver]
        HM[Health Monitor]
    end
    subgraph "AI Providers"
        LIB[LibraxisAI Cloud<br/>Primary]
        MLX[Local MLX<br/>Secondary]
        OAI[OpenAI API<br/>Tertiary]
    end

    UI --> UAC
    UAC --> CMD
    CMD --> SR
    SR --> HM
    HM --> LIB
    HM --> MLX
    HM --> OAI

    SR -.->|Priority 1| LIB
    SR -.->|Priority 2| MLX
    SR -.->|Priority 3| OAI

| Priority | Provider | Latency | Cost | Capabilities |
|----------|----------|---------|------|--------------|
| 1 | LibraxisAI Cloud | ~200ms | Included | LLM, STT, TTS |
| 2 | Local MLX | ~100ms | Free | LLM, STT (Apple Silicon only) |
| 3 | OpenAI API | ~500ms | $$$ | LLM, STT, TTS |
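
The priority order in the table maps directly to the order of the providers vector inside the resolver shown below. A minimal sketch of that wiring (the constructor names here are illustrative assumptions, not the actual Vista API):

// Hypothetical wiring; provider constructor names are assumptions.
fn build_resolver() -> ServiceResolver {
    ServiceResolver::new(vec![
        Box::new(LibraxisProvider::new()), // Priority 1: cloud, full capabilities
        Box::new(MlxProvider::new()),      // Priority 2: local, Apple Silicon only
        Box::new(OpenAiProvider::new()),   // Priority 3: paid last resort
    ])
}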
// Service Resolver in Rust
pub struct ServiceResolver {
    providers: Vec<Box<dyn AIProvider>>, // kept in priority order
    health_cache: RwLock<HashMap<String, ProviderHealth>>,
    retry_config: RetryConfig,
}

impl ServiceResolver {
    pub async fn execute(
        &self,
        request: AIRequest,
        operation: &str, // operation label, e.g. for metrics
    ) -> Result<AIResponse, AIError> {
        for provider in &self.providers {
            // Skip providers that recently failed health checks
            if !self.is_healthy(provider.name()).await {
                continue;
            }
            match provider.execute(&request).await {
                Ok(response) => {
                    self.record_success(provider.name());
                    return Ok(response);
                }
                Err(e) => {
                    self.record_failure(provider.name(), &e);
                    // Fall through to the next provider in priority order
                }
            }
        }
        Err(AIError::AllProvidersFailed)
    }
}
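
The resolver only depends on a shared trait implemented by every provider. Its exact definition is not shown in this document, so the following is a minimal sketch of the assumed shape (using the async-trait crate to keep the trait object-safe):

// Assumed shape of the provider abstraction consumed by ServiceResolver;
// the actual trait in Vista may differ.
#[async_trait::async_trait]
pub trait AIProvider: Send + Sync {
    fn name(&self) -> &str;
    async fn execute(&self, request: &AIRequest) -> Result<AIResponse, AIError>;
}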

Use cases in Vista:

  • Generating SOAP notes from transcriptions
  • Diagnosis and treatment suggestions
  • AI assistant chat
  • Analysis of medical documentation
#[derive(Debug, Serialize, Deserialize)]
pub struct ChatRequest {
    pub messages: Vec<ChatMessage>,
    pub model: Option<String>,
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub stream: bool,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ChatMessage {
    pub role: String, // "system", "user", "assistant"
    pub content: String,
}
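
A SOAP-generation request could then be assembled like this (all field values are illustrative examples):

// Illustrative request; temperature and token limit are examples only.
let request = ChatRequest {
    messages: vec![
        ChatMessage { role: "system".into(), content: system_prompt },
        ChatMessage { role: "user".into(), content: transcript },
    ],
    model: None,            // let the active provider use its default model
    temperature: Some(0.2), // low temperature for factual clinical notes
    max_tokens: Some(1024),
    stream: false,
};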

Use cases:

  • Transcription of visit recordings
  • Dictating notes
  • Voice commands (future)
#[derive(Debug, Serialize, Deserialize)]
pub struct TranscriptionRequest {
    pub audio_path: String,
    pub language: String,         // "pl", "en"
    pub enable_diarization: bool, // speaker identification
    pub word_timestamps: bool,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct TranscriptionResponse {
    pub text: String,
    pub segments: Vec<Segment>,
    pub speakers: Option<Vec<Speaker>>,
    pub language_detected: String,
    pub duration_seconds: f64,
}
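
For a typical visit recording, the request could look like this (values are illustrative):

// Illustrative transcription request for a Polish-language visit recording.
let request = TranscriptionRequest {
    audio_path: "/path/to/visit.wav".into(),
    language: "pl".into(),
    enable_diarization: true, // distinguish vet and owner speech
    word_timestamps: true,    // align the transcript with the audio
};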

Use cases:

  • Reading notes aloud
  • Accessibility features
  • Synthesizing AI responses
#[derive(Debug, Serialize, Deserialize)]
pub struct SynthesisRequest {
    pub text: String,
    pub voice: Option<String>,  // voice ID
    pub speed: Option<f32>,     // 0.5 - 2.0
    pub format: Option<String>, // "mp3", "wav"
}
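
And a matching synthesis request (again, the values are examples):

// Illustrative TTS request; available voice IDs are provider-specific.
let request = SynthesisRequest {
    text: note_text,
    voice: None,                // provider default voice
    speed: Some(1.0),           // normal rate within the 0.5 - 2.0 range
    format: Some("mp3".into()),
};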

LibraxisAI is a dedicated cloud service for Vista.

Base URL: https://api.libraxis.ai/v1

POST /chat/completions       # LLM
POST /audio/transcriptions   # STT
POST /audio/speech           # TTS
GET  /health                 # Health check
// API key stored in system keychain
let api_key = keychain::get("libraxis_api_key")?;

let client = reqwest::Client::new();
let response = client
    .post("https://api.libraxis.ai/v1/chat/completions")
    .header("Authorization", format!("Bearer {}", api_key))
    .header("Content-Type", "application/json")
    .json(&request)
    .send()
    .await?;
| Tier | Requests/min | Tokens/min |
|------|--------------|------------|
| Free | 20 | 10,000 |
| Pro | 100 | 100,000 |
| Enterprise | Unlimited | Unlimited |
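
When a limit is exceeded, the API answers with HTTP 429; mapping that to AIError::RateLimitExceeded lets the resolver fall through to the next provider. A sketch of such a mapping (this helper is an assumption, not the actual Vista code):

// Sketch: map HTTP status codes to AIError so the resolver can fail over.
use reqwest::StatusCode;

fn map_status(provider: &str, status: StatusCode) -> Option<AIError> {
    match status {
        StatusCode::TOO_MANY_REQUESTS => {
            Some(AIError::RateLimitExceeded(provider.to_string()))
        }
        s if s.is_server_error() => Some(AIError::ProviderError(
            provider.to_string(),
            format!("HTTP {}", s),
        )),
        _ => None, // success and client errors are handled elsewhere
    }
}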

MLX is Apple's ML framework for Apple Silicon (M1/M2/M3).

# Installation
pip install mlx-lm mlx-whisper

# Start the LLM server
mlx_lm.server --model mlx-community/Llama-3.2-3B-Instruct-4bit --port 8080

# Start the STT server
mlx_whisper.server --model mlx-community/whisper-large-v3-mlx --port 8081
#[tauri::command]
pub async fn get_mlx_health_detailed() -> Result<MLXHealthReport, String> {
    let llm_check = check_service("http://localhost:8080/health").await;
    let stt_check = check_service("http://localhost:8081/health").await;

    // Evaluate before the status strings are moved into the report;
    // "overall" here means at least one local service is usable
    let overall_health =
        llm_check.status == "healthy" || stt_check.status == "healthy";

    Ok(MLXHealthReport {
        llm_status: llm_check.status,
        llm_response_time: llm_check.latency_ms,
        llm_model: llm_check.model,
        stt_status: stt_check.status,
        stt_response_time: stt_check.latency_ms,
        overall_health,
    })
}
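
The check_service helper used above is not shown in this document; a minimal sketch of what it could look like (the ServiceCheck fields mirror how the results are consumed above):

// Sketch of the assumed check_service helper; not the actual implementation.
struct ServiceCheck {
    status: String,
    latency_ms: u64,
    model: Option<String>,
}

async fn check_service(url: &str) -> ServiceCheck {
    let start = std::time::Instant::now();
    let status = match reqwest::get(url).await {
        Ok(resp) if resp.status().is_success() => "healthy",
        _ => "unreachable",
    };
    ServiceCheck {
        status: status.to_string(),
        latency_ms: start.elapsed().as_millis() as u64,
        model: None, // could be parsed from the /health response body
    }
}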

The OpenAI API serves as the last resort when the other providers are unavailable.

| Use Case | Model | Cost |
|----------|-------|------|
| Chat | gpt-4o-mini | $0.15/1M input |
| STT | whisper-1 | $0.006/min |
| TTS | tts-1 | $0.015/1K chars |
// User settings
interface OpenAIConfig {
  apiKey: string;          // Stored in keychain
  enableFallback: boolean; // Default: true
  preferredModel: string;  // Default: "gpt-4o-mini"
}
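
On the backend, the enableFallback flag can gate whether the OpenAI provider is appended to the chain at all. A sketch under that assumption (load_openai_config and the provider constructors are hypothetical helpers):

// Sketch: honor the user's fallback setting when building the provider chain.
// load_openai_config() and the constructors below are hypothetical.
fn providers_for_user() -> Vec<Box<dyn AIProvider>> {
    let cfg = load_openai_config(); // mirrors the OpenAIConfig settings above
    let mut providers: Vec<Box<dyn AIProvider>> = vec![
        Box::new(LibraxisProvider::new()),
        Box::new(MlxProvider::new()),
    ];
    if cfg.enable_fallback {
        providers.push(Box::new(OpenAiProvider::with_model(&cfg.preferred_model)));
    }
    providers
}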

sequenceDiagram
    participant UI as Frontend
    participant BE as Backend
    participant AI as AI Service

    UI->>BE: generate_soap(visit_id, transcript)
    BE->>BE: Load patient context
    BE->>BE: Load user preferences
    BE->>BE: Build system prompt
    Note over BE: System Prompt includes:<br/>- Patient history<br/>- Visit type<br/>- User preferences<br/>- SOAP format rules
    BE->>AI: Chat completion request
    AI->>BE: SOAP response
    BE->>BE: Parse & validate SOAP
    BE->>BE: Extract AI suggestions
    BE->>UI: SOAPNote + Suggestions
System prompt template:

You are a veterinary assistant. Your task is to generate a SOAP note
based on the visit transcription.

## Patient context
- Name: {{patient.name}}
- Species: {{patient.species}}
- Breed: {{patient.breed}}
- Age: {{patient.age}}
- Medical history: {{patient.medical_conditions}}

## User preferences
- Documentation style: {{user.docs_style}}
- Level of detail: {{user.ai_precision_level}}
- Format: {{user.format}}

## Transcription
{{transcript}}

## Task
Generate the SOAP note in the following format:
- Subjective (S): Owner's observations, history
- Objective (O): Physical examination findings
- Assessment (A): Diagnosis, clinical assessment
- Plan (P): Treatment plan, recommendations

Additionally, suggest:
- Possible follow-up tasks
- Reminders for the owner
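
The {{...}} placeholders are filled in by the backend before the request is sent. Vista's actual template engine is not specified here; a minimal sketch using plain string replacement:

// Minimal placeholder substitution; the real template engine may differ.
fn render_prompt(template: &str, patient_name: &str, transcript: &str) -> String {
    template
        .replace("{{patient.name}}", patient_name)
        .replace("{{transcript}}", transcript)
    // ...remaining placeholders are substituted the same way
}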

#[derive(Debug, thiserror::Error)]
pub enum AIError {
    #[error("All providers failed")]
    AllProvidersFailed,

    #[error("Provider {0} returned error: {1}")]
    ProviderError(String, String),

    #[error("Rate limit exceeded for {0}")]
    RateLimitExceeded(String),

    #[error("Invalid request: {0}")]
    InvalidRequest(String),

    #[error("Timeout after {0}ms")]
    Timeout(u64),

    #[error("Network error: {0}")]
    NetworkError(#[from] reqwest::Error),
}

pub struct RetryConfig {
    pub max_attempts: u32,       // Default: 3
    pub initial_delay_ms: u64,   // Default: 1000
    pub max_delay_ms: u64,       // Default: 10000
    pub backoff_multiplier: f32, // Default: 2.0
}

impl RetryConfig {
    pub fn should_retry(&self, error: &AIError, attempt: u32) -> bool {
        if attempt >= self.max_attempts {
            return false;
        }
        // Only transient failures are worth retrying
        matches!(
            error,
            AIError::Timeout(_) | AIError::NetworkError(_) | AIError::ProviderError(_, _)
        )
    }
}
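
Combined with exponential backoff, should_retry drives a loop like the following sketch (assuming AIRequest implements Clone; the loop itself is illustrative, not the actual Vista code):

// Illustrative retry loop with exponential backoff on top of RetryConfig.
pub async fn execute_with_retry(
    resolver: &ServiceResolver,
    request: AIRequest, // assumed to implement Clone
    operation: &str,
) -> Result<AIResponse, AIError> {
    let cfg = &resolver.retry_config;
    let mut delay_ms = cfg.initial_delay_ms;
    let mut attempt = 0;
    loop {
        attempt += 1;
        match resolver.execute(request.clone(), operation).await {
            Ok(response) => return Ok(response),
            Err(e) if cfg.should_retry(&e, attempt) => {
                // Wait, then grow the delay up to the configured ceiling
                tokio::time::sleep(std::time::Duration::from_millis(delay_ms)).await;
                delay_ms = (((delay_ms as f32) * cfg.backoff_multiplier) as u64)
                    .min(cfg.max_delay_ms);
            }
            Err(e) => return Err(e),
        }
    }
}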

| Metric | Description |
|--------|-------------|
| ai_request_total | Total requests per provider |
| ai_request_duration_ms | Request latency |
| ai_request_errors | Errors per provider |
| ai_tokens_used | Token consumption |
| ai_provider_health | Provider availability |
#[derive(Debug, Serialize)]
pub struct AIHealthDashboard {
    pub providers: Vec<ProviderStatus>,
    pub total_requests_today: u64,
    pub average_latency_ms: f64,
    pub error_rate: f64,
    pub primary_provider: String,
}
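
The dashboard can be exposed to the frontend as a Tauri command, in the same style as the MLX health check above (the command name and the aggregation helpers on ServiceResolver are assumptions):

// Sketch of a dashboard command; the aggregation helpers are hypothetical.
#[tauri::command]
pub async fn get_ai_health_dashboard(
    resolver: tauri::State<'_, ServiceResolver>,
) -> Result<AIHealthDashboard, String> {
    Ok(AIHealthDashboard {
        providers: resolver.provider_statuses().await,
        total_requests_today: resolver.requests_today(),
        average_latency_ms: resolver.average_latency_ms(),
        error_rate: resolver.error_rate(),
        primary_provider: "LibraxisAI Cloud".to_string(),
    })
}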

┌──────────────────────────────────────────────────────────┐
│ LOCAL PROCESSING                                         │
├──────────────────────────────────────────────────────────┤
│ ✅ Audio recording                                       │
│ ✅ Audio storage                                         │
│ ✅ Database storage                                      │
│ ✅ Local MLX inference (when available)                  │
└──────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────┐
│ CLOUD PROCESSING                                         │
├──────────────────────────────────────────────────────────┤
│ ⚠️ STT transcription → transcript text sent to cloud     │
│ ⚠️ LLM generation → transcript + context sent to cloud   │
│ ✅ No data retention by AI providers (per contract)      │
│ ✅ TLS 1.3 encryption in transit                         │
└──────────────────────────────────────────────────────────┘