Dziękujemy za wysłanie zapytania! Jeden z członków naszego zespołu skontaktuje się z Państwem wkrótce.
Dziękujemy za wysłanie rezerwacji! Jeden z członków naszego zespołu skontaktuje się z Państwem wkrótce.
Plan Szkolenia
AI Sovereignty and LLM Local Deployment
- Risks of cloud LLMs: data retention, training on inputs, foreign jurisdiction.
- Ollama architecture: model server, registry, and OpenAI-compatible API.
- Comparison with vLLM, llama.cpp, and Text Generation Inference.
- Model licensing: Llama, Mistral, Qwen, and Gemma terms.
Installation and Hardware Setup
- Installing Ollama on Linux with CUDA and ROCm support.
- CPU-only fallback and AVX/AVX2 optimization.
- Docker deployment and persistent volume mapping.
- Multi-GPU setup and VRAM allocation strategies.
Model Management
- Pulling models from the Ollama registry: ollama pull llama3.
- Importing GGUF models from HuggingFace and TheBloke.
- Quantization levels: Q4_K_M, Q5_K_M, Q8_0 tradeoffs.
- Model switching and concurrent model loading limits.
Custom Modelfiles
- Writing Modelfile syntax: FROM, PARAMETER, SYSTEM, TEMPLATE.
- Temperature, top_p, and repeat_penalty tuning.
- System prompt engineering for role-specific behavior.
- Creating and publishing custom models to local registry.
API Integration
- OpenAI-compatible /v1/chat/completions endpoint.
- Streaming responses and JSON mode.
- Integrating with LangChain, LlamaIndex, and custom apps.
- Authentication and rate limiting with reverse proxy.
Performance Optimization
- Context window sizing and KV cache management.
- Batch inference and parallel request handling.
- CPU thread allocation and NUMA awareness.
- Monitoring GPU utilization and memory pressure.
Security and Compliance
- Network isolation for model serving endpoints.
- Input filtering and output moderation pipelines.
- Audit logging of prompts and completions.
- Model provenance and hash verification.
Wymagania
- Intermediate Linux and container administration.
- Understanding of machine learning and transformer models at high level.
- Familiarity with REST APIs and JSON.
Audience
- AI engineers and developers replacing cloud LLM APIs.
- Organizations with data sensitivity preventing cloud model usage.
- Government and defense teams requiring air-gapped language models.
14 godzin