Mission Control v1 – running stable
@@ -0,0 +1,3 @@
.venv/
__pycache__/
*.pyc
@@ -0,0 +1,85 @@
# Mission Control

A lean control center for your local `llama-swap` stack on the Bosgame M5.
A FastAPI backend plus an HTML dashboard. No build step, no database.

## What it can do

- **View models & ports**: reads `llama-swap` (`/running`, `/v1/models`) and your `config.yaml`
- **Fetch a model**: downloads a GGUF file from HuggingFace (`hf download`) as a background job with a live log
- **Register**: writes the model into your `config.yaml` automatically; `llama-swap` reloads it when started with `-watch-config`
- **Maintenance**: update the container/toolbox, unload models from memory
- **Quick test**: a chat box to wake a model and check it

What it **deliberately does not** do: chat logs or detailed inference monitoring. `llama-swap`
already provides `/ui` and `/log` for that. Mission Control only adds what is missing.

## Prerequisites

- Python 3.11+
- `hf` CLI installed (`pip install -U "huggingface_hub[cli]"`)
- a running `llama-swap`, started **with `-watch-config`**; otherwise automatic registration has no effect

## Installation

```bash
sudo mkdir -p /opt/mission-control && sudo chown "$USER" /opt/mission-control
cp -r app.py static requirements.txt /opt/mission-control/
cd /opt/mission-control
python3 -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
uvicorn app:app --host 0.0.0.0 --port 9000
```

Then open it in a browser on the LAN: `http://<bosgame-ip>:9000`

### As a service (autostart)

Adjust the paths and `User=` in `mission-control.service`, then:

```bash
sudo cp mission-control.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now mission-control
```

## Configuration (environment variables)

| Variable | Default | Purpose |
|---|---|---|
| `MC_LLAMA_SWAP_URL` | `http://127.0.0.1:8080` | where `llama-swap` listens |
| `MC_CONFIG_PATH` | `/etc/llama-swap/config.yaml` | the `llama-swap` config |
| `MC_MODELS_DIR` | `/srv/models` | where GGUF files are downloaded to |
| `MC_CMD_TEMPLATE` | see below | start command per model |
| `MC_UPDATE_CMD` | _(empty)_ | command behind "Update containers" |
| `MC_DEFAULT_TTL` | `300` | seconds until auto-unload |
| `MC_TOKEN` | _(empty)_ | optional access token |
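
To make the defaults in the table concrete, here is a minimal sketch of how such variables could be resolved from the environment. The names `Settings` and `load_settings` are hypothetical and only serve the illustration; the variable names and default values are the ones listed above.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values from the table above."""
    llama_swap_url: str
    config_path: Path
    models_dir: Path
    default_ttl: int
    token: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve settings from `env`, falling back to the documented defaults."""
    return Settings(
        # A trailing slash is dropped so paths can be appended directly.
        llama_swap_url=env.get("MC_LLAMA_SWAP_URL", "http://127.0.0.1:8080").rstrip("/"),
        config_path=Path(env.get("MC_CONFIG_PATH", "/etc/llama-swap/config.yaml")),
        models_dir=Path(env.get("MC_MODELS_DIR", "/srv/models")),
        default_ttl=int(env.get("MC_DEFAULT_TTL", "300")),
        token=env.get("MC_TOKEN", ""),  # empty means no authentication
    )


# Example: override only the TTL, keep every other default.
example = load_settings({"MC_DEFAULT_TTL": "600"})
```

Passing the mapping explicitly, instead of reading `os.environ` inside the function, keeps the sketch easy to try out without touching the real environment.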

### Important: adapt `MC_CMD_TEMPLATE` to your setup

This is the command written into `config.yaml` for each model. Mission Control replaces `{model}` and `{ctx}`;
**`${PORT}` is left as-is** (`llama-swap` substitutes it itself).

Directly on the host (llama-server in PATH):
```
llama-server -m {model} --host 127.0.0.1 --port ${PORT} -c {ctx} -ngl 999 -fa 1 --no-mmap
```

Via the kyuz0 container (distrobox); this is an example, adapt it to your toolbox name:
```
distrobox enter llama-vulkan-radv -- llama-server -m {model} --host 127.0.0.1 --port ${PORT} -c {ctx} -ngl 999 -fa 1 --no-mmap
```

> `-fa 1` (Flash Attention) and `--no-mmap` are required on Strix Halo; without them you risk crashes or slowdowns.
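
The placeholder handling described in this section can be sketched in a few lines. `app.py` fills the template with plain string replacement; the helper name `render_cmd` and the model path below are hypothetical and only illustrate why `${PORT}` survives.

```python
DEFAULT_TEMPLATE = (
    "llama-server -m {model} --host 127.0.0.1 --port ${PORT} "
    "-c {ctx} -ngl 999 -fa 1 --no-mmap"
)


def render_cmd(template: str, model_path: str, ctx: int) -> str:
    """Fill in {model} and {ctx}; everything else, including ${PORT}, is untouched."""
    return template.replace("{model}", model_path).replace("{ctx}", str(ctx))


# Hypothetical model path, for illustration only.
cmd = render_cmd(DEFAULT_TEMPLATE, "/srv/models/demo/demo.gguf", 8192)

# str.format would read the braces in ${PORT} as a replacement field and fail,
# which is one reason plain replacement is the simpler choice here.
try:
    DEFAULT_TEMPLATE.format(model="x", ctx=1)
    format_failed = False
except KeyError:
    format_failed = True
```

Because only the two literal placeholders are replaced, any other braces or shell syntax in a custom template pass through to `config.yaml` unchanged.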

## ⚠️ Security

Mission Control runs commands on the host (downloads, updates) and writes your config.
**Never expose it to the open internet.** Run it only on a trusted LAN. If you
still want some protection, set `MC_TOKEN` and enter it in the top right of the dashboard.
For real remote access, use an SSH tunnel or Tailscale, not port forwarding.
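
As an illustration of the token rule, the check boils down to "no token configured means open access, otherwise the presented value must match". The sketch below is not the code in `app.py`, which compares the strings directly; `token_ok` is a hypothetical helper that uses a constant-time comparison as one possible hardening.

```python
import hmac


def token_ok(configured: str, presented: str) -> bool:
    """Return True if access should be granted.

    An empty `configured` token means authentication is switched off,
    which mirrors the empty default of MC_TOKEN.
    """
    if not configured:
        return True
    # compare_digest takes the same time regardless of where the values differ.
    return hmac.compare_digest(configured.encode("utf-8"), presented.encode("utf-8"))
```

Even with such a check in place, the token travels unencrypted over plain HTTP, so the advice to stay on a trusted LAN still applies.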

## API (if you want to script it)

`GET /api/status` · `POST /api/download` · `POST /api/register` · `POST /api/unload[?model=]`
· `POST /api/update` · `POST /api/chat` · `GET /api/jobs` · `GET /api/jobs/{id}`
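
A minimal sketch of scripting these endpoints with the Python standard library follows. `BASE_URL`, `build_request` and `call` are assumptions made for the example; the `X-MC-Token` header is the one the dashboard sends, and the repo and file names are the placeholders shown in the download form.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:9000"  # assumption: Mission Control on this host, port 9000


def build_request(method: str, path: str, body: Optional[dict] = None,
                  token: str = "") -> urllib.request.Request:
    """Build (but do not send) a request against the Mission Control API."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["X-MC-Token"] = token  # only needed when MC_TOKEN is set
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(req: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON reply."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


download_req = build_request(
    "POST", "/api/download",
    {"repo": "unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF",
     "file": "Q4_K_M/Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf"},
)
# With a running instance, the download and a status poll would look like this:
# job = call(download_req)   # reply contains "job_id" and "expected_path"
# status = call(build_request("GET", "/api/jobs/" + job["job_id"]))
```

Building the request separately from sending it makes it possible to inspect URL, method and headers before anything touches the network.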
@@ -0,0 +1,279 @@
"""
Mission Control - a lean control center for a local llama-swap stack.

What it does:
- shows configured and running models and their ports (reads llama-swap /running + /v1/models)
- downloads new GGUF models from HuggingFace (hf download, as a background job)
- registers a downloaded model in the llama-swap config.yaml automatically
- triggers updates (container / toolbox refresh)
- unloads models from memory and offers a small chat test

Deliberately KISS: one file, in-memory jobs, no database.
"""

import os
import shlex
import subprocess
import threading
import time
import uuid
from pathlib import Path

import httpx
from fastapi import Depends, FastAPI, Header, HTTPException
from fastapi.responses import FileResponse, JSONResponse
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel
from ruamel.yaml import YAML
from ruamel.yaml.scalarstring import LiteralScalarString

# ---------------------------------------------------------------------------
# Configuration (everything can be overridden via environment variables)
# ---------------------------------------------------------------------------
LLAMA_SWAP_URL = os.environ.get("MC_LLAMA_SWAP_URL", "http://127.0.0.1:8080").rstrip("/")
CONFIG_PATH = Path(os.environ.get("MC_CONFIG_PATH", "/etc/llama-swap/config.yaml"))
MODELS_DIR = Path(os.environ.get("MC_MODELS_DIR", "/srv/models"))
# Command written into config.yaml to start a model.
# {model} = path to the GGUF file, {ctx} = context length, ${PORT} is left in place for llama-swap.
# IMPORTANT: adapt this to your container / llama-server invocation (see README).
CMD_TEMPLATE = os.environ.get(
    "MC_CMD_TEMPLATE",
    "llama-server -m {model} --host 127.0.0.1 --port ${PORT} "
    "-c {ctx} -ngl 999 -fa 1 --no-mmap",
)
# Command for "update container/toolbox". Empty by default; e.g. the kyuz0 refresh script.
UPDATE_CMD = os.environ.get("MC_UPDATE_CMD", "")
DEFAULT_TTL = int(os.environ.get("MC_DEFAULT_TTL", "300"))
TOKEN = os.environ.get("MC_TOKEN", "")  # empty = no auth (LAN only!)

yaml = YAML()
yaml.preserve_quotes = True

app = FastAPI(title="Mission Control")

# ---------------------------------------------------------------------------
# Mini job system (background processes with a live log)
# ---------------------------------------------------------------------------
JOBS: dict[str, dict] = {}
_LOG_CAP = 400


def _run_job(job_id: str, args: list[str], env: dict | None = None):
    job = JOBS[job_id]
    job["state"] = "running"
    try:
        proc = subprocess.Popen(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
            env={**os.environ, **(env or {})},
        )
        for line in proc.stdout:  # type: ignore[union-attr]
            job["log"].append(line.rstrip("\n"))
            if len(job["log"]) > _LOG_CAP:
                del job["log"][0]
        proc.wait()
        job["returncode"] = proc.returncode
        job["state"] = "done" if proc.returncode == 0 else "failed"
    except Exception as exc:  # noqa: BLE001
        job["log"].append(f"[mission-control] error: {exc}")
        job["state"] = "failed"
        job["returncode"] = -1
    job["finished_at"] = time.time()


def start_job(args: list[str], label: str, env: dict | None = None) -> str:
    job_id = uuid.uuid4().hex[:12]
    JOBS[job_id] = {
        "id": job_id,
        "label": label,
        "state": "queued",
        "log": [f"$ {' '.join(shlex.quote(a) for a in args)}"],
        "returncode": None,
        "started_at": time.time(),
        "finished_at": None,
    }
    threading.Thread(target=_run_job, args=(job_id, args, env), daemon=True).start()
    return job_id


# ---------------------------------------------------------------------------
# Auth (optional)
# ---------------------------------------------------------------------------
def auth(x_mc_token: str = Header(default="")):
    if TOKEN and x_mc_token != TOKEN:
        raise HTTPException(status_code=401, detail="Wrong or missing token.")


# ---------------------------------------------------------------------------
# llama-swap helpers
# ---------------------------------------------------------------------------
def _swap_get(path: str):
    with httpx.Client(timeout=5.0) as c:
        r = c.get(f"{LLAMA_SWAP_URL}{path}")
        r.raise_for_status()
        return r.json()


def read_config() -> dict:
    if not CONFIG_PATH.exists():
        return {"models": {}}
    with CONFIG_PATH.open("r", encoding="utf-8") as f:
        data = yaml.load(f) or {}
    if "models" not in data or data["models"] is None:
        data["models"] = {}
    return data


# ---------------------------------------------------------------------------
# Request models
# ---------------------------------------------------------------------------
class DownloadReq(BaseModel):
    repo: str
    file: str
    subdir: str | None = None


class RegisterReq(BaseModel):
    alias: str
    model_path: str
    ctx: int = 8192
    ttl: int | None = None


class ChatReq(BaseModel):
    model: str
    message: str


# ---------------------------------------------------------------------------
# API
# ---------------------------------------------------------------------------
@app.get("/api/status", dependencies=[Depends(auth)])
def status():
    cfg = read_config()
    configured = {}
    for name, spec in (cfg.get("models") or {}).items():
        spec = spec or {}
        configured[name] = {
            "name": name,
            "ttl": spec.get("ttl", cfg.get("globalTTL", 0)),
            "cmd": str(spec.get("cmd", "")).strip(),
            "state": "idle",
            "port": None,
        }
    swap_ok = True
    try:
        running = _swap_get("/running")
        items = running.get("running", running) if isinstance(running, dict) else running
        for item in items or []:
            mid = item.get("model") or item.get("id") or item.get("name")
            if mid in configured:
                configured[mid]["state"] = item.get("state", "running")
                configured[mid]["port"] = item.get("port")
            elif mid:
                configured[mid] = {
                    "name": mid, "ttl": None, "cmd": "",
                    "state": item.get("state", "running"), "port": item.get("port"),
                }
    except Exception:  # noqa: BLE001
        swap_ok = False
    return {
        "swap_ok": swap_ok,
        "swap_url": LLAMA_SWAP_URL,
        "config_path": str(CONFIG_PATH),
        "models_dir": str(MODELS_DIR),
        "models": list(configured.values()),
    }


@app.post("/api/download", dependencies=[Depends(auth)])
def download(req: DownloadReq):
    sub = req.subdir or req.repo.split("/")[-1]
    target = MODELS_DIR / sub
    target.mkdir(parents=True, exist_ok=True)
    args = ["hf", "download", req.repo, req.file, "--local-dir", str(target)]
    job_id = start_job(args, f"download {req.repo}/{req.file}",
                       env={"HF_XET_HIGH_PERFORMANCE": "1"})
    JOBS[job_id]["result_path"] = str(target / req.file)
    return {"job_id": job_id, "expected_path": str(target / req.file)}


@app.post("/api/register", dependencies=[Depends(auth)])
def register(req: RegisterReq):
    if not Path(req.model_path).exists():
        raise HTTPException(404, f"File not found: {req.model_path}")
    cfg = read_config()
    cmd = CMD_TEMPLATE.replace("{model}", req.model_path).replace("{ctx}", str(req.ctx))
    cfg["models"][req.alias] = {
        "cmd": LiteralScalarString(cmd + "\n"),
        "ttl": req.ttl if req.ttl is not None else DEFAULT_TTL,
    }
    CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
    with CONFIG_PATH.open("w", encoding="utf-8") as f:
        yaml.dump(cfg, f)
    return {"ok": True, "alias": req.alias,
            "note": "Written to config.yaml. llama-swap reloads automatically when run with -watch-config."}


@app.post("/api/unload", dependencies=[Depends(auth)])
def unload(model: str | None = None):
    path = f"/api/models/unload/{model}" if model else "/api/models/unload"
    try:
        with httpx.Client(timeout=10.0) as c:
            r = c.post(f"{LLAMA_SWAP_URL}{path}")
        return {"ok": r.status_code < 400, "status": r.status_code}
    except Exception as exc:  # noqa: BLE001
        raise HTTPException(502, f"llama-swap unreachable: {exc}")


@app.post("/api/update", dependencies=[Depends(auth)])
def update():
    if not UPDATE_CMD:
        raise HTTPException(400, "No update command set (MC_UPDATE_CMD).")
    job_id = start_job(shlex.split(UPDATE_CMD), "update containers")
    return {"job_id": job_id}


@app.post("/api/chat", dependencies=[Depends(auth)])
def chat(req: ChatReq):
    payload = {"model": req.model, "messages": [{"role": "user", "content": req.message}]}
    try:
        with httpx.Client(timeout=120.0) as c:
            r = c.post(f"{LLAMA_SWAP_URL}/v1/chat/completions", json=payload)
            r.raise_for_status()
            data = r.json()
        return {"reply": data["choices"][0]["message"]["content"]}
    except Exception as exc:  # noqa: BLE001
        raise HTTPException(502, f"Request failed: {exc}")


@app.get("/api/jobs/{job_id}", dependencies=[Depends(auth)])
def job_status(job_id: str):
    job = JOBS.get(job_id)
    if not job:
        raise HTTPException(404, "Job not found.")
    return job


@app.get("/api/jobs", dependencies=[Depends(auth)])
def jobs_list():
    return sorted(JOBS.values(), key=lambda j: j["started_at"], reverse=True)[:20]


# ---------------------------------------------------------------------------
# Static UI
# ---------------------------------------------------------------------------
@app.get("/")
def index():
    return FileResponse(Path(__file__).parent / "static" / "index.html")


app.mount("/static", StaticFiles(directory=Path(__file__).parent / "static"), name="static")


@app.exception_handler(HTTPException)
def _http_exc(_req, exc: HTTPException):
    return JSONResponse(status_code=exc.status_code, content={"error": exc.detail})
@@ -0,0 +1,26 @@
[Unit]
Description=Mission Control (local LLM stack)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=tobi
WorkingDirectory=/opt/mission-control
# Adjust the paths to your stack:
Environment=MC_LLAMA_SWAP_URL=http://127.0.0.1:8080
Environment=MC_CONFIG_PATH=/etc/llama-swap/config.yaml
Environment=MC_MODELS_DIR=/srv/models
Environment=MC_DEFAULT_TTL=300
# Command to start a model (written into config.yaml). ${PORT} is left as-is! Quoted because the value contains spaces.
Environment="MC_CMD_TEMPLATE=llama-server -m {model} --host 127.0.0.1 --port ${PORT} -c {ctx} -ngl 999 -fa 1 --no-mmap"
# Update command (e.g. kyuz0 refresh or podman pull); quoted for the same reason:
Environment="MC_UPDATE_CMD=/opt/amd-strix-halo-toolboxes/refresh-toolboxes.sh all"
# Optional token for minimal protection on the LAN:
# Environment=MC_TOKEN=a-long-secret
ExecStart=/opt/mission-control/.venv/bin/uvicorn app:app --host 0.0.0.0 --port 9000
Restart=on-failure
RestartSec=3

[Install]
WantedBy=multi-user.target
@@ -0,0 +1,4 @@
fastapi>=0.110
uvicorn[standard]>=0.29
httpx>=0.27
ruamel.yaml>=0.18
@@ -0,0 +1,248 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Mission Control</title>
<style>
  :root{
    --bg:#0d1117; --panel:#151b23; --panel2:#1b222c; --line:rgba(255,255,255,.08);
    --line2:rgba(255,255,255,.14); --tx:#d7dee7; --mut:#8b97a5;
    --on:#46c06a; --warn:#e0a32e; --err:#e5534b; --act:#4493e0;
    --mono:ui-monospace,"SF Mono",Menlo,Consolas,monospace;
    --sans:system-ui,-apple-system,Segoe UI,Roboto,sans-serif;
  }
  *{box-sizing:border-box}
  body{margin:0;background:var(--bg);color:var(--tx);font-family:var(--sans);font-size:15px;line-height:1.5}
  a{color:var(--act)}
  .wrap{max-width:1040px;margin:0 auto;padding:0 20px 64px}
  header{display:flex;align-items:center;gap:16px;padding:20px 0 18px;border-bottom:1px solid var(--line);
    position:sticky;top:0;background:var(--bg);z-index:5;flex-wrap:wrap}
  .brand{font-weight:600;letter-spacing:.2px;font-size:18px}
  .brand b{color:var(--on)}
  .pill{display:inline-flex;align-items:center;gap:8px;font-family:var(--mono);font-size:12.5px;
    padding:5px 11px;border:1px solid var(--line);border-radius:999px;color:var(--mut);background:var(--panel)}
  .dot{width:8px;height:8px;border-radius:50%;background:var(--mut)}
  .dot.on{background:var(--on);box-shadow:0 0 0 0 rgba(70,192,106,.5);animation:pulse 2.2s infinite}
  .dot.off{background:var(--err)}
  @keyframes pulse{0%{box-shadow:0 0 0 0 rgba(70,192,106,.45)}70%{box-shadow:0 0 0 7px rgba(70,192,106,0)}100%{box-shadow:0 0 0 0 rgba(70,192,106,0)}}
  .spacer{flex:1}
  .tokin{font-family:var(--mono);font-size:12.5px;background:var(--panel);border:1px solid var(--line);
    color:var(--tx);border-radius:8px;padding:6px 9px;width:130px}
  h2{font-size:14px;font-weight:600;letter-spacing:.4px;text-transform:uppercase;color:var(--mut);margin:0 0 12px}
  .grid{display:grid;grid-template-columns:1fr 1fr;gap:18px;margin-top:22px}
  @media(max-width:780px){.grid{grid-template-columns:1fr}}
  .card{background:var(--panel);border:1px solid var(--line);border-radius:14px;padding:18px}
  .card.full{grid-column:1/-1}
  table{width:100%;border-collapse:collapse;font-size:14px}
  th{text-align:left;font-weight:500;color:var(--mut);font-size:12px;text-transform:uppercase;
    letter-spacing:.4px;padding:0 10px 9px}
  td{padding:11px 10px;border-top:1px solid var(--line)}
  .mid{font-family:var(--mono);font-size:13px;color:#e8eef5}
  .badge{font-family:var(--mono);font-size:11.5px;padding:3px 9px;border-radius:6px;display:inline-block}
  .b-run{background:rgba(70,192,106,.14);color:var(--on)}
  .b-idle{background:rgba(139,151,165,.14);color:var(--mut)}
  .b-load{background:rgba(224,163,46,.16);color:var(--warn)}
  .port{font-family:var(--mono);color:var(--mut);font-size:13px}
  .empty{color:var(--mut);font-size:14px;padding:14px 4px}
  label{display:block;font-size:12.5px;color:var(--mut);margin:0 0 5px}
  input,textarea,select{width:100%;background:var(--panel2);border:1px solid var(--line);color:var(--tx);
    border-radius:8px;padding:9px 11px;font-family:var(--mono);font-size:13px;margin-bottom:12px}
  textarea{resize:vertical;min-height:64px;font-family:var(--sans)}
  .row{display:flex;gap:10px}.row>div{flex:1}
  button{font-family:var(--sans);font-size:13.5px;font-weight:500;border:1px solid var(--line2);
    background:var(--panel2);color:var(--tx);border-radius:8px;padding:9px 15px;cursor:pointer}
  button:hover{border-color:var(--act)}
  button.primary{background:var(--act);border-color:var(--act);color:#fff}
  button.primary:hover{filter:brightness(1.08)}
  button.ghost{padding:5px 11px;font-size:12.5px}
  button:disabled{opacity:.5;cursor:not-allowed}
  .reply{margin-top:12px;background:var(--panel2);border:1px solid var(--line);border-radius:8px;
    padding:12px;white-space:pre-wrap;font-size:14px;min-height:20px;color:var(--tx)}
  .log{font-family:var(--mono);font-size:12px;line-height:1.65;background:#0a0e13;border:1px solid var(--line);
    border-radius:8px;padding:12px;max-height:240px;overflow:auto;white-space:pre-wrap;color:#aeb9c4}
  .job{border:1px solid var(--line);border-radius:8px;margin-bottom:8px;overflow:hidden}
  .job-h{display:flex;align-items:center;gap:10px;padding:10px 12px;cursor:pointer}
  .job-h .mid{flex:1}
  .toast{position:fixed;bottom:20px;left:50%;transform:translateX(-50%);background:#1b222c;
    border:1px solid var(--line2);border-radius:10px;padding:11px 16px;font-size:13.5px;
    opacity:0;transition:.25s;pointer-events:none;max-width:90vw}
  .toast.show{opacity:1}
  .toast.err{border-color:var(--err);color:#ffb4ae}
  .hint{font-size:12px;color:var(--mut);margin:-4px 0 12px}
  .mono-sm{font-family:var(--mono);font-size:11.5px;color:var(--mut)}
</style>
</head>
<body>
<div class="wrap">
  <header>
    <span class="brand">Mission <b>Control</b></span>
    <span class="pill"><span id="hdot" class="dot"></span><span id="hlabel">connecting…</span></span>
    <span class="spacer"></span>
    <input id="token" class="tokin" placeholder="Token (optional)" autocomplete="off">
  </header>

  <div class="grid">
    <!-- MODELS -->
    <div class="card full">
      <h2>Models &amp; Ports</h2>
      <table>
        <thead><tr><th>Model</th><th>Status</th><th>Port</th><th style="text-align:right">Action</th></tr></thead>
        <tbody id="models"></tbody>
      </table>
      <div id="models-empty" class="empty" style="display:none">No models configured yet. Pull one in below. 👇</div>
    </div>

    <!-- DOWNLOAD -->
    <div class="card">
      <h2>Fetch a model</h2>
      <label>HuggingFace repo</label>
      <input id="dl-repo" placeholder="unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF">
      <label>File (GGUF)</label>
      <input id="dl-file" placeholder="Q4_K_M/Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf">
      <button class="primary" onclick="pull()">Download model</button>
      <div id="register-box" style="display:none;margin-top:16px;border-top:1px solid var(--line);padding-top:14px">
        <h2>Register</h2>
        <div class="row">
          <div><label>Alias</label><input id="rg-alias"></div>
          <div><label>Context</label><input id="rg-ctx" value="8192"></div>
        </div>
        <input id="rg-path" class="mono-sm" readonly>
        <button class="primary" onclick="register()">Add to config</button>
      </div>
    </div>

    <!-- MAINTENANCE + TEST -->
    <div class="card">
      <h2>Maintenance</h2>
      <button onclick="update()">Update containers</button>
      <button onclick="unloadAll()">Unload everything</button>
      <div class="hint" style="margin-top:10px">The update command is set via <span class="mono-sm">MC_UPDATE_CMD</span>.</div>
      <h2 style="margin-top:18px">Quick test</h2>
      <select id="chat-model"></select>
      <textarea id="chat-msg" placeholder="Type something to wake a model…"></textarea>
      <button class="primary" onclick="sendChat()" id="chat-btn">Send</button>
      <div id="chat-reply" class="reply" style="display:none"></div>
    </div>

    <!-- ACTIVITY -->
    <div class="card full">
      <h2>Activity</h2>
      <div id="jobs"></div>
      <div id="jobs-empty" class="empty">Nothing started yet.</div>
    </div>
  </div>
</div>
<div id="toast" class="toast"></div>

<script>
const $ = s => document.querySelector(s);
// Escape text before it is interpolated into innerHTML.
const esc = s => String(s).replace(/&/g,"&amp;").replace(/</g,"&lt;").replace(/>/g,"&gt;").replace(/"/g,"&quot;");
let TOKEN = localStorage.getItem("mc_token") || "";
$("#token").value = TOKEN;
$("#token").addEventListener("change", e => { TOKEN = e.target.value.trim(); localStorage.setItem("mc_token", TOKEN); refresh(); });

function hdr(){ return TOKEN ? {"Content-Type":"application/json","X-MC-Token":TOKEN} : {"Content-Type":"application/json"}; }
async function api(path, opts={}){
  const r = await fetch(path, {headers: hdr(), ...opts});
  const data = await r.json().catch(()=>({}));
  if(!r.ok) throw new Error(data.error || ("HTTP "+r.status));
  return data;
}
let _tt;
function toast(msg, err=false){
  const t = $("#toast"); t.textContent = msg; t.className = "toast show" + (err?" err":"");
  clearTimeout(_tt); _tt = setTimeout(()=>t.className="toast",3200);
}

function badge(state){
  if(state==="running" || state==="ready") return '<span class="badge b-run">loaded</span>';
  if(state==="loading" || state==="starting") return '<span class="badge b-load">loading…</span>';
  return '<span class="badge b-idle">standby</span>';
}

async function refresh(){
  let s;
  try{ s = await api("/api/status"); }
  catch(e){ $("#hdot").className="dot off"; $("#hlabel").textContent="Backend unreachable"; return; }
  const ok = s.swap_ok;
  $("#hdot").className = "dot " + (ok?"on":"off");
  $("#hlabel").textContent = (ok?"llama-swap online · ":"llama-swap offline · ") + s.swap_url.replace(/^https?:\/\//,"");

  const tb = $("#models"); tb.innerHTML = "";
  $("#models-empty").style.display = s.models.length ? "none" : "block";
  const sel = $("#chat-model"); const cur = sel.value; sel.innerHTML = "";
  for(const m of s.models){
    const tr = document.createElement("tr");
    tr.innerHTML = `<td class="mid">${esc(m.name)}</td><td>${badge(m.state)}</td>
      <td class="port">${esc(m.port ?? "auto")}</td>
      <td style="text-align:right"><button class="ghost">Unload</button></td>`;
    // Attach the handler in code so model names never end up inside an inline script.
    tr.querySelector("button").addEventListener("click", () => unloadOne(m.name));
    tb.appendChild(tr);
    sel.add(new Option(m.name));
  }
  if(cur) sel.value = cur;
}

async function pull(){
  const repo = $("#dl-repo").value.trim(), file = $("#dl-file").value.trim();
  if(!repo || !file) return toast("Please enter repo and file.", true);
  try{
    const r = await api("/api/download", {method:"POST", body: JSON.stringify({repo, file})});
    toast("Download started.");
    const stem = file.split("/").pop().replace(/\.gguf$/i,"");
    $("#rg-alias").value = stem; $("#rg-path").value = r.expected_path;
    $("#register-box").style.display = "block";
    trackJob(r.job_id);
  }catch(e){ toast(e.message, true); }
}

async function register(){
  const alias = $("#rg-alias").value.trim(), model_path = $("#rg-path").value, ctx = parseInt($("#rg-ctx").value, 10)||8192;
  if(!alias) return toast("Please enter an alias.", true);
  try{ await api("/api/register",{method:"POST",body:JSON.stringify({alias,model_path,ctx})});
    toast("Registered. llama-swap is reloading."); refresh();
  }catch(e){ toast(e.message, true); }
}

async function unloadOne(m){ try{ await api("/api/unload?model="+encodeURIComponent(m),{method:"POST"}); toast("Unloaded: "+m); setTimeout(refresh,600);}catch(e){toast(e.message,true);} }
async function unloadAll(){ try{ await api("/api/unload",{method:"POST"}); toast("All models unloaded."); setTimeout(refresh,600);}catch(e){toast(e.message,true);} }
async function update(){ try{ const r = await api("/api/update",{method:"POST"}); toast("Update running."); trackJob(r.job_id);}catch(e){toast(e.message,true);} }

async function sendChat(){
  const model = $("#chat-model").value, message = $("#chat-msg").value.trim();
  if(!model) return toast("No model available.", true);
  if(!message) return;
  const btn = $("#chat-btn"); btn.disabled = true; btn.textContent = "…";
  const box = $("#chat-reply"); box.style.display="block"; box.textContent="(waking the model, a swap can take a moment…)";
  try{ const r = await api("/api/chat",{method:"POST",body:JSON.stringify({model,message})}); box.textContent = r.reply; }
  catch(e){ box.textContent = "Error: "+e.message; }
  btn.disabled=false; btn.textContent="Send"; refresh();
}

// --- Jobs ---
const tracked = new Set();
function trackJob(id){ tracked.add(id); renderJobs(); }
async function renderJobs(){
  let jobs;
  try{ jobs = await api("/api/jobs"); }catch(e){ return; }
  $("#jobs-empty").style.display = jobs.length ? "none" : "block";
  const c = $("#jobs"); c.innerHTML = "";
  for(const j of jobs){
    const open = tracked.has(j.id);
    const st = j.state==="done" ? '<span class="badge b-run">done</span>'
      : j.state==="failed" ? '<span class="badge" style="background:rgba(229,83,75,.16);color:#e5534b">failed</span>'
      : '<span class="badge b-load">running…</span>';
    const div = document.createElement("div"); div.className="job";
    div.innerHTML = `<div class="job-h" onclick="toggleJob('${esc(j.id)}')">
      <span class="mid">${esc(j.label)}</span>${st}</div>
      ${open ? `<div class="log">${esc((j.log||[]).join("\n"))}</div>` : ""}`;
    c.appendChild(div);
  }
}
function toggleJob(id){ tracked.has(id) ? tracked.delete(id) : tracked.add(id); renderJobs(); }

refresh(); renderJobs();
setInterval(refresh, 3000);
setInterval(renderJobs, 1500);
</script>
</body>
</html>