📦 TurboQuant+ KV Cache Compression: 6.4x smaller KV cache

v1.0.0

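TurboQuant+ losslessly compresses the llama.cpp KV cache by up to 6.4x on Apple Silicon. GPU memory use drops sharply, so you can run larger models and longer contexts with almost no loss in inference speed.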
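To get a feel for what a 6.4x reduction means in practice, here is a rough sketch of the usual KV cache size estimate. The model shape (32 layers, 8 KV heads, head dimension 128, 32768-token context) and the fp16 baseline are illustrative assumptions, not figures from this skill; the 6.4x ratio is simply the figure claimed above, applied as a divisor.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Approximate KV cache size: one K and one V vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


GIB = 1024 ** 3

# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=32768)

# Apply the compression ratio claimed on this page (an assumption, not a measurement).
compressed = baseline / 6.4

print(f"fp16 baseline: {baseline / GIB:.2f} GiB")
print(f"at 6.4x:       {compressed / GIB:.2f} GiB")
```

Substituting your own model's layer count, KV head count, head dimension, and context length gives a first-order estimate of how much memory the cache would need before and after compression.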

Last updated: 2026/4/4
Security scan
VirusTotal: Suspicious
OpenClaw: Safe (high confidence)
The skill's instructions, required actions, and lack of credentials align with its stated purpose of configuring TurboQuant+ for llama.cpp on Apple Silicon; it does recommend building third‑party code and making a privileged sysctl change, which are expected for this use but carry normal operational risk.
Assessment recommendations
This skill appears coherent for configuring TurboQuant+ with llama.cpp, but follow these precautions before proceeding: 1) Verify the external GitHub fork (TheTom/llama-cpp-turboquant) is the intended project and review its source/commit history before building. 2) Build and run the code in an isolated or trusted environment (container, dedicated machine) if possible. 3) Be cautious with the suggested sudo sysctl change (iogpu.wired_limit_mb): it requires elevated privileges and changes system G...
Detailed analysis
Purpose and capability
Name/description claim KV cache compression for llama.cpp on Apple Silicon; the SKILL.md and README exclusively describe using a TurboQuant llama.cpp fork, relevant CLI flags, and platform-specific tuning. No unrelated credentials, binaries, or services are requested.
Instruction scope
Instructions stay on-topic (clone/build the turboquant fork, run llama-server with cache-type flags). They also recommend a system-level change (sudo sysctl iogpu.wired_limit_mb) to raise GPU memory caps for large contexts — this is relevant to the stated goal but requires elevated privileges and modifies system state. No instructions collect or transmit user data to unexpected endpoints.
Install mechanism
The skill is instruction-only (no install spec), but its README instructs cloning and building a GitHub repository (TheTom/llama-cpp-turboquant). Downloading and compiling third-party code from GitHub is common for this domain but is a moderate operational risk if the repository is untrusted or has malicious contents. The skill itself does not provide an automated installer or opaque download URLs.
Credential requirements
No environment variables, credentials, or config paths are requested. The requested actions (build/run a local server, sysctl) are proportionate to compressing KV caches for local inference.
Persistence and privileges
Skill does not request persistent inclusion (always: false) and does not attempt to modify other skills or agent-wide configs. It does recommend a one-off privileged sysctl change (requires sudo) which alters system GPU memory limits until reboot; this is a legitimate but privileged action and not an automatic persistent installation by the skill.
Security is layered; review the code before running it.

Runtime dependencies

No special dependencies

Versions

latest: v1.0.0 (2026/4/4)

v1.0: TurboQuant+ KV cache compression guide, supporting local LLM inference on Apple Silicon

Install command

Official: npx clawhub@latest install turboquant-plus
Mirror (faster in China): npx clawhub@latest install turboquant-plus --registry https://cn.longxiaskill.com
Data source: ClawHub · Chinese localization: 龙虾技能库