LLM module: how to use another LLM model?
-
Hi everyone,
I am new to the module.
I want to switch to a different LLM model, since new ones are available
(e.g. llama3.2-1b-prefill-ax630c or qwen2.5-1.5b-ax630c).
In short, I can't get any other model to load.
Any suggestions, or a pointer to the proper documentation? (I did not find any topic about changing the model via Arduino.)
What have I done so far?
- logged into the LLM module via serial
- ran "ip a" to get its IP address
- connected via ssh root@ip (through the Ethernet companion board)
- loaded my SSH public key, then reconnected over ssh
- successfully installed other models via apt-get install xxx
- rebooted (just in case)
Then I tested over serial (from an M5Stack Core Gray running a simple Serial > Serial2 forwarding app).
The sequence starts with a reset:
{ "request_id": "11212155", "work_id": "sys", "action": "reset" }
{"created":1746310691,"data":"None","error":{"code":0,"message":"llm server restarting ..."},"object":"None","request_id":"11212155","work_id":"sys"}
{"request_id": "0","work_id": "sys","created": 1746310696,"error":{"code":0, "message":"reset over"}}
Then I load the model:
{ "request_id": "3", "work_id": "llm", "action": "setup","object": "llm.setup", "data": { "model": "qwen2.5-1.5b-ax630c", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
{"created":1746310710,"data":"None","error":{"code":-5,"message":"Model loading failed."},"object":"None","request_id":"3","work_id":"llm"}
But it works with:
{ "request_id": "3", "work_id": "llm", "action": "setup", "object": "llm.setup", "data": { "model": "qwen2.5-0.5B-prefill-20e", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
{"created":1746310813,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"3","work_id":"llm.1004"} -
@erictiquet
Have you checked this:
https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html -
Hello Kuriko, everyone,
I found the solution ;)
The best help came from:
- chat.m5stack.com (support to you guys for making it stable)
- ChatGPT, to write a small Arduino .ino sketch that serves a web page (the code is below)
- and the following page for the LLM JSON syntax (that is what gets exchanged in the dialog):
https://github.com/m5stack/StackFlow/blob/main/doc/projects_llm_framework_doc/llm_llm_en.md
** First install the new models; they should appear in /opt/m5stack/data/
Connect via SSH (as on a normal Linux server):
- this requires plugging in the RJ45 cable, then either opening the debug port over serial and typing "ip a" to get the IP, or looking the address up on your DHCP server
- to be safe and make things easier, I suggest uploading your SSH key to the LLM module (ssh-copy-id)
- the default login is root@<your ip> with password "123456"; change it once your key has been loaded successfully.
** Then install the new models (for example):
apt-get install llm-model-llama3.2-1b-prefill-ax630c llm-model-qwen2.5-1.5b-p256-ax630c
You should see something like:
root@m5stack-LLM:/# ls -la /opt/m5stack/data
total 68
drwxrwxr-x 17 root root 4096 May 4 07:23 .
drwxrwxr-x 7 root root 4096 Feb 20 21:24 ..
drwxrwxr-x 2 root root 4096 Dec 5 17:03 audio
drwxrwxr-x 3 1000 1000 4096 May 4 04:48 llama3.2-1B-prefill-ax630c
drwxrwxr-x 2 root root 4096 Dec 5 17:03 melotts_zh-cn
drwxrwxr-x 2 root root 4096 May 4 07:25 models
drwxrwxr-x 2 1000 1000 4096 May 4 05:50 qwen2.5-0.5B-prefill-20e
drwxr-xr-x 3 1000 1000 4096 May 4 04:49 qwen2.5-1.5B-p256-ax630c
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-20M-2023-02-17
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01
drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01
drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_english_fast
drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_fast
drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n
drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-pose
drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-seg
Watch the MMC space with the "df" command: with 2 more models installed you reach 74% of the available space.
(How to use an SD card for additional storage space is another topic to tackle.)
root@m5stack-LLM:/# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 29289340 21660980 7611976 74% /
tmpfs 490876 0 490876 0% /dev/shm
tmpfs 196352 876 195476 1% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 490876 0 490876 0% /tmp
/dev/mmcblk1p1 30554112 3424 30550688 1% /mnt/mmcblk1p1
tmpfs 98172 0 98172 0% /run/user/0
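root@m5stack-LLM:/#
Then, to use a model, set the "model" field to the exact name of the folder the model was installed in. This can differ from the apt package name (note the uppercase "B" in the folder names above).
e.g. llama3.2-1B-prefill-ax630c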
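A minimal C++17 sketch of why the first attempt may have failed, assuming the module compares the "model" field exactly (case-sensitively) against the installed names, as the folder listing above suggests. The names lookupModel, Match and Lookup are hypothetical and only serve the illustration; they are not part of StackFlow.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Lowercase copy of s (ASCII only), used for the case-insensitive fallback.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

enum class Match { Exact, CaseMismatch, NotFound };

struct Lookup {
    Match kind;              // how the requested name relates to the installed ones
    std::string suggestion;  // installed name to use, empty when nothing matches
};

// Compares a requested model name against the installed folder names.
// An exact match wins; otherwise a case-insensitive match is reported
// as a likely typo together with the correctly spelled name.
Lookup lookupModel(const std::vector<std::string>& installed,
                   const std::string& requested) {
    assert(!requested.empty());  // an empty model name is a caller bug
    for (const auto& name : installed) {
        if (name == requested) return {Match::Exact, name};
    }
    const std::string wanted = toLower(requested);
    for (const auto& name : installed) {
        if (toLower(name) == wanted) return {Match::CaseMismatch, name};
    }
    return {Match::NotFound, ""};
}
```

With the three LLM folders from the listing, "qwen2.5-1.5b-p256-ax630c" is reported as a case mismatch with the suggestion "qwen2.5-1.5B-p256-ax630c", while "qwen2.5-1.5b-ax630c" from the first post is not found at all because the folder name also contains "-p256-".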
To play with the model, you can use the .ino web page and enter the following JSON:
{
"request_id": "2",
"work_id": "llm",
"action": "setup",
"object": "llm.setup",
"data": {
"model": "llama3.2-1B-prefill-ax630c",
"response_format": "llm.utf-8.stream",
"input": "llm.utf-8",
"enoutput": true,
"max_token_len": 256,
"prompt": "You are a helpful AI assistant."
}
}
You should receive a reply like:
{"created":1746846795,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"2","work_id":"llm.1004"}
Take the last value, the work_id (here "llm.1004", written "llm.xxxx" below), and send a prompt:
{
"request_id": "2",
"work_id": "llm.xxxx",
"action": "inference",
"object": "llm.utf-8.stream",
"data": {
"delta": "What's ur name?",
"index": 0,
"finish": true
}
}
Then you will see something like:
{"created":1746846972,"data":{"delta":"I'm an","finish":false,"index":0},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846973,"data":{"delta":" artificial intelligence model","finish":false,"index":1},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846974,"data":{"delta":" known as L","finish":false,"index":2},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846974,"data":{"delta":"lama. L","finish":false,"index":3},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846975,"data":{"delta":"lama stands for","finish":false,"index":4},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846976,"data":{"delta":" \"Large Language","finish":false,"index":5},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846976,"data":{"delta":" Model Meta AI","finish":false,"index":6},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846977,"data":{"delta":".\"","finish":false,"index":7},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
{"created":1746846977,"data":{"delta":"","finish":true,"index":8},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
Et voilà.
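On the host side, replies like the ones above can be handled with a few string helpers. The sketch below (C++17) pulls the work_id out of a setup reply and joins the streamed "delta" pieces until "finish" is true. It is only a minimal scanner, assuming compact replies with no space after the colon, as in the logs above. The helper names extractStringField, isFinalChunk and joinDeltas are hypothetical, and a real JSON library (ArduinoJson on the Core, for example) would be more robust.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Returns the value of the first `"key":"..."` string field found in json.
// Handles the \" \\ \/ \n \t escapes; \uXXXX sequences are not decoded.
std::optional<std::string> extractStringField(const std::string& json,
                                              const std::string& key) {
    assert(!key.empty());
    const std::string marker = "\"" + key + "\":\"";
    const std::size_t start = json.find(marker);
    if (start == std::string::npos) return std::nullopt;

    std::string value;
    for (std::size_t i = start + marker.size(); i < json.size(); ++i) {
        const char c = json[i];
        if (c == '"') return value;  // unescaped quote closes the string
        if (c == '\\' && i + 1 < json.size()) {
            const char next = json[++i];
            switch (next) {
                case 'n': value += '\n'; break;
                case 't': value += '\t'; break;
                default:  value += next; break;  // covers \" \\ and \/
            }
            continue;
        }
        value += c;
    }
    return std::nullopt;  // string was never terminated
}

// True when the reply is the last chunk of a stream ("finish":true).
bool isFinalChunk(const std::string& json) {
    return json.find("\"finish\":true") != std::string::npos;
}

// Concatenates the "delta" pieces of a stream, stopping at the final chunk.
std::string joinDeltas(const std::vector<std::string>& replies) {
    std::string text;
    for (const auto& reply : replies) {
        if (auto delta = extractStringField(reply, "delta")) text += *delta;
        if (isFinalChunk(reply)) break;
    }
    return text;
}
```

Calling extractStringField(reply, "work_id") on the setup reply gives the value to put in the "work_id" field of the inference request, and joinDeltas on the collected stream lines rebuilds the full answer.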
-
The .ino code I used:
#include <Arduino.h>
#include <M5Unified.h>
#include <WiFi.h>
#include <WebServer.h>

// WiFi configuration
const char* ssid = "your ssid";
const char* password = "your password";

// UART2 for the LLM module
HardwareSerial LLM(2); // GPIO16 = RX, GPIO17 = TX
WebServer server(80);
String lastSerialMessage = "";

// Simple escaping to avoid HTML display problems
String htmlEscape(String text) {
  text.replace("&", "&amp;"); // must run first so the entities below are not re-escaped
  text.replace("<", "&lt;");
  text.replace(">", "&gt;");
  text.replace("\"", "&quot;");
  text.replace("'", "&#39;");
  return text;
}

// Dynamic HTML page
String getHTMLPage() {
  String html = R"rawliteral(
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>M5Core Web Serial</title>
</head>
<body>
<h1>M5Core Web Serial</h1>
<textarea id="msg" rows="15" cols="70" placeholder="Your message here..."></textarea><br>
<button onclick="sendMessage()">Send</button>
<button onclick="clearOutput()">Clear</button>
<p><strong>Serial response:</strong></p>
<pre id="lastMessage" style="background:#eee; padding:10px; border:1px solid #ccc;"></pre>
<script>
function sendMessage() {
  const msg = document.getElementById("msg").value;
  fetch("/send", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: "msg=" + encodeURIComponent(msg)
  }).then(response => response.text())
    .then(text => { document.getElementById("lastMessage").innerText = text; });
}
function clearOutput() {
  fetch("/clear").then(r => r.text()).then(txt => {
    document.getElementById("lastMessage").innerText = "";
  });
}
// Poll the latest serial output every 2 seconds
setInterval(() => {
  fetch("/last").then(r => r.text()).then(txt => {
    document.getElementById("lastMessage").innerText = txt;
  });
}, 2000);
</script>
</body>
</html>
)rawliteral";
  // Has no effect unless the page above contains a %LAST_MESSAGE% placeholder
  html.replace("%LAST_MESSAGE%", htmlEscape(lastSerialMessage));
  return html;
}

void handleRoot() {
  server.send(200, "text/html", getHTMLPage());
}

void handleSend() {
  if (server.hasArg("msg")) {
    String msg = server.arg("msg");
    LLM.println(msg); // forward the request to the LLM module
    lastSerialMessage = "Sent: " + msg;
    server.send(200, "text/plain", "Sent: " + msg);
  } else {
    server.send(400, "text/plain", "Missing 'msg' argument");
  }
}

void handleLast() {
  server.send(200, "text/plain", lastSerialMessage);
}

void handleClear() {
  lastSerialMessage = "";
  server.send(200, "text/plain", "Cleared");
}

void setup() {
  M5.begin();
  M5.Lcd.setTextSize(2);
  M5.Lcd.println("Initializing...");

  Serial.begin(115200); // USB link to the PC
  LLM.begin(115200, SERIAL_8N1, 16, 17); // RX2, TX2

  WiFi.begin(ssid, password);
  M5.Lcd.print("Connecting to WiFi");
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    M5.Lcd.print(".");
  }
  M5.Lcd.println("\nConnected");
  M5.Lcd.println(WiFi.localIP());

  server.on("/", handleRoot);
  server.on("/send", HTTP_POST, handleSend);
  server.on("/last", handleLast);
  server.on("/clear", handleClear);
  server.begin();
  M5.Lcd.println("Web server running!");
}

void loop() {
  server.handleClient();

  // PC -> LLM module, echoed back to the PC
  if (Serial.available()) {
    char c = Serial.read();
    LLM.write(c);
    Serial.print(c);
  }

  // LLM module -> PC, LCD and web page buffer
  if (LLM.available()) {
    char c = LLM.read();
    Serial.print(c);
    lastSerialMessage += c;
    M5.Lcd.print(c);
    // Keep only the last 20000 characters
    if (lastSerialMessage.length() > 20000) {
      lastSerialMessage = lastSerialMessage.substring(lastSerialMessage.length() - 20000);
    }
  }
}
-
