    M5Stack Community

    LLM module: how to use another LLM model?

    • erictiquet

      Hi Everyone,

      I am new to the LLM module and want to switch to another LLM model, since new ones are available
      (e.g. llama3.2-1b-prefill-ax630c or qwen2.5-1.5b-ax630c).
      In short, I can't manage to load any other model.

      Any suggestions, or a pointer to the proper documentation? (I did not find any topic about changing the model via Arduino.)


      What I have done so far:

      • logged into the LLM module via serial
      • ran "ip a" to get the IP address
      • connected via ssh root@ip (via the Ethernet companion board)
      • loaded my SSH public key, then connected with ssh
      • successfully installed other models via apt-get install xxx
      • rebooted (just in case)

      Then I tested over serial text (from an M5Stack Core grey running a simple Serial > Serial2 forwarding app)
      with the following sequence:

      reset :

      { "request_id": "11212155", "work_id": "sys", "action": "reset" }
      {"created":1746310691,"data":"None","error":{"code":0,"message":"llm server restarting ..."},"object":"None","request_id":"11212155","work_id":"sys"}
      {"request_id": "0","work_id": "sys","created": 1746310696,"error":{"code":0, "message":"reset over"}}
      then...

      load model :

      { "request_id": "3", "work_id": "llm", "action": "setup","object": "llm.setup", "data": { "model": "qwen2.5-1.5b-ax630c", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
      {"created":1746310710,"data":"None","error":{"code":-5,"message":"Model loading failed."},"object":"None","request_id":"3","work_id":"llm"}

      but it works with...

      { "request_id": "3", "work_id": "llm", "action": "setup", "object": "llm.setup", "data": { "model": "qwen2.5-0.5B-prefill-20e", "response_format": "llm.utf-8.stream", "input": "llm.utf-8", "enoutput": true, "max_token_len": 256, "prompt": "You are a knowledgeable assistant capable of answering various questions and providing information." } }
      {"created":1746310813,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"3","work_id":"llm.1004"}

      • kuriko

        @erictiquet
        Have you checked this:
        https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html


        • erictiquet

          Hello Kuriko, everyone,

          I found the solution ;)

          The best help came from:

          • chat.m5stack.com (kudos to you guys for making it stable)
          • ChatGPT, to write a little Arduino .ino sketch that serves a web page (the code is below)
          • and the following page for the LLM JSON syntax (that is what is exchanged in the dialog):
            https://github.com/m5stack/StackFlow/blob/main/doc/projects_llm_framework_doc/llm_llm_en.md

          ** First install the new models; they should appear in /opt/m5stack/data/

          Connect via SSH (as with a normal Linux server):

          • this requires that you plug in the RJ45 cable, then either access the debug port via serial and type "ip a" to get the IP, or look the address up on your DHCP server
          • to be safe and to make the work easier, I suggest you upload your SSH key to the LLM module (ssh-copy-id)
          • the default login is root@<your ip> with password "123456"; change it after loading your key successfully.

          ** Then install the new models (for example):
          apt-get install llm-model-llama3.2-1b-prefill-ax630c llm-model-qwen2.5-1.5b-p256-ax630c

          You should see something like:

          root@m5stack-LLM:/# ls -la /opt/m5stack/data
          total 68
          drwxrwxr-x 17 root root 4096 May 4 07:23 .
          drwxrwxr-x 7 root root 4096 Feb 20 21:24 ..
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 audio
          drwxrwxr-x 3 1000 1000 4096 May 4 04:48 llama3.2-1B-prefill-ax630c
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 melotts_zh-cn
          drwxrwxr-x 2 root root 4096 May 4 07:25 models
          drwxrwxr-x 2 1000 1000 4096 May 4 05:50 qwen2.5-0.5B-prefill-20e
          drwxr-xr-x 3 1000 1000 4096 May 4 04:49 qwen2.5-1.5B-p256-ax630c
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-20M-2023-02-17
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_english_fast
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 single_speaker_fast
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-pose
          drwxrwxr-x 2 root root 4096 Dec 5 17:03 yolo11n-seg
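The first attempt in this thread asked for "qwen2.5-1.5b-ax630c", while the folder in the listing above is "qwen2.5-1.5B-p256-ax630c". A minimal sketch of a check for that kind of mismatch follows, meant to be compiled as C++17 on the module's Linux side. The function names are made up for illustration, and the idea that the model name has to equal the folder name is taken from this thread, not from the StackFlow documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Returns the sorted names of all sub-directories of data_dir.
// A missing or unreadable directory yields an empty list.
std::vector<std::string> list_data_dirs(const fs::path& data_dir) {
    std::vector<std::string> names;
    std::error_code ec;
    for (fs::directory_iterator it(data_dir, ec), end; !ec && it != end; it.increment(ec)) {
        std::error_code entry_ec;
        if (it->is_directory(entry_ec)) {
            names.push_back(it->path().filename().string());
        }
    }
    std::sort(names.begin(), names.end());
    return names;
}

// True only when `model` equals a directory name byte for byte (case-sensitive).
bool model_is_installed(const fs::path& data_dir, const std::string& model) {
    assert(!model.empty());
    const std::vector<std::string> names = list_data_dirs(data_dir);
    return std::binary_search(names.begin(), names.end(), model);
}
```

With the listing above, model_is_installed("/opt/m5stack/data", "qwen2.5-1.5b-ax630c") would return false, while "qwen2.5-1.5B-p256-ax630c" would return true. Note that the data folder also holds non-LLM directories (audio, yolo11n, ...), so a match only says the folder exists.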

          Watch the MMC space with the "df" command: with 2 more models you reach 74% of the available space.
          (How to use an SD card for additional storage space is another topic to tackle.)

          root@m5stack-LLM:/# df
          Filesystem 1K-blocks Used Available Use% Mounted on
          /dev/root 29289340 21660980 7611976 74% /
          tmpfs 490876 0 490876 0% /dev/shm
          tmpfs 196352 876 195476 1% /run
          tmpfs 5120 0 5120 0% /run/lock
          tmpfs 490876 0 490876 0% /tmp
          /dev/mmcblk1p1 30554112 3424 30550688 1% /mnt/mmcblk1p1
          tmpfs 98172 0 98172 0% /run/user/0
          root@m5stack-LLM:/#

          Then, to use it, set the "model" field to the name of the model's folder in /opt/m5stack/data, exactly as it is written there (including upper and lower case),
          e.g. llama3.2-1B-prefill-ax630c
          To play with the model you can use the .ino web page and enter the following JSON:

          {
          "request_id": "2",
          "work_id": "llm",
          "action": "setup",
          "object": "llm.setup",
          "data": {
          "model": "llama3.2-1B-prefill-ax630c",
          "response_format": "llm.utf-8.stream",
          "input": "llm.utf-8",
          "enoutput": true,
          "max_token_len": 256,
          "prompt": "You are a helpful AI assistant."
          }
          }

          You should receive a reply like:

          {"created":1746846795,"data":"None","error":{"code":0,"message":""},"object":"None","request_id":"2","work_id":"llm.1004"}
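A small sketch of how a program could read that reply instead of copying the value by hand. It assumes the compact formatting shown in the setup replies in this thread (no spaces around the colons) and treats error code 0 as success, which is what the replies here suggest. The function names are hypothetical, and the string search is a shortcut, not a real JSON parser.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Returns the text between `"key":"` and the next double quote.
// Only suitable for values without escaped quotes, such as work_id.
std::optional<std::string> plain_string_field(const std::string& json, const std::string& key) {
    assert(!key.empty());
    const std::string needle = "\"" + key + "\":\"";
    const std::size_t start = json.find(needle);
    if (start == std::string::npos) return std::nullopt;
    const std::size_t begin = start + needle.size();
    const std::size_t end = json.find('"', begin);
    if (end == std::string::npos) return std::nullopt;
    return json.substr(begin, end - begin);
}

// Reads the integer that follows `"code":` (for example 0 or -5).
std::optional<int> reply_error_code(const std::string& json) {
    const std::string needle = "\"code\":";
    const std::size_t found = json.find(needle);
    if (found == std::string::npos) return std::nullopt;
    const std::size_t begin = found + needle.size();
    std::size_t end = begin;
    if (end < json.size() && json[end] == '-') ++end;
    const std::size_t first_digit = end;
    while (end < json.size() && std::isdigit(static_cast<unsigned char>(json[end]))) ++end;
    if (end == first_digit) return std::nullopt;  // no digits after the key
    return std::stoi(json.substr(begin, end - begin));
}

// Returns the work_id (e.g. "llm.1004") only when the reply reports code 0.
std::optional<std::string> work_id_from_setup_reply(const std::string& reply) {
    const std::optional<int> code = reply_error_code(reply);
    if (!code || *code != 0) return std::nullopt;
    return plain_string_field(reply, "work_id");
}
```

For the failed attempt from the first post (code -5, "Model loading failed.") this returns no work_id, so the caller knows not to send a prompt.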

          Take the returned work_id value ("llm.xxxx") and send a prompt:

          {
          "request_id": "2",
          "work_id": "llm.xxxx",
          "action": "inference",
          "object": "llm.utf-8.stream",
          "data": {
          "delta": "What's ur name?",
          "index": 0,
          "finish": true
          }
          }
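Typing these two requests by hand gets error-prone once the prompt contains quotes or line breaks. Here is a hedged sketch of two helpers that build them as single-line JSON strings. The field names and fixed values are copied from the requests in this post; the helper names and the choice to emit everything on one line are my own assumptions.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Escapes characters that may not appear raw inside a JSON string.
std::string json_escape(const std::string& text) {
    std::string out;
    for (const char c : text) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (static_cast<unsigned char>(c) < 0x20) {
                    char buf[7];  // "\u" + 4 hex digits + terminator
                    std::snprintf(buf, sizeof buf, "\\u%04x",
                                  static_cast<unsigned>(static_cast<unsigned char>(c)));
                    out += buf;
                } else {
                    out += c;
                }
        }
    }
    return out;
}

// Builds the "setup" request that loads a model.
std::string make_setup_request(const std::string& request_id, const std::string& model,
                               const std::string& prompt, int max_token_len) {
    assert(max_token_len > 0);
    return std::string("{\"request_id\":\"") + json_escape(request_id)
         + "\",\"work_id\":\"llm\",\"action\":\"setup\",\"object\":\"llm.setup\""
         + ",\"data\":{\"model\":\"" + json_escape(model)
         + "\",\"response_format\":\"llm.utf-8.stream\",\"input\":\"llm.utf-8\""
         + ",\"enoutput\":true,\"max_token_len\":" + std::to_string(max_token_len)
         + ",\"prompt\":\"" + json_escape(prompt) + "\"}}";
}

// Builds an "inference" request for a work_id returned by setup (e.g. "llm.1004").
std::string make_inference_request(const std::string& request_id, const std::string& work_id,
                                   const std::string& text) {
    return std::string("{\"request_id\":\"") + json_escape(request_id)
         + "\",\"work_id\":\"" + json_escape(work_id)
         + "\",\"action\":\"inference\",\"object\":\"llm.utf-8.stream\""
         + ",\"data\":{\"delta\":\"" + json_escape(text)
         + "\",\"index\":0,\"finish\":true}}";
}
```

For example, make_inference_request("2", "llm.1004", "What's ur name?") produces the same request as above, just on one line.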

          Then you will see something like:

          {"created":1746846972,"data":{"delta":"I'm an","finish":false,"index":0},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846973,"data":{"delta":" artificial intelligence model","finish":false,"index":1},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846974,"data":{"delta":" known as L","finish":false,"index":2},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846974,"data":{"delta":"lama. L","finish":false,"index":3},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846975,"data":{"delta":"lama stands for","finish":false,"index":4},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846976,"data":{"delta":" \"Large Language","finish":false,"index":5},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846976,"data":{"delta":" Model Meta AI","finish":false,"index":6},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846977,"data":{"delta":".\"","finish":false,"index":7},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
          {"created":1746846977,"data":{"delta":"","finish":true,"index":8},"error":{"code":0,"message":""},"object":"llm.utf-8.stream","request_id":"2","work_id":"llm.1004"}
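The answer arrives in pieces: each line carries a "delta" fragment, and the last line has "finish":true with an empty delta. A minimal sketch of gluing the fragments back together follows, assuming one JSON object per line in the compact layout shown above. ReplyAssembler and the helper names are hypothetical, \uXXXX escapes are left undecoded, and a proper JSON library would be the sturdier choice.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Extracts and unescapes the value of "delta" from one reply line.
std::optional<std::string> extract_delta(const std::string& line) {
    const std::string needle = "\"delta\":\"";
    const std::size_t start = line.find(needle);
    if (start == std::string::npos) return std::nullopt;
    std::string out;
    for (std::size_t i = start + needle.size(); i < line.size(); ++i) {
        const char c = line[i];
        if (c == '"') return out;            // closing quote of the value
        if (c != '\\') { out += c; continue; }
        if (++i >= line.size()) break;       // dangling backslash
        switch (line[i]) {
            case 'n': out += '\n'; break;
            case 't': out += '\t'; break;
            case 'r': out += '\r'; break;
            case 'u': out += "\\u"; break;   // \uXXXX is kept as-is in this sketch
            default:  out += line[i];        // \" \\ and \/
        }
    }
    return std::nullopt;                     // value was never closed
}

// True when the line is marked as the last chunk of the stream.
bool is_final_chunk(const std::string& line) {
    return line.find("\"finish\":true") != std::string::npos;
}

// Collects the fragments of one streamed answer.
struct ReplyAssembler {
    std::string text;
    bool done = false;

    // Feeds one reply line; returns true once the final chunk has been seen.
    bool feed(const std::string& line) {
        assert(!done);  // start a fresh assembler for the next request
        if (const std::optional<std::string> delta = extract_delta(line)) text += *delta;
        if (is_final_chunk(line)) done = true;
        return done;
    }
};
```

Fed with the nine lines above, text would end up holding the whole sentence, starting with "I'm an artificial intelligence model known as Llama."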

          et voilà.

          • erictiquet

            The .ino code I used:

            #include <Arduino.h>
            #include <M5Unified.h>
            #include <WiFi.h>
            #include <WebServer.h>

            // ⚙️ WiFi configuration
            const char* ssid = "your ssid";
            const char* password = "your password";

            // UART2 for the LLM module
            HardwareSerial LLM(2); // GPIO16 = RX, GPIO17 = TX

            WebServer server(80);
            String lastSerialMessage = "";

            // 🔐 Simple escaping to avoid HTML display problems
            String htmlEscape(String text) {
              text.replace("&", "&amp;"); // must run first so the entities below are not escaped twice
              text.replace("<", "&lt;");
              text.replace(">", "&gt;");
              text.replace("\"", "&quot;");
              text.replace("'", "&#39;");
              return text;
            }

            // 💻 Dynamic HTML page
            String getHTMLPage() {
              String html = R"rawliteral(
            <!DOCTYPE html>
            <html>
            <head>
            <meta charset="UTF-8">
            <title>M5Core Web Serial</title>
            </head>
            <body>
            <h1>M5Core Web Serial</h1>
            <textarea id="msg" rows="15" cols="70" placeholder="Your message here..."></textarea><br>
            <button onclick="sendMessage()">Send</button>
            <button onclick="clearOutput()">Clear</button>
            <p><strong>Serial response:</strong></p>
            <pre id="lastMessage" style="background:#eee; padding:10px; border:1px solid #ccc;">%LAST_MESSAGE%</pre>

              <script>
                function sendMessage() {
                  const msg = document.getElementById("msg").value;
                  fetch("/send", {
                    method: "POST",
                    headers: { "Content-Type": "application/x-www-form-urlencoded" },
                    body: "msg=" + encodeURIComponent(msg)
                  }).then(response => response.text())
                    .then(text => {
                      document.getElementById("lastMessage").innerText = text;
                    });
                }
            
                function clearOutput() {
                  fetch("/clear").then(r => r.text()).then(txt => {
                    document.getElementById("lastMessage").innerText = "";
                  });
                }
            
                setInterval(() => {
                  fetch("/last").then(r => r.text()).then(txt => {
                    document.getElementById("lastMessage").innerText = txt;
                  });
                }, 2000);
              </script>
            </body>
            </html>
            

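            )rawliteral";
              // Substitute the %LAST_MESSAGE% placeholder (if the page contains one) with the escaped serial output.
              html.replace("%LAST_MESSAGE%", htmlEscape(lastSerialMessage));
              return html;
            }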

            void handleRoot() {
              server.send(200, "text/html", getHTMLPage());
            }

            void handleSend() {
              if (server.hasArg("msg")) {
                String msg = server.arg("msg");
                LLM.println(msg); // forward the JSON request to the LLM module
                lastSerialMessage = "Sent: " + msg;
                server.send(200, "text/plain", "Sent: " + msg);
              } else {
                server.send(400, "text/plain", "Missing 'msg' argument");
              }
            }

            void handleLast() {
              server.send(200, "text/plain", lastSerialMessage);
            }

            void handleClear() {
              lastSerialMessage = "";
              server.send(200, "text/plain", "Cleared");
            }

            void setup() {
              M5.begin();
              M5.Lcd.setTextSize(2);
              M5.Lcd.println("Initializing...");

              Serial.begin(115200); // USB link to the PC
              LLM.begin(115200, SERIAL_8N1, 16, 17); // RX2, TX2

              WiFi.begin(ssid, password);
              M5.Lcd.print("Connecting to WiFi");
              while (WiFi.status() != WL_CONNECTED) {
                delay(500);
                M5.Lcd.print(".");
              }

              M5.Lcd.println("\nConnected");
              M5.Lcd.println(WiFi.localIP());

              server.on("/", handleRoot);
              server.on("/send", HTTP_POST, handleSend);
              server.on("/last", handleLast);
              server.on("/clear", handleClear);
              server.begin();
              M5.Lcd.println("Web server running!");
            }

            void loop() {
              server.handleClient();

              // Forward bytes typed on the USB serial console to the LLM module.
              if (Serial.available()) {
                char c = Serial.read();
                LLM.write(c);
                Serial.print(c);
              }

              // Collect bytes coming back from the LLM module.
              if (LLM.available()) {
                char c = LLM.read();
                Serial.print(c);
                lastSerialMessage += c;
                M5.Lcd.print(c);

                // Keep only the most recent 20000 characters.
                if (lastSerialMessage.length() > 20000) {
                  lastSerialMessage = lastSerialMessage.substring(lastSerialMessage.length() - 20000);
                }
              }
            }

            • erictiquet

              [Image: screen capture M5core Webserial.png]
