K1 OH5.0 AI Build and Development Instructions
Revision History
Revision Version | Revision Date | Revision Description
001 | 2025-03-28 | Initial version
002 | 2025-04-12 | Format optimization
1. Prerequisites
Refer to the compilation documentation to complete system compilation and flashing: K1 OH5.0 Download, Compile, and Flash Instructions
1.1. ollama+deepseek Resource Preparation
Download link: Click to Download
deepseek-r1-distill-qwen-1.5b-q4_0.gguf
Modelfile
Ollama
deepseek-r1-distill-qwen-1.5b-q4_0.gguf
A compressed and optimized model file. It uses the GGUF format, which is designed for efficient inference and compression, so it runs well in resource-constrained environments such as embedded or mobile devices.
Modelfile
Defines how to configure and use the deepseek-r1-distill-qwen-1.5b-q4_0.gguf model file.
Ollama
Runs and manages various machine learning models. It supports multiple model versions and formats, including the DeepSeek series. With Ollama, you can easily deploy, run, and manage the DeepSeek-R1-Distill-Qwen-1.5B-Q4_0.gguf model.
1.2. Environment and Tool Preparation
- One MUSE Paper device with power supply
- Type-C cable (for flashing and hdc connection)
- Windows-side hdc (for transferring files between PC and board)
- IDE (DevEco 4.0)
- K1 OH5.0 build environment
2. Install ollama+deepseek-r1-1.5b
To help developers get started quickly, a one-click installation package is provided.
2.1. Connect Windows and the MUSE Paper with a Type-C Cable
Ensure that hdc can connect to the MUSE Paper:
D:\>hdc list targets
0123456789ABCDEF
2.2. Download the installation package to your Windows PC and extract it to any location
Download link: Click to Download (this can be skipped if the package was already downloaded above). The package includes the installer, the programs needed for secondary development, and the development manuals.
2.2.1. One-click Automatic Installation of deepseek
Double-click the installation script setup_ohos_ollama_env_v1.0.bat (circled in red in the figure) to install all LLM dependencies and applications for OH.
2.2.2. Run and Debug
After installation, the application can perform LLM Q&A.
- Run ollama; if output like the following is displayed, ollama is working properly:
- If the model list is empty, the model is not installed and needs to be loaded.
- Load the large model.
- Check the model list again.
- Run the large model and chat with it from the command line.
- Open the OH HAP application
- Test the HAP user interface
3. Secondary Development
3.1. Development Environment Preparation
- OH system development: VSCode + Ubuntu Linux server
- HAP development: DevEco Studio 4.1 (deveco-studio-4.1.0.400.exe)
- Required development files: Click to Download (can be skipped if already downloaded above)
  - chatgpt: OH chatgpt lib code + testNapi code
  - deepseek: demo HAP code
3.2. OH System Build
Place the chatgpt folder into the oh5.0\foundation\communication\chatgpt directory and configure the module build settings; you can then compile the corresponding library. This library provides the interface through which the upper-layer HAP accesses ollama.
3.2.1. Edit Development Code
Modify the code as needed.
3.2.2. Compile the Image
Command:
./build.sh --product-name musepaper2 --ccache --prebuilt-sdk
The two libraries related to this project are:
- libchatgpt_napi.z.so
- libchatgpt_core.z.so
The newly compiled image contains these two .so files; they can be flashed as part of the image or pushed with the hdc file send command, as follows:
hdc file send libchatgpt_napi.z.so /lib64/module/
hdc file send libchatgpt_core.z.so /lib64/
3.3. HAP Test Project
The project in OH5.0\foundation\communication\chatgpt\testNapi is mainly a reference for secondary developers building their own AI large-model applications. Open it with DevEco Studio 4.1 and compile it to generate the testNapi HAP needed for testing; this HAP can be used both to test and to help develop your own LLM applications.
3.4. Development and Debugging
3.4.1. View Logs
- hdc shell hilog | grep Chatgpt
- hdc shell hilog | grep Index
- Set ollama debug:
  - export OLLAMA_DEBUG=1 // enable log output
  - export OLLAMA_HOST='0.0.0.0' // allow external access to ollama
02-28 12:35:58.260 4086 4086 I C01650/ChatGPT: ChatGPT instance created
02-28 12:35:58.260 4086 4086 I C01650/ChatGPT: Generating streaming response for input: who are you
02-28 12:35:58.261 4086 7595 I C01650/ChatGPT: Request payload: {"model":"deepseek-r1-1.5b","prompt":"who are you","stream":true}
02-28 12:35:58.262 4086 7595 I C01650/ChatGPT: Making request to Ollama API at http://localhost:11434/api/generate
02-28 12:35:58.266 4086 7595 I C01650/ChatGPT: CURL request completed after 1 attempts
02-28 12:35:58.267 4086 7595 I C01650/ChatGPT: Request completed successfully
3.5. Demo HAP Project
Similarly, use DevEco Studio 4.1 Release to open the corresponding code project and compile the demo HAP.
4. oh+ollama+deepseek Design Description
4.1. Architecture
- Frontend Layer (ArkTS)
  - UI and business logic
- Service Layer (ArkTS)
  - Cross-NAPI callback implementation (ArkTS ↔ OS native)
  - Callback registration and management
  - Business logic, API interaction, and data processing
- NAPI Layer
  - Interface between JavaScript/TypeScript and C++
  - Parameter parsing and passing
  - Callback registration
- C++ Implementation Layer
  - Core functionality and native API interaction
  - napi_async_work implementation (prevents blocking the app main thread and the resulting freeze/crash)
  - Cross-NAPI callback implementation (ArkTS ↔ OS native)
  - ollama integration
  - DeepSeek integration
4.2. LLM (chatgpt) Subsystem Component Design & Implementation
- ChatGPTService uses a singleton
- The C++ ChatGPT class uses a singleton
- Asynchronous processing (see the sketch after this list)
  - The NAPI layer uses napi_async_work
  - The C++ layer uses std::thread
- Smart pointers replace raw new, preventing memory leaks and increasing robustness
- The UI layer uses real-time callbacks
- Stream processing
- Detailed log tracking
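The singleton plus std::thread pattern described above could look roughly like the following. This is a minimal sketch with illustrative names, not the actual chatgpt.cpp source; the real implementation also wires the chunks back into the NAPI callback references.

#include <functional>
#include <memory>
#include <string>
#include <thread>

class ChatGPT {
public:
    static ChatGPT& GetInstance()   // singleton access
    {
        static ChatGPT instance;
        return instance;
    }

    void GenerateResponseStream(const std::string& input,
                                std::function<void(const std::string&)> onChunk,
                                std::function<void(const std::string&)> onDone)
    {
        // Heap state is held by a shared_ptr instead of a raw new/delete pair
        auto request = std::make_shared<std::string>(input);

        // The blocking HTTP/stream work runs on a worker thread so the caller
        // (and ultimately the app main thread) is never blocked
        std::thread([request, onChunk, onDone]() {
            // ... perform the streaming request, calling onChunk per chunk ...
            onDone("done");
        }).detach();
    }

private:
    ChatGPT() = default;
};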
4.2.1. Call Flow
- User enters text in the UI → triggers the onClick event
- ChatGPTService calls the NAPI module's generateResponse
- The NAPI layer converts parameters and creates async work
- The C++ layer executes the HTTP request and returns the result via callback
- The result is returned to the frontend through the callback chain, and the frontend renders it in real time
4.2.2. chatgpt_napi.cpp Design
Data structure:
struct AsyncCallbackData {
    napi_env env;                      // NAPI environment
    napi_ref streamCallbackRef;        // Stream callback reference
    napi_ref completionCallbackRef;    // Completion callback reference
    std::string chunk;                 // Data chunk
    std::string result;                // Result
    napi_value resourceName;           // Resource name
};
Callback handling
- StreamCallbackComplete: handles the stream data callback; runs whenever a data chunk arrives (see the sketch after this list)
  - Get the callback function reference
  - Create the parameter array
  - Call the JavaScript callback function
  - Clean up resources
- CompletionCallbackComplete: handles the completion callback; runs when generation completes
  - Similar to the stream callback flow
  - Additionally cleans up all callback references
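A minimal sketch of what the stream-data complete handler might look like, assuming the AsyncCallbackData struct above; the function name matches the description, but the body is illustrative rather than the actual chatgpt_napi.cpp code.

#include <string>
#include "napi/native_api.h"

static void StreamCallbackComplete(napi_env env, napi_status status, void* data)
{
    auto* cbData = static_cast<AsyncCallbackData*>(data);

    // Get the callback function reference
    napi_value callback = nullptr;
    napi_get_reference_value(env, cbData->streamCallbackRef, &callback);

    // Create the parameter array (one string argument: the data chunk)
    napi_value chunk = nullptr;
    napi_create_string_utf8(env, cbData->chunk.c_str(), NAPI_AUTO_LENGTH, &chunk);
    napi_value argv[1] = { chunk };

    // Call the JavaScript callback function
    napi_value undefined = nullptr;
    napi_get_undefined(env, &undefined);
    napi_value result = nullptr;
    napi_call_function(env, undefined, callback, 1, argv, &result);

    // Clean up per-chunk resources here (omitted; depends on how the data is owned)
}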
Main interface function
napi_value GenerateResponse(napi_env env, napi_callback_info info) {
    // Get parameters
    // Create callback references
    // Set up async work
    // Call the native method
}
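As a rough illustration of the skeleton above, the parameter handling and async-work setup might look like the following. The helper names (ExecuteWork, CompleteWork) are assumptions for this sketch, and the AsyncCallbackData struct defined earlier is assumed to be in scope; memory is freed in the complete callback (omitted here).

#include "napi/native_api.h"

// Worker-thread part of the async work: must not call any napi_* functions
static void ExecuteWork(napi_env env, void* data)
{
    // e.g. kick off ChatGPT::GetInstance().GenerateResponseStream(...)
}

// Back on the JS thread: deliver results through the stored callback references
static void CompleteWork(napi_env env, napi_status status, void* data)
{
}

static napi_value GenerateResponseSketch(napi_env env, napi_callback_info info)
{
    // Get parameters: input string, stream callback, completion callback
    size_t argc = 3;
    napi_value argv[3] = { nullptr };
    napi_get_cb_info(env, info, &argc, argv, nullptr, nullptr);

    // Create callback references so the JS functions survive until the work completes
    auto* data = new AsyncCallbackData{};
    data->env = env;
    napi_create_reference(env, argv[1], 1, &data->streamCallbackRef);
    napi_create_reference(env, argv[2], 1, &data->completionCallbackRef);

    // Set up and queue the async work
    napi_value resourceName = nullptr;
    napi_create_string_utf8(env, "GenerateResponse", NAPI_AUTO_LENGTH, &resourceName);
    napi_async_work work = nullptr;
    napi_create_async_work(env, nullptr, resourceName, ExecuteWork, CompleteWork, data, &work);
    napi_queue_async_work(env, work);

    napi_value undefined = nullptr;
    napi_get_undefined(env, &undefined);
    return undefined;
}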
Module initialization
napi_value Init(napi_env env, napi_value exports) {
    // Register module methods
    napi_property_descriptor desc[] = {
        { "generateResponse", nullptr, GenerateResponse, nullptr, nullptr, nullptr,
          napi_default, nullptr }
    };
    napi_define_properties(env, exports, 1, desc);
    return exports;
}
NAPI_MODULE(chatgpt_napi, Init)
Code flow:
- Module initialization:
NAPI_MODULE(chatgpt_napi, Init)           // Register module
↓
Init(napi_env env, napi_value exports)    // Initialization function
↓
napi_define_properties                    // Register generateResponse method
- ChatGPT initialization:
ChatGPT::ChatGPT()
↓
std::call_once(initFlag, [this]() {
    InitializeCurl()                      // CURL global initialization
})
- UI layer trigger:
// Click event in Index.ets
this.chatGPTService.generateResponseStream(
    this.userInput,
    (chunk: string) => { this.response += chunk },
    (result: string) => { this.isLoading = false }
)
- Service layer processing:
// ChatGPTService.ets
public generateResponseStream(input: string, streamCallback, completionCallback): void {
    this.nativeChatGPT.generateResponse(input, streamCallback, completionCallback)
}
- NAPI layer conversion:
// chatgpt_napi.cpp
napi_value GenerateResponse(napi_env env, napi_callback_info info) {
    // Parameter conversion
    // Create async work
    OHOS::Communication::ChatGPT::GetInstance().GenerateResponseStream(
        input, streamCallback, completionCallback);
}
- C++ core implementation:
// chatgpt.cpp
void ChatGPT::GenerateResponseStream(
    const std::string& input,
    StreamCallback streamCallback,
    CompletionCallback completionCallback) {
    // Execute HTTP request
    // Handle stream response
}
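The "Execute HTTP request" step above could be implemented with libcurl roughly as follows, assuming the endpoint and payload shown in the hilog output in section 3.4.1; the function and type names are illustrative, not the actual chatgpt.cpp source.

#include <curl/curl.h>
#include <functional>
#include <string>

using StreamCallback = std::function<void(const std::string&)>;

// libcurl write callback: each chunk of the streamed response lands here
static size_t OnData(char* ptr, size_t size, size_t nmemb, void* userdata)
{
    auto* cb = static_cast<StreamCallback*>(userdata);
    (*cb)(std::string(ptr, size * nmemb));   // forward the raw chunk upward
    return size * nmemb;
}

bool StreamGenerate(const std::string& prompt, StreamCallback onChunk)
{
    CURL* curl = curl_easy_init();
    if (curl == nullptr) {
        return false;
    }

    // Payload mirrors the request shown in the log output above
    // (note: the prompt is not JSON-escaped here; a real implementation should escape it)
    std::string body = R"({"model":"deepseek-r1-1.5b","prompt":")" + prompt + R"(","stream":true})";

    struct curl_slist* headers = curl_slist_append(nullptr, "Content-Type: application/json");
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:11434/api/generate");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, OnData);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &onChunk);

    CURLcode res = curl_easy_perform(curl);   // blocks until the stream ends

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return res == CURLE_OK;
}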
5. FAQ
5.1. Bind CPU-intensive processes so that inference CPU resources are not monopolized
taskset -p 240 $(pidof render_service)
taskset -p 240 $(pidof com.example.deepseek)
taskset -p 240 $(pidof com.example.testnapi)
Parameter description:
240 (hex 0xf0, binary 11110000) means CPUs 4-7
Command description:
taskset -p 240 $(pidof render_service)
pidof render_service: finds the process ID (PID) of the render_service process
taskset -p 240 [PID]: binds the process with that PID to the CPU affinity mask 240 (binary 11110000, i.e., CPUs 4-7)
For productization, you can call the sched_setaffinity() function to set CPU affinity (a usage sketch follows the declaration below):
int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);
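A minimal sketch of setting the same CPU 4-7 affinity programmatically; the function name and error handling are illustrative, and the call needs the appropriate permission for the target PID.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int BindToCpus4To7(pid_t pid)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // CPUs 4-7, equivalent to mask 0xf0 (binary 11110000)
    for (int cpu = 4; cpu <= 7; cpu++) {
        CPU_SET(cpu, &mask);
    }
    // pid == 0 means "the calling process"
    if (sched_setaffinity(pid, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}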
5.2. How to export ollama logs
5.3. MUSE Paper frequently turns off the screen
Set the screen to stay on
power-shell setmode 602
5.4. ollama cannot run
The dynamic loader ld-linux-riscv64-lp64d.so.1 may be missing:
/lib/ld-linux-riscv64-lp64d.so.1
Copy this file to the corresponding directory on OHOS and grant execute permission: chmod +x /lib/ld-linux-riscv64-lp64d.so.1
5.5. Not enough space
Solution: Link the /.ollama directory to /data/deepseek/.ollama
ln -s /data/deepseek/.ollama /.ollama
5.6. How to input and display Chinese in Windows command line for debugging
Set CMD to support Chinese (in the Windows console).
To enable Chinese input and display in the CMD console, run:
chcp 65001
chcp stands for "Change Code Page" and is used to change the current console code page; 65001 selects UTF-8 encoding.
After executing this command, the CMD window switches to UTF-8 encoding, and Chinese can be displayed and entered normally.