Missing documentation for real-time NPU inference + video streaming with overlay (YOLO use case)

2025-11-02 12:17

Hello Rockchip / Luckfox team,
I have an RV1106-based Luckfox Pico board, and I’ve been following your official documentation
(Rockchip RV1106/RV1103 Developer Guide - Linux IPC Sample).
However, there is currently no example or documentation covering the most basic use case for this platform:
running real-time AI inference (YOLO, RKNN) on the NPU while streaming the video with bounding boxes drawn on top.
This is exactly what the RV1106 + RKNN hardware is designed for, and yet the documentation only covers
basic video capture and encoding samples (sample_vi, sample_venc, sample_demo_vi_venc),
without any clear integration with the RKNN NPU or overlay pipeline.
Right now, developers are left guessing how to:
1. Capture video frames from VI/ISP and feed them to the NPU for inference.
2. Parse the RKNN outputs and draw bounding boxes (via RGN/TDE overlay).
3. Re-encode and stream the annotated video (RTSP, HTTP, etc.).
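To make the middle step concrete, here is a minimal sketch of the arithmetic involved, written in plain Python only to illustrate the idea. It assumes the model was fed a square, letterboxed input (640x640 is a common YOLO choice, not something confirmed for this SDK) and that the frame is NV12, so a box can be drawn by writing into the Y plane. It draws on the CPU instead of using RGN/TDE, and it calls no Rockchip API; `Box`, `unletterbox`, and `draw_box_luma` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x1: int
    y1: int
    x2: int
    y2: int


def unletterbox(box, model_size, frame_w, frame_h):
    """Map a box from letterboxed model-input pixels back to frame pixels."""
    # Letterboxing scales the frame uniformly and pads the shorter side.
    scale = min(model_size / frame_w, model_size / frame_h)
    pad_x = (model_size - frame_w * scale) / 2
    pad_y = (model_size - frame_h * scale) / 2

    def clamp(value, upper):
        return max(0, min(int(round(value)), upper))

    return Box(
        clamp((box.x1 - pad_x) / scale, frame_w - 1),
        clamp((box.y1 - pad_y) / scale, frame_h - 1),
        clamp((box.x2 - pad_x) / scale, frame_w - 1),
        clamp((box.y2 - pad_y) / scale, frame_h - 1),
    )


def draw_box_luma(y_plane, stride, box, thickness=2, value=255):
    """Draw a rectangle outline into an NV12 Y plane, in place.

    Assumes the box already lies inside the frame (unletterbox clamps it).
    """
    for t in range(thickness):
        top, bottom = box.y1 + t, box.y2 - t
        left, right = box.x1 + t, box.x2 - t
        if top > bottom or left > right:
            break  # box too small for another ring
        for x in range(left, right + 1):
            y_plane[top * stride + x] = value
            y_plane[bottom * stride + x] = value
        for y in range(top, bottom + 1):
            y_plane[y * stride + left] = value
            y_plane[y * stride + right] = value
```

For a 1920x1080 frame and a 640x640 input, the scale is 1/3 and the vertical padding is 140 model pixels, so a detection at (160, 200, 480, 440) in model space lands at (480, 180, 1440, 900) in the frame.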
This is the core feature of any embedded vision system,
and it should have been the very first complete sample provided with the SDK.
Please provide either:
- a minimal example showing how to run YOLO (RKNN) inference on the NPU, overlay the boxes, and stream the annotated video in real time; or
- at least a clear description of how to connect the RKNN and RGN/VENC modules within the same pipeline.
Without that, the boards are practically unusable for real computer-vision applications.
Thank you,
Olivier

2025-11-04 2:25

Hello, please refer to https://wiki.luckfox.com/Luckfox-Pico-Pro-Max/MPI. We have provided a simple example there. If you need a more performance-oriented example, you can refer to the source code in $sdk/project/app/rkipc/rkipc. It uses rockiva to call the NPU for object detection (the Rockiva algorithm SDK requires purchasing an authorization from Rockchip).
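
On the question of how the modules connect within one pipeline, one common arrangement (an assumption here, not taken from the rkipc source or the wiki example) is to encode every frame and run inference only as often as the NPU can sustain, with the overlay step always drawing the most recent result. The sketch below simulates that in a single loop in plain Python. `LatestDetections`, `run_pipeline`, and `detect` are hypothetical names, and no Rockchip API is called.

```python
import threading


class LatestDetections:
    """Thread-safe slot that holds only the newest inference result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame_id = -1
        self._boxes = []

    def publish(self, frame_id, boxes):
        with self._lock:
            self._frame_id = frame_id
            self._boxes = list(boxes)

    def snapshot(self):
        with self._lock:
            return self._frame_id, list(self._boxes)


def run_pipeline(num_frames, infer_every, detect):
    """Simulate the loop: every frame is 'encoded', every Nth is inferred.

    Returns one (frame_id, source_frame_id, boxes) tuple per frame, where
    source_frame_id is the frame the overlaid boxes were computed from.
    """
    latest = LatestDetections()
    encoded = []
    for frame_id in range(num_frames):
        if frame_id % infer_every == 0:
            # Stand-in for NPU inference plus post-processing.
            latest.publish(frame_id, detect(frame_id))
        source_id, boxes = latest.snapshot()
        # Stand-in for drawing the overlay and handing the frame to the encoder.
        encoded.append((frame_id, source_id, boxes))
    return encoded
```

With `infer_every=3`, frames 0 to 2 carry the boxes computed from frame 0 and frames 3 to 5 carry those from frame 3. On real hardware the inference would typically run in its own thread, which is why the slot is guarded by a lock; the trade-off is that boxes can lag the video by a few frames.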