Performance Testing of YOLOv10 Series Models on the Qualcomm QCS6490 Platform


2025-07-19 10:36:27

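Preface

As edge AI computing and computer vision converge, end devices increasingly need real-time perception and on-device decision making, and progress in this area depends on pairing capable hardware with efficient AI models. The Qualcomm QCS6490 is a leading edge computing platform whose overall performance makes it a good fit for many smart devices. It is built on a 6nm process and carries an octa-core Kryo 670 CPU (four high-performance Cortex-A78 cores and four power-efficient Cortex-A55 cores) clocked at up to 2.7GHz while keeping power consumption in check. Its 6th-generation Qualcomm AI Engine, together with the Hexagon processor and fused AI accelerator, delivers up to 12 TOPS of AI compute for real-time inference of complex models. The QCS6490 also supports enterprise-grade Wi-Fi 6/6E with multi-gigabit data rates and very low latency, and its triple ISP (up to 5 concurrent cameras and 192MP image capture) provides a solid base for processing and transmitting multi-source visual data.

These strengths matter most in robotics and drones. In robotics, the platform can process data from sensors such as lidar and cameras in real time and combine SLAM algorithms with AI models for obstacle avoidance and dynamic path planning. In industrial settings it can drive vision systems for high-speed quality inspection on production lines, where defect detection models identify surface flaws and dimensional deviations within milliseconds. In drones, the QCS6490 enables workloads such as power line inspection, where multiple cameras capture transmission line images synchronously and object detection and defect recognition models run in real time to locate hazards such as broken insulators and broken conductor strands. In agricultural crop protection it can process aerial field images quickly, identify crop growth status and pest or disease areas, and work with the flight controller to apply variable-rate spraying.

YOLOv10 is a recent family of object detection models that builds on the strengths of the YOLO series and improves performance through changes to the network architecture and training strategy. It introduces a real-time detection approach that eliminates non-maximum suppression (NMS) and optimizes various model components, which reduces computational overhead and improves both detection efficiency and accuracy. This makes it well suited to latency-sensitive scenarios such as autonomous driving, security monitoring, and robot navigation.

Given the hardware strengths of the QCS6490 in edge computing and the position of YOLOv10 in vision tasks, how well the two work together has a direct impact on the capabilities of robots, drones, and similar devices. This article focuses on performance testing of the YOLOv10 series on the Qualcomm QCS6490 platform. Through systematic experiments and analysis it aims to show the inference speed, accuracy, and resource usage of YOLOv10 models of different scales on this platform, provide data for model selection and deployment optimization on edge devices, and support the adoption of vision AI in scenarios such as robot interaction and drone inspection.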

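Qualcomm QCS6490 hardware overview

See the CSDN blog post "深度解析 QCS6490：硬件性能全揭秘" (In-Depth Analysis of the QCS6490: Hardware Performance Revealed).

YOLOv10 model performance metrics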

YOLOv10 series performance on the QCS6490 (model input size 640×640)
Model	CPU FP32 latency	CPU FP32 FPS	NPU (QNN 2.31) INT8 latency	NPU (QNN 2.31) INT8 FPS
YOLOv10n	207.12 ms	4.83 FPS	5.53 ms	180.83 FPS
YOLOv10s	492.82 ms	2.03 FPS	7.98 ms	125.31 FPS
YOLOv10m	1219.93 ms	0.82 FPS	16.41 ms	60.94 FPS
YOLOv10b	1766.49 ms	0.57 FPS	20.24 ms	49.41 FPS
YOLOv10l	2187.53 ms	0.46 FPS	25.17 ms	39.73 FPS
YOLOv10x	2779.64 ms	0.36 FPS	38.5 ms	25.97 FPS
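
The FPS columns follow directly from the latency columns (FPS = 1000 / latency in ms), and the same numbers give the NPU speedup over the CPU. The following is a minimal sketch that recomputes both from the table values; the dictionary and function names are illustrative only.

```python
# Latencies in milliseconds copied from the table above: (CPU FP32, NPU INT8).
LATENCY_MS = {
    "YOLOv10n": (207.12, 5.53),
    "YOLOv10s": (492.82, 7.98),
    "YOLOv10m": (1219.93, 16.41),
    "YOLOv10b": (1766.49, 20.24),
    "YOLOv10l": (2187.53, 25.17),
    "YOLOv10x": (2779.64, 38.5),
}


def fps(latency_ms: float) -> float:
    """Frames per second for a given per-frame latency in milliseconds."""
    return 1000.0 / latency_ms


def speedup(cpu_ms: float, npu_ms: float) -> float:
    """How many times faster the NPU run is than the CPU run."""
    return cpu_ms / npu_ms


if __name__ == "__main__":
    for name, (cpu_ms, npu_ms) in LATENCY_MS.items():
        print(
            f"{name}: CPU {fps(cpu_ms):.2f} FPS, "
            f"NPU {fps(npu_ms):.2f} FPS, "
            f"speedup {speedup(cpu_ms, npu_ms):.1f}x"
        )
```

Note that the two sides differ in precision as well as hardware (FP32 on the CPU, INT8 on the NPU), so the ratio describes the two deployment configurations rather than the processors alone.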

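The YOLOv10 series weights in .pt format can be downloaded from the link. Models with other input sizes can be converted with AIMO and tested by changing the model size in the reference code below (model_size; the input_shape variable in main).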
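
When the input size changes, the hard-coded QNN output shape in the reference code, (1, 8400, 84), changes with it. The sketch below shows where 8400 and 84 come from, assuming the usual YOLO detection strides of 8, 16 and 32 and the 80 COCO classes. Both are assumptions about the exported model rather than values reported by AIMO, so check the real output shape in Netron.

```python
def num_predictions(height: int, width: int, strides=(8, 16, 32)) -> int:
    """Total number of prediction cells over all detection scales.

    Assumes one prediction per feature-map cell at each stride.
    """
    total = 0
    for s in strides:
        total += (height // s) * (width // s)
    return total


def head_output_shape(height: int, width: int, num_classes: int = 80):
    """Expected (batch, predictions, 4 box values + class scores) shape."""
    return (1, num_predictions(height, width), 4 + num_classes)


if __name__ == "__main__":
    # 80*80 + 40*40 + 20*20 = 8400 cells for a 640x640 input
    print(head_output_shape(640, 640))
    print(head_output_shape(320, 320))
```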

(1) Convert the .pt model to ONNX format

Step 1: Upgrade pip to version 25.1.1

python3.10 -m pip install --upgrade pip
pip -V
aidlux@aidlux:~/aidcode$ pip -V
pip 25.1.1 from /home/aidlux/.local/lib/python3.10/site-packages/pip (python 3.10)

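Step 2: Install ultralytics and onnx

pip install ultralytics onnx

Step 3: Add the yolo command to PATH

Method 1: Add it temporarily (takes effect immediately)

Run the following command in the terminal to add ~/.local/bin to PATH for the current session:

export PATH="$PATH:$HOME/.local/bin"

Note: this applies only to the current terminal session and is lost when the terminal is closed.
Verify: run yolo --version. If a version number is printed, the command is available.

Method 2: Add it permanently

echo 'export PATH="$PATH:$HOME/.local/bin"' >> ~/.bashrc
source ~/.bashrc  # apply the change immediately

Verify: run yolo --version. If a version number is printed, the command is available.

The ultralytics version installed in the test environment is 8.3.152.

Tip: if a user or group permission warning appears, it can be ignored, because the yolo command falls back to a temporary directory. Alternatively, change the ownership with the commands below, after which the warning no longer appears: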

sudo chown -R aidlux:aidlux ~/.config/
sudo chown -R aidlux:aidlux ~/.config/Ultralytics

The warning you may see looks like this:

WARNING ⚠️ user config directory '/home/aidlux/.config/Ultralytics' is not writeable, defaulting to '/tmp' or CWD.Alternatively you can define a YOLO_CONFIG_DIR environment variable for this path.
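
The warning itself names a third option, the YOLO_CONFIG_DIR environment variable. The following is a hedged sketch that points it at a writable directory before ultralytics is imported. The helper name and the directory name are arbitrary choices for illustration.

```python
import os
import tempfile
from pathlib import Path


def use_writable_config_dir(base=None) -> Path:
    """Create a writable directory and expose it through YOLO_CONFIG_DIR.

    `base` defaults to the system temp directory; the sub-directory name
    "ultralytics_config" is a hypothetical choice, not a required name.
    """
    base_dir = Path(base) if base else Path(tempfile.gettempdir())
    config_dir = base_dir / "ultralytics_config"
    config_dir.mkdir(parents=True, exist_ok=True)
    os.environ["YOLO_CONFIG_DIR"] = str(config_dir)
    return config_dir


config_dir = use_writable_config_dir()
print(f"YOLO_CONFIG_DIR={config_dir}")

# Import ultralytics only after the variable is set, for example:
# from ultralytics import YOLO
```

Setting the variable inside the script only affects that process. To cover the yolo command line tool as well, export the variable in the shell instead.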

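Step 4: Convert the YOLOv10 series .pt models to ONNX format

Create a Python file for converting and exporting the model. Any name works; convert_yolov10.py is used below: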

from ultralytics import YOLO

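# Load the .pt model file from the same directory
model = YOLO('yolov10n.pt')  # replace with the actual model file name

# ONNX export parameters
export_params = {
    'format': 'onnx',
    'opset': 12,          # recommended opset version
    'simplify': True,     # enable model simplification
    'dynamic': False,     # fixed input size
    'imgsz': 640,         # standard input size
    'half': False         # keep FP32 precision
}

# Run the conversion; the result is saved in the same directory
model.export(**export_params)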

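Run the script to export the .pt model as an ONNX model.

python convert_yolov10.py  # the Python file created above

Tip: for YOLOv10s, YOLOv10m, YOLOv10b, YOLOv10l and YOLOv10x, replace the model file name in the code with the corresponding .pt file.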
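
Instead of editing the file name by hand for each variant, the export can be looped over all six scales. The following is a minimal sketch that reuses the export parameters from Step 4. The helper names and the assumption that the weights are named yolov10n.pt, yolov10s.pt and so on in one directory are illustrative.

```python
from pathlib import Path

# Scale suffixes of the YOLOv10 series listed in this article.
VARIANTS = ("n", "s", "m", "b", "l", "x")


def weight_names(variants=VARIANTS):
    """Assumed weight file names, e.g. yolov10n.pt (adjust to your files)."""
    return [f"yolov10{v}.pt" for v in variants]


def export_all(weights_dir="."):
    """Export every weight file found in weights_dir to ONNX."""
    # Imported lazily so the helpers above work without ultralytics installed.
    from ultralytics import YOLO

    exported = []
    for name in weight_names():
        path = Path(weights_dir) / name
        if not path.exists():
            print(f"skip {path}: file not found")
            continue
        model = YOLO(str(path))
        # Same parameters as the single-model script in Step 4.
        onnx_path = model.export(
            format="onnx", opset=12, simplify=True,
            dynamic=False, imgsz=640, half=False,
        )
        exported.append(onnx_path)
    return exported


if __name__ == "__main__":
    print(export_all("."))
```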

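(2) Use AIMO to convert the ONNX model into a format that runs on the Qualcomm NPU

Step 1: Choose model optimization, select ONNX as the model format, and upload the model

Step 2: Choose the chip model and the target framework; here we select QCS6490 + QNN 2.31

Step 3: Click to view the model, inspect its structure in Netron, and fill in the input and output fields

Use Netron to inspect the ONNX model structure and choose where to cut the graph

As shown in the screenshot, click the Transpose node and copy the name listed under OUTPUTS: /model.23/Transpose_output_0

Fill in the fields marked with the red box in the screenshot and leave everything else unchanged. Make sure automatic quantization is enabled. For more AIMO operations, see the user guide or the AIMO platform

Step 4: Submit the job. When the conversion finishes, download the target model and unpack it; the .bin.aidem file inside is the model file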

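QNN test code

The code below implements object detection with the YOLOv10 model and consists of the following parts:

Data preparation:
- The classes list defines the names of the 80 COCO classes
- The letterbox function resizes the image and adds border padding to match the model input size

Image utilities:
- The Colors class generates bounding box colors for the different classes
- The rescale_coords function maps box coordinates from the model input size back to the original image size

Model processing:
- The preprocess function prepares the input image, including resizing and normalization
- The postprocess function handles the model output, filtering out low-confidence predictions and rescaling coordinates

Inference engines:
- The qnn_yolov10 class wraps model inference on top of the QNN SDK
- The onnx_yolov10 class runs the ONNX post-processing model (post_pro_1.onnx) with ONNX Runtime on the output of the cut QNN model

Main program:
- The main function implements the full inference flow: load the model, preprocess the image, run inference, post-process, and visualize the results
- The parser_args function parses command-line arguments, so the model path, input image and number of inference runs can be customized

Combined with the QNN SDK, the code runs YOLOv10 inference and object detection efficiently.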
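
The coordinate mapping that letterbox and rescale_coords perform in the listing can be checked in isolation. The following is a small NumPy sketch, assuming a hypothetical 810x1080 (width x height) image and the 640x640 model input. The helper names are illustrative and are not part of the listing.

```python
import numpy as np


def letterbox_params(image_shape, input_shape):
    """Scale factor and per-side padding for aspect-preserving resize."""
    img_h, img_w = image_shape
    in_h, in_w = input_shape
    scale = min(in_w / img_w, in_h / img_h)
    pad_w = (in_w - img_w * scale) / 2
    pad_h = (in_h - img_h * scale) / 2
    return scale, pad_w, pad_h


def to_model_coords(boxes, image_shape, input_shape):
    """Map [x1, y1, x2, y2] boxes from image space to model input space."""
    scale, pad_w, pad_h = letterbox_params(image_shape, input_shape)
    out = np.asarray(boxes, dtype=np.float64).copy()
    out[:, [0, 2]] = out[:, [0, 2]] * scale + pad_w
    out[:, [1, 3]] = out[:, [1, 3]] * scale + pad_h
    return out


def to_image_coords(boxes, image_shape, input_shape):
    """Inverse mapping, the same arithmetic as rescale_coords before rounding."""
    scale, pad_w, pad_h = letterbox_params(image_shape, input_shape)
    out = np.asarray(boxes, dtype=np.float64).copy()
    out[:, [0, 2]] = (out[:, [0, 2]] - pad_w) / scale
    out[:, [1, 3]] = (out[:, [1, 3]] - pad_h) / scale
    out[:, [0, 2]] = np.clip(out[:, [0, 2]], 0, image_shape[1])
    out[:, [1, 3]] = np.clip(out[:, [1, 3]], 0, image_shape[0])
    return out


if __name__ == "__main__":
    image_shape, input_shape = (1080, 810), (640, 640)  # (height, width)
    boxes = np.array([[100.0, 200.0, 300.0, 400.0]])
    in_model = to_model_coords(boxes, image_shape, input_shape)
    print(letterbox_params(image_shape, input_shape))
    print(in_model, to_image_coords(in_model, image_shape, input_shape))
```

For this portrait image the height limits the scale, so all padding falls on the left and right sides and none on the top or bottom.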

import time
import numpy as np
import cv2
import os
import aidlite
import argparse
import onnxruntime

# COCO class names (80 classes)
classes=[
        "person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat",
        "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat",
        "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack",
        "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball",
        "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket",
        "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
        "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair",
        "sofa", "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse",
        "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator",
        "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"
    ]

def letterbox(
        im,
        new_shape,
        color=(114, 114, 114),
        auto=False,
        scaleFill=False,
        scaleup=True,
        stride=32,
):
    """
    Resize and pad the image while respecting the stride constraint
    Args:
        im (array): input image (height, width, 3)
        new_shape (tuple): target size (height, width)
        color (tuple): padding color
        auto (bool): pad only to the minimum rectangle
        scaleFill (bool): stretch the image instead of padding
        scaleup (bool): allow enlarging the image
        stride (int): stride used when auto is enabled
    Returns:
        im (array): processed image (height, width, 3)
        ratio (tuple): (w_ratio, h_ratio)
        (dw, dh) (tuple): (w_padding, h_padding) per side
    """
    shape = im.shape[:2]  # current image shape [height, width]
    if isinstance(new_shape, int):  # a single integer means a square target
        new_shape = (new_shape, new_shape)

    # Scale ratio (new size / old size)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only shrink, never enlarge (for better validation mAP)
        r = min(r, 1.0)

    # Compute the padding
    ratio = r, r  # width and height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # unpadded size [width, height]
    dw, dh = (
        new_shape[1] - new_unpad[0],
        new_shape[0] - new_unpad[1],
    )  # total width and height padding

    if auto:  # minimum-rectangle padding
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # keep only the remainder modulo the stride
    elif scaleFill:  # stretch to fill
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])  # use the target size directly
        ratio = (
            new_shape[1] / shape[1],
            new_shape[0] / shape[0],
        )  # width and height ratios

    dw /= 2  # split the padding between both sides
    dh /= 2
    if shape[::-1] != new_unpad:  # resize only when needed
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))  # top and bottom padding
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))  # left and right padding
    im = cv2.copyMakeBorder(
        im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color
    )  # add the border padding
    return im, ratio, (dw, dh)

class Colors:
    """Generates colors for drawing bounding boxes"""

    def __init__(self):
        # Predefined color list (hexadecimal)
        hexs = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB',
                '2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7')
        self.palette = [self.hex2rgb(f'#{c}') for c in hexs]  # convert to RGB tuples
        self.n = len(self.palette)  # number of colors

    def __call__(self, i, bgr=False):
        """Return the color for an index"""
        c = self.palette[int(i) % self.n]  # cycle through the color list
        return (c[2], c[1], c[0]) if bgr else c  # swap the channel order when BGR is requested

    @staticmethod
    def hex2rgb(h):
        """Convert a hexadecimal color string to an RGB tuple"""
        return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))

def rescale_coords(boxes, image_shape, input_shape):
    """
    Map box coordinates from the model input size back to the original image size
    Args:
        boxes (array): box coordinates [x1, y1, x2, y2]
        image_shape (tuple): original image size (height, width)
        input_shape (tuple): model input size (height, width)
    Returns:
        boxes (array): rescaled box coordinates
    """
    image_height, image_width = image_shape
    input_height, input_width = input_shape
    scale = min(input_width / image_width, input_height / image_height)  # letterbox scale ratio
    pad_w = (input_width - image_width * scale) / 2  # padding along the width
    pad_h = (input_height - image_height * scale) / 2  # padding along the height
    boxes[:, [0, 2]] = (boxes[:, [0, 2]] - pad_w) / scale  # adjust x coordinates
    boxes[:, [1, 3]] = (boxes[:, [1, 3]] - pad_h) / scale  # adjust y coordinates
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, image_width)  # keep x inside the image
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, image_height)  # keep y inside the image
    return boxes.astype(int)  # convert to integer coordinates

def preprocess(image, input_shape):
    """
    Prepare an image for model inference
    Args:
        image (array): input image (BGR, as read by OpenCV)
        input_shape (tuple): model input size (height, width)
    Returns:
        blob (array): preprocessed image data, shape (1, height, width, 3), float32 in [0, 1]
    """
    # Resize and pad the image
    input_img = letterbox(image, input_shape)[0]
    # BGR to RGB
    input_img = input_img[..., ::-1]
    # Add the batch dimension
    input_img = input_img[np.newaxis, :, :, :].astype(np.float32)
    # Make sure the memory is contiguous
    input_img = np.ascontiguousarray(input_img)
    # Normalize to [0, 1]
    blob = input_img / 255.0
    return blob

def postprocess(output_data, conf_thres, image_shape, input_shape):
    """
    Post-process the model output
    Args:
        output_data (array): output of the post-processing model; given the indexing
            below, each row is expected to be [x1, y1, x2, y2, score, class_id]
        conf_thres (float): confidence threshold
        image_shape (tuple): original image size (height, width)
        input_shape (tuple): model input size (height, width)
    Returns:
        boxes (array): box coordinates [x1, y1, x2, y2]
        scores (array): confidence scores
        labels (array): class labels
    """
    outs = output_data
    outs = outs[outs[:, 4] >= conf_thres]  # drop predictions below the confidence threshold
    boxes = outs[:, :4]  # box coordinates
    scores = outs[:, -2]  # confidence scores
    labels = outs[:, -1].astype(int)  # class labels
    boxes = rescale_coords(boxes, image_shape, input_shape)  # map back to the original image size
    return boxes, scores, labels

class qnn_yolov10:
    """YOLOv10 inference class built on the QNN SDK"""

    def __init__(self,model_path,sdk="qnn",backend="npu"):
        """
        Initialize the QNN YOLOv10 model
        Args:
            model_path (str): model path
            sdk (str): SDK type, "qnn" or anything else (SNPE2 is used otherwise)
            backend (str): inference backend, "npu", "gpu" or "cpu"
        Raises:
            RuntimeError: if any setup step fails (__init__ must not return a value)
        """
        self.interpreter = None  # lets __del__ run safely if setup fails early
        self.config = aidlite.Config.create_instance()
        if self.config is None:
            raise RuntimeError("Create config failed !")

        self.config.implement_type = aidlite.ImplementType.TYPE_LOCAL
        if sdk.lower()=="qnn":
            self.config.framework_type = aidlite.FrameworkType.TYPE_QNN
        else :
            self.config.framework_type = aidlite.FrameworkType.TYPE_SNPE2

        # Select the accelerator
        if backend.lower() =="npu":
            self.config.accelerate_type = aidlite.AccelerateType.TYPE_DSP
        elif backend.lower() =="gpu":
            self.config.accelerate_type = aidlite.AccelerateType.TYPE_GPU
        else:
            self.config.accelerate_type = aidlite.AccelerateType.TYPE_CPU
        self.config.is_quantify_model = 1  # mark the model as quantized

        self.model = aidlite.Model.create_instance(model_path)
        if self.model is None:
            raise RuntimeError("Create model failed !")
        self.interpreter = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(self.model, self.config)
        if self.interpreter is None:
            raise RuntimeError("build_interpretper_from_model_and_config failed !")
        result = self.interpreter.init()
        if result != 0:
            raise RuntimeError("interpreter init failed !")
        result = self.interpreter.load_model()
        if result != 0:
            raise RuntimeError("interpreter load model failed !")
        print("detect model load success!")

    def __del__(self):
        """Release resources"""
        if getattr(self, "interpreter", None) is not None:
            self.interpreter.destory()

    def __call__(self, img_input,invoke_nums):
        """
        Run model inference
        Args:
            img_input (array): input image data
            invoke_nums (int): number of inference runs, used for performance testing
        Returns:
            output1 (array): model output
        """
        result = self.interpreter.set_input_tensor(0, img_input.data)
        if result != 0:
            print("interpreter set_input_tensor() failed")
        invoke_time=[]
        for i in range(invoke_nums):
            t1=time.time()
            result = self.interpreter.invoke()
            if result != 0:
                print("interpreter invoke() failed")
            cost_time = (time.time()-t1)*1000  # per-run latency in milliseconds
            invoke_time.append(cost_time)

        max_invoke_time = max(invoke_time)
        min_invoke_time = min(invoke_time)
        mean_invoke_time = sum(invoke_time)/invoke_nums
        var_invoketime=np.var(invoke_time)
        print("====================================")
        print(f"QNN invoke {invoke_nums} times:\n --mean_invoke_time is {mean_invoke_time} \n --max_invoke_time is {max_invoke_time} \n --min_invoke_time is {min_invoke_time} \n --var_invoketime is {var_invoketime}")
        print("====================================")

        output1 = self.interpreter.get_output_tensor(0)
        return output1
        
    
class onnx_yolov10:
    """YOLOv10 inference class built on ONNX Runtime"""

    def __init__(self,model_path):
        """
        Initialize the ONNX YOLOv10 model
        Args:
            model_path (str): model path
        """
        self.sess_options = onnxruntime.SessionOptions()
        self.sess_options.intra_op_num_threads = 1  # number of threads
        self.sess = onnxruntime.InferenceSession(model_path,sess_options=self.sess_options)
        self.outname = [i.name for i in self.sess.get_outputs()]  # output node names
        self.inname = [i.name for i in self.sess.get_inputs()]  # input node names

    def __call__(self,img_input):
        """
        Run model inference
        Args:
            img_input (array): input data
        Returns:
            out_put (array): first output of the model
        """
        inp = {self.inname[0]:img_input}  # prepare the input
        t1=time.time()
        out_put = self.sess.run(self.outname,inp)[0]  # run inference
        cost_time = (time.time()-t1)*1000  # inference time in milliseconds (not reported)
        return out_put
    
    
def main(args):
    """
    Main function: runs the YOLOv10 inference flow
    Args:
        args: command-line arguments
    """
    input_shape = (640, 640)  # model input size
    conf_thres = 0.25  # confidence threshold
    img_path = args.imgs  # input image path
    invoke_nums = args.invoke_nums  # number of inference runs
    qnn_path = args.target_model  # QNN model path

    # Initialize the QNN model and the ONNX post-processing model
    qnn_model1 = qnn_yolov10(qnn_path)

    onnx_model_path = 'post_pro_1.onnx'  # ONNX post-processing model path
    onnx_model = onnx_yolov10(onnx_model_path)

    print("Begin to run qnn...")
    im0 = cv2.imread(img_path)  # read the image
    if im0 is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f"Cannot read image: {img_path}")
    image_shape = im0.shape[:2]  # original image size
    img_qnn = preprocess(im0, input_shape)  # preprocess the image
    qnn_out_shape = (1,8400,84)  # QNN model output shape
    out1 = qnn_model1(img_qnn,invoke_nums)  # run QNN inference
    out1 = out1.reshape(*qnn_out_shape)  # restore the output shape
    out2 = onnx_model(out1)[0]  # run the ONNX post-processing model and drop the batch dimension

    boxes, scores, labels = postprocess(out2, conf_thres, image_shape, input_shape)  # filter and rescale detections
    print(f"Detect {len(boxes)} targets")  # number of detected targets

    colors = Colors()  # color generator
    # Draw boxes and labels on the image
    for label, score, box in zip(labels, scores, boxes):
        label_text = f'{classes[label]}: {score:.2f}'  # label text
        color = colors(label, True)  # BGR color for this class
        cv2.rectangle(im0, (box[0], box[1]), (box[2], box[3]), color, 2, lineType=cv2.LINE_AA)  # draw the box
        cv2.putText(im0, label_text, (box[0], box[1] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)  # draw the label

    output_image_path = "detected_results.jpg"  # output image path
    cv2.imwrite(output_image_path, im0)  # save the result image
    print(f"Saved detected result to {output_image_path}")
    


def parser_args():
    """
    Parse command-line arguments
    Returns:
        args: parsed command-line arguments
    """
    parser = argparse.ArgumentParser(description="Inference yolov10 model")
    parser.add_argument('--target_model',type=str,default='yolov10n/cutoff_yolov10n_qcs8550_w8a8.qnn231.ctx.bin',help="QNN model path")
    parser.add_argument('--imgs',type=str,default='bus.jpg',help="Predict images path")
    parser.add_argument('--invoke_nums',type=int,default=100,help="Inference nums")
    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parser_args()  # parse command-line arguments
    main(args)  # run the main function
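
The timing loop in qnn_yolov10.__call__ includes the very first invocations, which can be slower than steady-state runs. The following is a hedged sketch of a generic timing helper with a warm-up phase that reports the same statistics (mean, min, max, variance). The name `benchmark` and its parameters are illustrative and are not part of aidlite.

```python
import statistics
import time


def benchmark(fn, runs=100, warmup=10):
    """Time a zero-argument callable and return latency statistics in ms."""
    for _ in range(warmup):
        fn()  # warm-up runs are executed but not measured

    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)

    mean_ms = statistics.fmean(samples)
    return {
        "mean_ms": mean_ms,
        "min_ms": min(samples),
        "max_ms": max(samples),
        # Population variance, matching np.var in the listing above.
        "var_ms2": statistics.pvariance(samples),
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload; on the device this could wrap interpreter.invoke().
    stats = benchmark(lambda: sum(range(10000)), runs=50, warmup=5)
    for key, value in stats.items():
        print(f"{key}: {value:.4f}")
```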
