Test phone model: Redmi K60 Pro
Processor: Snapdragon 8 Gen 2 (second-generation Snapdragon 8 mobile platform)
RAM: 8.0 GB, LPDDR5X-8400, 67.0 GB/s
Cameras: 16 MP front; 50 MP + 8 MP + 2 MP rear
AI compute: NPU 48 TOPS INT8, GPU 1536 ALU x 2 x 680 MHz = 2.089 TFLOPS
Note: any phone will work; the better the hardware, the faster it runs.
App: AidLux 2.0
System environment: Ubuntu 20.04.3 LTS
Note: the code runs more smoothly after logging in to AidLux. Keep the AidLux app in the foreground while the code is running so the system does not reclaim the process, and keep the screen on; once the screen has been off for a while the phone enters sleep. If the app needs to keep running in the background, grant it the corresponding permissions.
This code implements a real-time face-swap ("virtual try-on" style) application based on computer vision. It captures a live video stream from the camera, detects the face and extracts its landmarks, then warps the live face against the target face image selected by the user to produce the swap effect. The system consists of several core modules: face detection, landmark localization, face warping and blending, and UI interaction. A highly simplified sketch of this per-frame flow follows.
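Before diving into the modules, here is a hypothetical sketch of the per-frame flow described above. The function names are illustrative stubs returning dummy data, not the demo's actual API; they only show the order of the three stages.
import numpy as np

def detect_face(frame):                  # stand-in for the BlazeFace detector
    h, w = frame.shape[:2]
    return (w // 4, h // 4, 3 * w // 4, 3 * h // 4)   # dummy face box

def get_landmarks(face_crop):            # stand-in for the 468-point landmark model
    return np.random.rand(468, 2) * np.array(face_crop.shape[1::-1])

def swap_face(target_pts, live_pts, live_crop, target_img):  # stand-in for faceswap()
    return target_img.copy()

target_img = np.zeros((640, 480, 3), dtype=np.uint8)   # user-selected target face (dummy)
target_pts = get_landmarks(target_img)                 # landmarks of the target face
frame = np.zeros((640, 480, 3), dtype=np.uint8)        # one camera frame (dummy)
x1, y1, x2, y2 = detect_face(frame)                    # 1) detect the live face
live_crop = frame[y1:y2, x1:x2]
live_pts = get_landmarks(live_crop)                    # 2) extract its landmarks
output = swap_face(target_pts, live_pts, live_crop, target_img)  # 3) warp and blend
print(output.shape)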
Detailed breakdown of the core modules
1. Face detection module
# Face detection preprocessing function
def preprocess_img_pad(img, image_size=128):
    # Pad and preprocess the image so it matches the model input
    shape = np.r_[img.shape]
    pad_all = (shape.max() - shape[:2]).astype('uint32')
    pad = pad_all // 2
    img_pad_ori = np.pad(
        img,
        ((pad[0], pad_all[0] - pad[0]), (pad[1], pad_all[1] - pad[1]), (0, 0)),
        mode='constant')
    # Color-space conversion and resizing
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img_pad = np.pad(
        img,
        ((pad[0], pad_all[0] - pad[0]), (pad[1], pad_all[1] - pad[1]), (0, 0)),
        mode='constant')
    img_small = cv2.resize(img_pad, (image_size, image_size))
    img_small = np.expand_dims(img_small, axis=0)
    img_small = (2.0 / 255.0) * img_small - 1.0
    img_small = img_small.astype('float32')
    return img_pad_ori, img_small, pad
The face detection module uses the BlazeFace algorithm and is implemented with the face_detection_front.tflite model; a quick check of the preprocessing above is sketched below.
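This is a minimal, self-contained check of the pad-to-square preprocessing, run on a synthetic frame rather than a real camera image (assumes only OpenCV and NumPy are installed):
import cv2
import numpy as np

frame = np.random.randint(0, 256, (640, 480, 3), dtype=np.uint8)  # H x W x C, like a portrait frame
shape = np.r_[frame.shape]
pad_all = (shape.max() - shape[:2]).astype('uint32')   # total padding needed per axis
pad = pad_all // 2
img_pad = np.pad(frame, ((pad[0], pad_all[0] - pad[0]),
                         (pad[1], pad_all[1] - pad[1]), (0, 0)), mode='constant')
img_small = cv2.resize(img_pad, (128, 128))
img_small = np.expand_dims(img_small, axis=0)
img_small = (2.0 / 255.0) * img_small - 1.0             # scale pixel values to [-1, 1]
print(img_pad.shape)      # (640, 640, 3): the frame is now square
print(img_small.shape)    # (1, 128, 128, 3): the detector input shape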
2. Face landmark detection module
# Face landmark detection preprocessing
def preprocess_image_for_tflite32(image, model_image_size=192):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (model_image_size, model_image_size))
    image = np.expand_dims(image, axis=0)
    image = (2.0 / 255.0) * image - 1.0
    image = image.astype('float32')
    return image
The landmark detection module uses the face_landmark.tflite model; the sketch below shows the expected input shape and value range.
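A quick, self-contained shape check of this preprocessing on a synthetic face crop (assumes only OpenCV and NumPy are installed):
import cv2
import numpy as np

crop = np.random.randint(0, 256, (210, 180, 3), dtype=np.uint8)   # dummy face crop
x = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)      # BGR -> RGB
x = cv2.resize(x, (192, 192))                  # model input resolution
x = np.expand_dims(x, axis=0)                  # add the batch dimension
x = ((2.0 / 255.0) * x - 1.0).astype('float32')  # scale to [-1, 1]
print(x.shape, float(x.min()), float(x.max()))   # (1, 192, 192, 3), roughly -1.0 .. 1.0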
3. Face-swap algorithm module
# Core face-swap function
def faceswap(points1, points2, img1, img2):
    img1Warped = np.copy(img2)
    # Compute the convex hull of the face landmarks
    hull1 = []
    hull2 = []
    hullIndex = cv2.convexHull(np.array(points2), returnPoints=False)
    for i in range(0, len(hullIndex)):
        hull1.append(points1[int(hullIndex[i])])
        hull2.append(points2[int(hullIndex[i])])
    # Compute the Delaunay triangulation
    if img2 is None:
        return None
    sizeImg2 = img2.shape
    rect = (0, 0, sizeImg2[1], sizeImg2[0])
    dt = calculateDelaunayTriangles(rect, hull2)
    if len(dt) == 0:
        quit()
    # Apply an affine transform to each triangle
    for i in range(0, len(dt)):
        t1 = []
        t2 = []
        for j in range(0, 3):
            t1.append(hull1[dt[i][j]])
            t2.append(hull2[dt[i][j]])
        try:
            warpTriangle(img1, img1Warped, t1, t2)
        except:
            return None
    # Build the mask and blend with seamless cloning
    hull8U = []
    for i in range(0, len(hull2)):
        hull8U.append((hull2[i][0], hull2[i][1]))
    mask = np.zeros(img2.shape, dtype=img2.dtype)
    cv2.fillConvexPoly(mask, np.int32(hull8U), (255, 255, 255))
    r = cv2.boundingRect(np.float32([hull2]))
    center = ((r[0] + int(r[2] / 2), r[1] + int(r[3] / 2)))
    output = cv2.seamlessClone(np.uint8(img1Warped), img2, mask, center, cv2.NORMAL_CLONE)
    return output
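To make the per-triangle step used by warpTriangle concrete, here is a small self-contained sketch on a synthetic image; the triangle coordinates are arbitrary illustrative values, not landmarks from a real face:
import cv2
import numpy as np

src_img = np.random.randint(0, 256, (200, 200, 3), dtype=np.uint8)
t_src = np.float32([[40, 40], [160, 60], [80, 170]])    # triangle in the source image
t_dst = np.float32([[30, 50], [150, 40], [100, 180]])   # corresponding triangle in the target

warp_mat = cv2.getAffineTransform(t_src, t_dst)          # 2x3 affine matrix
warped = cv2.warpAffine(src_img, warp_mat, (200, 200),
                        flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)

# Mask out everything except the destination triangle, as the demo does per triangle
mask = np.zeros((200, 200, 3), dtype=np.float32)
cv2.fillConvexPoly(mask, np.int32(t_dst), (1.0, 1.0, 1.0), 16, 0)
blended = (warped * mask).astype(np.uint8)                # keep only the triangle contents
print(blended.shape)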
The face-swap module implements the complete swap pipeline: it computes the convex hull of the landmarks, triangulates the hull region with Delaunay triangulation, warps each triangle from the source face to the target with an affine transform, and finally calls seamlessClone to blend the warped face into the target image without visible seams.
4. UI interaction module
# Custom application class
class MyApp(App):
    def __init__(self, *args):
        super(MyApp, self).__init__(*args)

    def main(self):
        # Create the vertical layout container
        main_container = VBox(width=360, height=680, style={'margin': '0px auto'})
        self.aidcam0 = OpencvVideoWidget(self, width=340, height=400)
        # Add the image-selection thumbnails and the save button
        self.lbl = Label('Tap an image to pick the celebrity face you like:')
        bottom_container = HBox(width=360, height=230, style={'margin': '0px auto'})
        self.img1 = Image('/res:' + os.getcwd() + '/' + back_img_path[0], height=80, margin='10px')
        self.img1.onclick.do(self.on_img1_clicked)
        # Save-button behavior
        self.bt1 = Button('Save composite image', width=300, height=30, margin='10px')
        self.bt1.onclick.do(self.on_button_pressed1)
        return main_container
The UI interaction module provides the user-facing controls: tapping a thumbnail selects the target face, and the button saves the composited image.
What the two models do
1. face_detection_front.tflite
This is a face detection model based on the BlazeFace algorithm. Its main role is to locate faces in each padded camera frame: from a 128x128 RGB input it predicts 896 candidate boxes and confidence scores, which are decoded against precomputed anchors and filtered with a score threshold and weighted non-maximum suppression.
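To make the decoding step concrete, here is a rough NumPy-only sketch using dummy tensors with the model's output shapes; the real anchors come from models/anchors.npy, and the full logic lives in the BlazeFace class in the code further down:
import numpy as np

raw_boxes = np.random.randn(1, 896, 16).astype(np.float32)   # dummy box/keypoint offsets
raw_scores = np.random.randn(1, 896, 1).astype(np.float32)   # dummy logit per anchor

scores = (1.0 / (1.0 + np.exp(-raw_scores.clip(-100, 100)))).squeeze(-1)  # sigmoid on clipped logits
mask = scores >= 0.75                                         # min_score_thresh used by the demo
print("candidate anchors above threshold:", int(mask.sum()))
# The demo then decodes the kept rows against the anchor grid and applies
# weighted non-maximum suppression before returning final detections.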
2. face_landmark.tflite
This is the face landmark model. From a 192x192 face crop it regresses a 1404-value output, i.e. 468 three-dimensional landmarks, plus a face-presence score tensor; these landmarks drive the convex hull, triangulation, and warping in the swap step.
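A small sketch of how that raw landmark output is interpreted (dummy values here; in the demo the vector comes from the interpreter and is normalized by the 192-pixel input size):
import numpy as np

raw_mesh = np.random.rand(1404).astype(np.float32) * 192    # dummy model output
mesh = raw_mesh.reshape(468, 3) / 192                        # 468 (x, y, z) landmarks, normalized

roi_size = 256                                               # example face-crop edge length
points = [(pt[0] * roi_size, pt[1] * roi_size) for pt in mesh]  # pixel coordinates in the crop
print(len(points), points[0])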
Basic functionality
aidlite is a lightweight inference engine designed for Qualcomm edge devices. Its core functionality covers model loading, input/output configuration, and accelerated on-device inference for TFLite models.
Architectural characteristics
# Example: loading and configuring a model with aidlite
model = aidlite.Model.create_instance(model_path)
config = aidlite.Config.create_instance()
config.implement_type = aidlite.ImplementType.TYPE_FAST
config.framework_type = aidlite.FrameworkType.TYPE_TFLITE
config.accelerate_type = aidlite.AccelerateType.TYPE_CPU
config.number_of_threads = 4
interpreter = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model, config)
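The snippet above stops at building the interpreter. At run time the demo then follows the call sequence below. This is a sketch rather than standalone code: it assumes the aidlite runtime on the device, the interpreter built above, and a preprocessed float32 input img, and it only mirrors the calls used in process() later in this post.
# Run-time call sequence (as used in process() below)
interpreter.init()                              # initialize the interpreter
interpreter.load_model()                        # load the model into the runtime
interpreter.set_input_tensor(0, img.data)       # feed the preprocessed frame
interpreter.invoke()                            # run inference
raw_boxes = interpreter.get_output_tensor(0)    # first output tensor
scores = interpreter.get_output_tensor(1)       # second output tensor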
As these snippets show, aidlite's architecture separates concerns cleanly: a Model object wraps the network file, a Config object selects the framework, acceleration back end, and thread count, and the InterpreterBuilder combines the two into a runnable interpreter.
Performance advantages
aidlite's performance advantage on edge devices comes from its selectable acceleration back ends: in this demo the detector runs on the CPU with four threads and the landmark model on the GPU, which keeps both models responsive on a phone.
Main application scenarios
Entertainment and social media
Education and demonstrations
Technical characteristics and limitations
Technical strengths
Application limitations
This code implements a complete real-time face-swap demo that combines face detection, landmark localization, and image warping. The system uses the aidlite inference engine to run the deep-learning models efficiently on an edge device, achieving real-time face analysis and image composition. Beyond its entertainment and practical value, it also shows the potential of lightweight deep learning on edge devices; with further optimization of the models and algorithms, the same pipeline could be applied to more scenarios. The full source code of the demo follows.
import cv2
import math
import sys
import numpy as np
import os
import subprocess
import time
from cvs import *
import aidlite
# List of background (target face) image paths; the images can be replaced
back_img_path = ('models/Biden.jpeg', 'models/zh.jpeg', 'models/zh1.jpeg', 'models/kt1.jpg')
# Read the first background image
faceimg = cv2.imread(back_img_path[0])
mod = -1
bfirstframe = True
saveimg = faceimg
# Read landmark points from a file
def readPoints(path):
    # Create an array of points
    points = []
    # Read the points from the file
    with open(path) as file:
        for line in file:
            x, y = line.split()
            points.append((int(x), int(y)))
    return points
# Apply an affine transform
def applyAffineTransform(src, srcTri, dstTri, size):
    # Compute the affine transform matrix
    warpMat = cv2.getAffineTransform(np.float32(srcTri), np.float32(dstTri))
    # Apply the affine transform to the source image
    dst = cv2.warpAffine(src, warpMat, (size[0], size[1]), None, flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)
    return dst
# Check whether a point lies inside a rectangle
def rectContains(rect, point):
    if point[0] < rect[0]:
        return False
    elif point[1] < rect[1]:
        return False
    elif point[0] > rect[0] + rect[2]:
        return False
    elif point[1] > rect[1] + rect[3]:
        return False
    return True
# Compute the Delaunay triangulation
def calculateDelaunayTriangles(rect, points):
    # Create a Subdiv2D object
    subdiv = cv2.Subdiv2D(rect)
    # Insert the points into the Subdiv2D object
    for p in points:
        subdiv.insert(p)
    # Get the list of triangles
    triangleList = subdiv.getTriangleList()
    delaunayTri = []
    pt = []
    for t in triangleList:
        pt.append((t[0], t[1]))
        pt.append((t[2], t[3]))
        pt.append((t[4], t[5]))
        pt1 = (t[0], t[1])
        pt2 = (t[2], t[3])
        pt3 = (t[4], t[5])
        # Check that all three vertices of the triangle are inside the rectangle
        if rectContains(rect, pt1) and rectContains(rect, pt2) and rectContains(rect, pt3):
            ind = []
            # Look up the landmark index of each vertex by its coordinates
            for j in range(0, 3):
                for k in range(0, len(points)):
                    if abs(pt[j][0] - points[k][0]) < 1.0 and abs(pt[j][1] - points[k][1]) < 1.0:
                        ind.append(k)
            # If all three indices were found, add the triangle to the Delaunay list
            if len(ind) == 3:
                delaunayTri.append((ind[0], ind[1], ind[2]))
        pt = []
    return delaunayTri
# Warp and blend one triangular region
def warpTriangle(img1, img2, t1, t2):
    # Find the bounding rectangle of each triangle
    r1 = cv2.boundingRect(np.float32([t1]))
    r2 = cv2.boundingRect(np.float32([t2]))
    # Offset the points so the rectangle's top-left corner is the origin
    t1Rect = []
    t2Rect = []
    t2RectInt = []
    for i in range(0, 3):
        t1Rect.append(((t1[i][0] - r1[0]), (t1[i][1] - r1[1])))
        t2Rect.append(((t2[i][0] - r2[0]), (t2[i][1] - r2[1])))
        t2RectInt.append(((t2[i][0] - r2[0]), (t2[i][1] - r2[1])))
    # Create the mask
    mask = np.zeros((r2[3], r2[2], 3), dtype=np.float32)
    cv2.fillConvexPoly(mask, np.int32(t2RectInt), (1.0, 1.0, 1.0), 16, 0)
    # Extract the rectangular patch from the source image
    img1Rect = img1[r1[1]:r1[1] + r1[3], r1[0]:r1[0] + r1[2]]
    size = (r2[2], r2[3])
    # Apply the affine transform to the source patch
    img2Rect = applyAffineTransform(img1Rect, t1Rect, t2Rect, size)
    img2Rect = img2Rect * mask
    # Copy the warped triangular region into the destination image
    img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] * ((1.0, 1.0, 1.0) - mask)
    img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = img2[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] + img2Rect
# Face-swap function
def faceswap(points1, points2, img1, img2):
    img1Warped = np.copy(img2)
    # Find the convex hull
    hull1 = []
    hull2 = []
    hullIndex = cv2.convexHull(np.array(points2), returnPoints=False)
    for i in range(0, len(hullIndex)):
        hull1.append(points1[int(hullIndex[i])])
        hull2.append(points2[int(hullIndex[i])])
    # Compute the Delaunay triangulation of the hull points
    if img2 is None:
        return None
    sizeImg2 = img2.shape
    rect = (0, 0, sizeImg2[1], sizeImg2[0])
    dt = calculateDelaunayTriangles(rect, hull2)
    if len(dt) == 0:
        quit()
    # Apply an affine transform to each Delaunay triangle
    for i in range(0, len(dt)):
        t1 = []
        t2 = []
        for j in range(0, 3):
            t1.append(hull1[dt[i][j]])
            t2.append(hull2[dt[i][j]])
        try:
            warpTriangle(img1, img1Warped, t1, t2)
        except:
            return None
    # Build the mask
    hull8U = []
    for i in range(0, len(hull2)):
        hull8U.append((hull2[i][0], hull2[i][1]))
    mask = np.zeros(img2.shape, dtype=img2.dtype)
    cv2.fillConvexPoly(mask, np.int32(hull8U), (255, 255, 255))
    r = cv2.boundingRect(np.float32([hull2]))
    center = ((r[0] + int(r[2] / 2), r[1] + int(r[3] / 2)))
    # Seamless cloning
    output = cv2.seamlessClone(np.uint8(img1Warped), img2, mask, center, cv2.NORMAL_CLONE)
    return output
# Preprocess an image for the TFLite landmark model
def preprocess_image_for_tflite32(image, model_image_size=192):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (model_image_size, model_image_size))
    image = np.expand_dims(image, axis=0)
    image = (2.0 / 255.0) * image - 1.0
    image = image.astype('float32')
    return image
# Pad and preprocess an image for the detector
def preprocess_img_pad(img, image_size=128):
    shape = np.r_[img.shape]
    pad_all = (shape.max() - shape[:2]).astype('uint32')
    pad = pad_all // 2
    img_pad_ori = np.pad(
        img,
        ((pad[0], pad_all[0] - pad[0]), (pad[1], pad_all[1] - pad[1]), (0, 0)),
        mode='constant')
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img_pad = np.pad(
        img,
        ((pad[0], pad_all[0] - pad[0]), (pad[1], pad_all[1] - pad[1]), (0, 0)),
        mode='constant')
    img_small = cv2.resize(img_pad, (image_size, image_size))
    img_small = np.expand_dims(img_small, axis=0)
    img_small = (2.0 / 255.0) * img_small - 1.0
    img_small = img_small.astype('float32')
    return img_pad_ori, img_small, pad
# Draw the detected face boxes
def plot_detections(img, detections, with_keypoints=True):
    output_img = img
    print(img.shape)
    x_min = 0
    x_max = 0
    y_min = 0
    y_max = 0
    print("Found %d face(s)" % len(detections))
    for i in range(len(detections)):
        ymin = detections[i][0] * img.shape[0]
        xmin = detections[i][1] * img.shape[1]
        ymax = detections[i][2] * img.shape[0]
        xmax = detections[i][3] * img.shape[1]
        w = int(xmax - xmin)
        h = int(ymax - ymin)
        h = max(w, h)
        h = h * 1.5
        x = (xmin + xmax) / 2.
        y = (ymin + ymax) / 2.
        xmin = x - h / 2.
        xmax = x + h / 2.
        ymin = y - h / 2. - 0.08 * h
        ymax = y + h / 2. - 0.08 * h
        x_min = int(xmin)
        y_min = int(ymin)
        x_max = int(xmax)
        y_max = int(ymax)
        p1 = (int(xmin), int(ymin))
        p2 = (int(xmax), int(ymax))
        cv2.rectangle(output_img, p1, p2, (0, 255, 255), 2, 1)
    return x_min, y_min, x_max, y_max
# Draw the face mesh
def draw_mesh(image, mesh, mark_size=2, line_width=1):
    image_size = image.shape[0]
    mesh = mesh * image_size
    for point in mesh:
        cv2.circle(image, (point[0], point[1]),
                   mark_size, (0, 255, 128), -1)
    # Draw the eye contours
    left_eye_contour = np.array([mesh[33][0:2],
                                 mesh[7][0:2],
                                 mesh[163][0:2],
                                 mesh[144][0:2],
                                 mesh[145][0:2],
                                 mesh[153][0:2],
                                 mesh[154][0:2],
                                 mesh[155][0:2],
                                 mesh[133][0:2],
                                 mesh[173][0:2],
                                 mesh[157][0:2],
                                 mesh[158][0:2],
                                 mesh[159][0:2],
                                 mesh[160][0:2],
                                 mesh[161][0:2],
                                 mesh[246][0:2], ]).astype(np.int32)
    right_eye_contour = np.array([mesh[263][0:2],
                                  mesh[249][0:2],
                                  mesh[390][0:2],
                                  mesh[373][0:2],
                                  mesh[374][0:2],
                                  mesh[380][0:2],
                                  mesh[381][0:2],
                                  mesh[382][0:2],
                                  mesh[362][0:2],
                                  mesh[398][0:2],
                                  mesh[384][0:2],
                                  mesh[385][0:2],
                                  mesh[386][0:2],
                                  mesh[387][0:2],
                                  mesh[388][0:2],
                                  mesh[466][0:2]]).astype(np.int32)
    cv2.polylines(image, [left_eye_contour, right_eye_contour], False,
                  (255, 255, 255), line_width, cv2.LINE_AA)
# Collect the face landmark points
def getkeypoint(image, mesh, landmark_point):
    image_size = image.shape[0]
    mesh = mesh * image_size
    for point in mesh:
        landmark_point.append((point[0], point[1]))
    return image
# Draw the face landmarks and contours
def draw_landmarks(image, mesh, landmark_point):
    image_size = image.shape[0]
    mesh = mesh * image_size
    for point in mesh:
        landmark_point.append((point[0], point[1]))
        cv2.circle(image, (point[0], point[1]), 2, (255, 255, 0), -1)
    if len(landmark_point) > 0:
        # Left eyebrow
        cv2.line(image, landmark_point[55], landmark_point[65], (0, 0, 255), 2, -3)
        cv2.line(image, landmark_point[65], landmark_point[52], (0, 0, 255), 2, -3)
        cv2.line(image, landmark_point[52], landmark_point[53], (0, 0, 255), 2, -3)
        cv2.line(image, landmark_point[53], landmark_point[46], (0, 0, 255), 2, -3)
        # Right eyebrow
        cv2.line(image, landmark_point[285], landmark_point[295], (0, 0, 255), 2)
        cv2.line(image, landmark_point[295], landmark_point[282], (0, 0, 255), 2)
        cv2.line(image, landmark_point[282], landmark_point[283], (0, 0, 255), 2)
        cv2.line(image, landmark_point[283], landmark_point[276], (0, 0, 255), 2)
        # Left eye
        cv2.line(image, landmark_point[133], landmark_point[173], (0, 0, 255), 2)
        cv2.line(image, landmark_point[173], landmark_point[157], (0, 0, 255), 2)
        cv2.line(image, landmark_point[157], landmark_point[158], (0, 0, 255), 2)
        cv2.line(image, landmark_point[158], landmark_point[159], (0, 0, 255), 2)
        cv2.line(image, landmark_point[159], landmark_point[160], (0, 0, 255), 2)
        cv2.line(image, landmark_point[160], landmark_point[161], (0, 0, 255), 2)
        cv2.line(image, landmark_point[161], landmark_point[246], (0, 0, 255), 2)
        cv2.line(image, landmark_point[246], landmark_point[163], (0, 0, 255), 2)
        cv2.line(image, landmark_point[163], landmark_point[144], (0, 0, 255), 2)
        cv2.line(image, landmark_point[144], landmark_point[145], (0, 0, 255), 2)
        cv2.line(image, landmark_point[145], landmark_point[153], (0, 0, 255), 2)
        cv2.line(image, landmark_point[153], landmark_point[154], (0, 0, 255), 2)
        cv2.line(image, landmark_point[154], landmark_point[155], (0, 0, 255), 2)
        cv2.line(image, landmark_point[155], landmark_point[133], (0, 0, 255), 2)
        # Right eye
        cv2.line(image, landmark_point[362], landmark_point[398], (0, 0, 255), 2)
        cv2.line(image, landmark_point[398], landmark_point[384], (0, 0, 255), 2)
        cv2.line(image, landmark_point[384], landmark_point[385], (0, 0, 255), 2)
        cv2.line(image, landmark_point[385], landmark_point[386], (0, 0, 255), 2)
        cv2.line(image, landmark_point[386], landmark_point[387], (0, 0, 255), 2)
        cv2.line(image, landmark_point[387], landmark_point[388], (0, 0, 255), 2)
        cv2.line(image, landmark_point[388], landmark_point[466], (0, 0, 255), 2)
        cv2.line(image, landmark_point[466], landmark_point[390], (0, 0, 255), 2)
        cv2.line(image, landmark_point[390], landmark_point[373], (0, 0, 255), 2)
        cv2.line(image, landmark_point[373], landmark_point[374], (0, 0, 255), 2)
        cv2.line(image, landmark_point[374], landmark_point[380], (0, 0, 255), 2)
        cv2.line(image, landmark_point[380], landmark_point[381], (0, 0, 255), 2)
        cv2.line(image, landmark_point[381], landmark_point[382], (0, 0, 255), 2)
        cv2.line(image, landmark_point[382], landmark_point[362], (0, 0, 255), 2)
        # Mouth
        cv2.line(image, landmark_point[308], landmark_point[415], (0, 0, 255), 2)
        cv2.line(image, landmark_point[415], landmark_point[310], (0, 0, 255), 2)
        cv2.line(image, landmark_point[310], landmark_point[311], (0, 0, 255), 2)
        cv2.line(image, landmark_point[311], landmark_point[312], (0, 0, 255), 2)
        cv2.line(image, landmark_point[312], landmark_point[13], (0, 0, 255), 2)
        cv2.line(image, landmark_point[13], landmark_point[82], (0, 0, 255), 2)
        cv2.line(image, landmark_point[82], landmark_point[81], (0, 0, 255), 2)
        cv2.line(image, landmark_point[81], landmark_point[80], (0, 0, 255), 2)
        cv2.line(image, landmark_point[80], landmark_point[191], (0, 0, 255), 2)
        cv2.line(image, landmark_point[191], landmark_point[78], (0, 0, 255), 2)
        cv2.line(image, landmark_point[78], landmark_point[95], (0, 0, 255), 2)
        cv2.line(image, landmark_point[95], landmark_point[88], (0, 0, 255), 2)
        cv2.line(image, landmark_point[88], landmark_point[178], (0, 0, 255), 2)
        cv2.line(image, landmark_point[178], landmark_point[87], (0, 0, 255), 2)
        cv2.line(image, landmark_point[87], landmark_point[14], (0, 0, 255), 2)
        cv2.line(image, landmark_point[14], landmark_point[317], (0, 0, 255), 2)
        cv2.line(image, landmark_point[317], landmark_point[402], (0, 0, 255), 2)
        cv2.line(image, landmark_point[402], landmark_point[318], (0, 0, 255), 2)
        cv2.line(image, landmark_point[318], landmark_point[324], (0, 0, 255), 2)
        cv2.line(image, landmark_point[324], landmark_point[308], (0, 0, 255), 2)
    return image
# Custom application class
class MyApp(App):
    def __init__(self, *args):
        super(MyApp, self).__init__(*args)

    def idle(self):
        self.aidcam0.update()

    def main(self):
        # Create the vertical layout container
        main_container = VBox(width=360, height=680, style={'margin': '0px auto'})
        self.aidcam0 = OpencvVideoWidget(self, width=340, height=400)
        self.aidcam0.style['margin'] = '10px'
        i = 0
        exec("self.aidcam%(i)s = OpencvVideoWidget(self)" % {'i': i})
        exec("self.aidcam%(i)s.identifier = 'aidcam%(i)s'" % {'i': i})
        eval("main_container.append(self.aidcam%(i)s)" % {'i': i})
        main_container.append(self.aidcam0)
        self.lbl = Label('Tap an image to pick the celebrity face you like:')
        main_container.append(self.lbl)
        # Create the horizontal layout container
        bottom_container = HBox(width=360, height=230, style={'margin': '0px auto'})
        self.img1 = Image('/res:' + os.getcwd() + '/' + back_img_path[0], height=80, margin='10px')
        self.img1.onclick.do(self.on_img1_clicked)
        bottom_container.append(self.img1)
        self.img2 = Image('/res:' + os.getcwd() + '/' + back_img_path[1], height=80, margin='10px')
        self.img2.onclick.do(self.on_img2_clicked)
        bottom_container.append(self.img2)
        self.img4 = Image('/res:' + os.getcwd() + '/' + back_img_path[3], height=80, margin='10px')
        self.img4.onclick.do(self.on_img4_clicked)
        bottom_container.append(self.img4)
        self.img3 = Image('/res:' + os.getcwd() + '/' + back_img_path[2], height=80, margin='10px')
        self.img3.onclick.do(self.on_img3_clicked)
        bottom_container.append(self.img3)
        self.bt1 = Button('Save composite image', width=300, height=30, margin='10px')
        self.bt1.onclick.do(self.on_button_pressed1)
        main_container.append(bottom_container)
        main_container.append(self.bt1)
        return main_container

    def on_img1_clicked(self, widget):
        global faceimg
        bgnd = cv2.imread(back_img_path[0])
        faceimg = bgnd
        global mod
        mod = 0
        print('mod', mod)

    def on_img2_clicked(self, widget):
        global faceimg
        bgnd = cv2.imread(back_img_path[1])
        faceimg = bgnd
        global mod
        mod = 1
        print('mod', mod)

    def on_img3_clicked(self, widget):
        global faceimg
        bgnd = cv2.imread(back_img_path[2])
        faceimg = bgnd
        global mod
        mod = 2
        print('mod', mod)

    def on_img4_clicked(self, widget):
        global faceimg
        bgnd = cv2.imread(back_img_path[3])
        faceimg = bgnd
        global mod
        mod = 3
        print('mod', mod)

    def on_button_pressed1(self, widget):
        cv2.imwrite('result.jpg', saveimg)
        aidlite.makeToast("Saved the composite image as result.jpg!")
# Get the camera ID
def get_cap_id():
    try:
        # Build the shell command; awk extracts the USB video device numbers
        cmd = "ls -l /sys/class/video4linux | awk -F ' -> ' '/usb/{sub(/.*video/, \"\", $2); print $2}'"
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        output = result.stdout.strip().split()
        # Convert the captured numbers to integers and return the smallest
        video_numbers = list(map(int, output))
        if video_numbers:
            return min(video_numbers)
        else:
            return None
    except Exception as e:
        print(f"Error: {e}")
        return None
# Main processing function
def process():
    cvs.setCustomUI()
    # Load the face detection model
    inShape = [[1, 128, 128, 3]]
    outShape = [[1, 896, 16], [1, 896, 1]]
    model_path = "models/face_detection_front.tflite"
    model = aidlite.Model.create_instance(model_path)
    if model is None:
        print("Failed to create the face_detection_front model!")
    model.set_model_properties(inShape, aidlite.DataType.TYPE_FLOAT32, outShape,
                               aidlite.DataType.TYPE_FLOAT32)
    config = aidlite.Config.create_instance()
    config.implement_type = aidlite.ImplementType.TYPE_FAST
    config.framework_type = aidlite.FrameworkType.TYPE_TFLITE
    config.accelerate_type = aidlite.AccelerateType.TYPE_CPU
    config.number_of_threads = 4
    fast_interpreter = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model, config)
    if fast_interpreter is None:
        print("face_detection_front: build_interpretper_from_model_and_config failed!")
    result = fast_interpreter.init()
    if result != 0:
        print("face_detection_front: interpreter init() failed!")
    result = fast_interpreter.load_model()
    if result != 0:
        print("face_detection_front: interpreter load_model() failed!")
    print("face_detection_front model loaded successfully!")
    # Load the face landmark model
    model_path1 = "models/face_landmark.tflite"
    inShape1 = [[1 * 192 * 192 * 3]]
    outShape1 = [[1 * 1404 * 4], [1 * 4]]
    model1 = aidlite.Model.create_instance(model_path1)
    if model1 is None:
        print("Failed to create the face_landmark model!")
    model1.set_model_properties(inShape1, aidlite.DataType.TYPE_FLOAT32, outShape1,
                                aidlite.DataType.TYPE_FLOAT32)
    config1 = aidlite.Config.create_instance()
    config1.implement_type = aidlite.ImplementType.TYPE_FAST
    config1.framework_type = aidlite.FrameworkType.TYPE_TFLITE
    config1.accelerate_type = aidlite.AccelerateType.TYPE_GPU
    config1.number_of_threads = 4
    fast_interpreter1 = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model1, config1)
    if fast_interpreter1 is None:
        print("face_landmark: build_interpretper_from_model_and_config failed!")
    result = fast_interpreter1.init()
    if result != 0:
        print("face_landmark: interpreter init() failed!")
    result = fast_interpreter1.load_model()
    if result != 0:
        print("face_landmark: interpreter load_model() failed!")
    print("face_landmark model loaded successfully!")
    # Load the anchor data
    anchors = np.load('models/anchors.npy').astype(np.float32)
    # 0 - rear camera, 1 - front camera
    camid = 1
    capId = get_cap_id()
    if capId is None:
        print("Using the MIPI camera")
        camid = camid
    else:
        print("Using a USB camera")
        camid = -1
    cap = cvs.VideoCapture(camid)
    bFace = False
    x_min, y_min, x_max, y_max = (0, 0, 0, 0)
    fface = 0.0
    bfirstframe = True
    facepath = "models/Biden.jpeg"
    global faceimg
    faceimg = cv2.imread(facepath)
    faceimg = cv2.resize(faceimg, (480, 640))
    roi_orifirst = faceimg
    padfaceimg = faceimg
    f_x_min, f_y_min, f_x_max, f_y_max = (0, 0, 0, 0)
    fpoints = []
    spoints = []
    global mod
    mod = -1
    temp = faceimg
    while True:
        frame = cvs.read()
        if frame is None:
            continue
        if camid == 1:
            frame = cv2.flip(frame, 1)
        if bfirstframe or mod > -1:
            frame = cv2.resize(faceimg, (480, 640))
            bfirstframe = True
            roi_orifirst = faceimg
            padfaceimg = faceimg
            f_x_min, f_y_min, f_x_max, f_y_max = (0, 0, 0, 0)
            fpoints = []
            spoints = []
            bFace = False
            fface = 0.0
            x_min, y_min, x_max, y_max = (0, 0, 0, 0)
        start_time = time.time()
        img_pad, img, pad = preprocess_img_pad(frame, 128)
        if bFace == False:
            result = fast_interpreter.set_input_tensor(0, img.data)
            if result != 0:
                print("face_detection_front: set_input_tensor() failed!")
            result = fast_interpreter.invoke()
            if result != 0:
                print("face_detection_front: invoke() failed!")
            raw_boxes = fast_interpreter.get_output_tensor(0)
            if raw_boxes is None:
                print("sample: face_detection_front get_output_tensor(0) failed!")
            classificators = fast_interpreter.get_output_tensor(1)
            if classificators is None:
                print("sample: face_detection_front get_output_tensor(1) failed!")
            detections = blazeface(raw_boxes, classificators, anchors)[0]
            if len(detections) > 0:
                bFace = True
        if bFace:
            for i in range(len(detections)):
                ymin = detections[i][0] * img_pad.shape[0]
                xmin = detections[i][1] * img_pad.shape[1]
                ymax = detections[i][2] * img_pad.shape[0]
                xmax = detections[i][3] * img_pad.shape[1]
                w = int(xmax - xmin)
                h = int(ymax - ymin)
                h = max(w, h)
                h = h * 1.5
                x = (xmin + xmax) / 2.
                y = (ymin + ymax) / 2.
                xmin = x - h / 2.
                xmax = x + h / 2.
                ymin = y - h / 2. - 0.08 * h
                ymax = y + h / 2. - 0.08 * h
                x_min = int(xmin)
                y_min = int(ymin)
                x_max = int(xmax)
                y_max = int(ymax)
                x_min = max(0, x_min)
                y_min = max(0, y_min)
                x_max = min(img_pad.shape[1], x_max)
                y_max = min(img_pad.shape[0], y_max)
                roi_ori = img_pad[y_min:y_max, x_min:x_max]
                roi = preprocess_image_for_tflite32(roi_ori, 192)
                result = fast_interpreter1.set_input_tensor(0, roi.data)
                if result != 0:
                    print("face_landmark: set_input_tensor() failed!")
                result = fast_interpreter1.invoke()
                if result != 0:
                    print("face_landmark: invoke() failed!")
                mesh = fast_interpreter1.get_output_tensor(0)
                if mesh is None:
                    print("sample: face_landmark get_output_tensor(0) failed!")
                stride8 = fast_interpreter1.get_output_tensor(1)
                if stride8 is None:
                    print("sample: face_landmark get_output_tensor(1) failed!")
                print(f"stride8.shape: {stride8.shape}")
                ffacetmp = stride8[0]
                bFace = False
                spoints = []
                mesh = mesh.reshape(468, 3) / 192
                if bfirstframe:
                    getkeypoint(roi_ori, mesh, fpoints)
                    roi_orifirst = roi_ori.copy()
                    temp = roi_orifirst.copy()
                    bfirstframe = False
                    padfaceimg = img_pad.copy()
                    f_x_min, f_y_min, f_x_max, f_y_max = (x_min, y_min, x_max, y_max)
                    mod = -1
                else:
                    getkeypoint(roi_ori, mesh, spoints)
                    roi_orifirst = faceswap(spoints, fpoints, roi_ori, temp)
                    if roi_orifirst is None:
                        continue
                    f_img_pad = padfaceimg.copy()
                    x_min, y_min, x_max, y_max = (f_x_min, f_y_min, f_x_max, f_y_max)
                    f_img_pad[y_min:y_max, x_min:x_max] = roi_orifirst
                    x, y = f_img_pad.shape[0] / 2, f_img_pad.shape[1] / 2
                    img_pad = f_img_pad
        shape = frame.shape
        global saveimg
        saveimg = img_pad[max(0, int(y - shape[0] / 2)):int(y + shape[0] / 2),
                          max(0, int(x - shape[1] / 2)):int(x + shape[1] / 2)]
        t = (time.time() - start_time)
        lbs = 'Fps: ' + str(int(100 / t) / 100.) + " ~~ Time:" + str(t * 1000) + "ms"
        cvs.setLbs(lbs)
        cvs.imshow(saveimg)
        time.sleep(1)
# BlazeFace face detection helper class
class BlazeFace():
    def __init__(self):
        # Number of classes
        self.num_classes = 1
        # Number of anchors
        self.num_anchors = 896
        # Number of coordinates per anchor
        self.num_coords = 16
        # Score clipping threshold
        self.score_clipping_thresh = 100.0
        # Scale factors
        self.x_scale = 128.0
        self.y_scale = 128.0
        self.h_scale = 128.0
        self.w_scale = 128.0
        # Minimum score threshold
        self.min_score_thresh = 0.75
        # Minimum suppression (IoU) threshold
        self.min_suppression_threshold = 0.3

    # Sigmoid function
    def sigmoid(self, inX):
        if inX >= 0:
            return 1.0 / (1 + np.exp(-inX))
        else:
            return np.exp(inX) / (1 + np.exp(inX))

    # Convert the raw network tensors into detections
    def tensors_to_detections(self, raw_box_tensor, raw_score_tensor, anchors):
        assert len(raw_box_tensor.shape) == 3
        assert raw_box_tensor.shape[1] == self.num_anchors
        assert raw_box_tensor.shape[2] == self.num_coords
        assert len(raw_score_tensor.shape) == 3
        assert raw_score_tensor.shape[1] == self.num_anchors
        assert raw_score_tensor.shape[2] == self.num_classes
        assert raw_box_tensor.shape[0] == raw_score_tensor.shape[0]
        # Decode the bounding boxes
        detection_boxes = self._decode_boxes(raw_box_tensor, anchors)
        thresh = self.score_clipping_thresh
        raw_score_tensor = raw_score_tensor.clip(-thresh, thresh)
        # Compute detection scores
        detection_scores = 1 / (1 + np.exp(- raw_score_tensor)).squeeze(axis=-1)
        # Filter out detections whose score is below the threshold
        mask = detection_scores >= self.min_score_thresh
        output_detections = []
        for i in range(raw_box_tensor.shape[0]):
            boxes = detection_boxes[i, mask[i]]
            scores = np.expand_dims(detection_scores[i, mask[i]], axis=-1)
            output_detections.append(np.concatenate((boxes, scores), axis=-1))
        return output_detections

    # Decode the bounding boxes relative to the anchors
    def _decode_boxes(self, raw_boxes, anchors):
        boxes = np.zeros(raw_boxes.shape)
        x_center = raw_boxes[..., 0] / self.x_scale * anchors[:, 2] + anchors[:, 0]
        y_center = raw_boxes[..., 1] / self.y_scale * anchors[:, 3] + anchors[:, 1]
        w = raw_boxes[..., 2] / self.w_scale * anchors[:, 2]
        h = raw_boxes[..., 3] / self.h_scale * anchors[:, 3]
        boxes[..., 0] = y_center - h / 2.  # ymin
        boxes[..., 1] = x_center - w / 2.  # xmin
        boxes[..., 2] = y_center + h / 2.  # ymax
        boxes[..., 3] = x_center + w / 2.  # xmax
        for k in range(6):
            offset = 4 + k * 2
            keypoint_x = raw_boxes[..., offset] / self.x_scale * anchors[:, 2] + anchors[:, 0]
            keypoint_y = raw_boxes[..., offset + 1] / self.y_scale * anchors[:, 3] + anchors[:, 1]
            boxes[..., offset] = keypoint_x
            boxes[..., offset + 1] = keypoint_y
        return boxes

    # Weighted non-maximum suppression
    def weighted_non_max_suppression(self, detections):
        if len(detections) == 0:
            return []
        output_detections = []
        # Sort by score, highest first
        remaining = np.argsort(-detections[:, 16])
        while len(remaining) > 0:
            detection = detections[remaining[0]]
            first_box = detection[:4]
            other_boxes = detections[remaining, :4]
            # Compute IoU against the remaining boxes
            ious = overlap_similarity(first_box, other_boxes)
            mask = ious > self.min_suppression_threshold
            overlapping = remaining[mask]
            remaining = remaining[~mask]
            weighted_detection = detection.copy()
            if len(overlapping) > 1:
                coordinates = detections[overlapping, :16]
                scores = detections[overlapping, 16:17]
                total_score = scores.sum()
                weighted = (coordinates * scores).sum(axis=0) / total_score
                weighted_detection[:16] = weighted
                weighted_detection[16] = total_score / len(overlapping)
            output_detections.append(weighted_detection)
        return output_detections
# BlazeFace detection post-processing entry point
def blazeface(raw_output_a, raw_output_b, anchors):
    if raw_output_a.size == 896:
        raw_score_tensor = raw_output_a
        raw_box_tensor = raw_output_b
    else:
        raw_score_tensor = raw_output_b
        raw_box_tensor = raw_output_a
    assert (raw_score_tensor.size == 896)
    assert (raw_box_tensor.size == 896 * 16)
    raw_score_tensor = raw_score_tensor.reshape(1, 896, 1)
    raw_box_tensor = raw_box_tensor.reshape(1, 896, 16)
    net = BlazeFace()
    # Post-process the raw predictions
    detections = net.tensors_to_detections(raw_box_tensor, raw_score_tensor, anchors)
    # Non-maximum suppression
    filtered_detections = []
    for i in range(len(detections)):
        faces = net.weighted_non_max_suppression(detections[i])
        if len(faces) > 0:
            faces = np.stack(faces)
        filtered_detections.append(faces)
    return filtered_detections
# Convert detection results back to original-image coordinates
def convert_to_orig_points(results, orig_dim, letter_dim):
    inter_scale = min(letter_dim / orig_dim[0], letter_dim / orig_dim[1])
    inter_h, inter_w = int(inter_scale * orig_dim[0]), int(inter_scale * orig_dim[1])
    offset_x, offset_y = (letter_dim - inter_w) / 2.0 / letter_dim, (letter_dim - inter_h) / 2.0 / letter_dim
    scale_x, scale_y = letter_dim / inter_w, letter_dim / inter_h
    results[:, 0:2] = (results[:, 0:2] - [offset_x, offset_y]) * [scale_x, scale_y]
    results[:, 2:4] = results[:, 2:4] * [scale_x, scale_y]
    results[:, 4:16:2] = (results[:, 4:16:2] - offset_x) * scale_x
    results[:, 5:17:2] = (results[:, 5:17:2] - offset_y) * scale_y
    results[:, 0:16:2] *= orig_dim[1]
    results[:, 1:17:2] *= orig_dim[0]
    return results.astype(np.int32)
# Compute IoU between one box and a set of other boxes
def overlap_similarity(box, other_boxes):
    def union(A, B):
        x1, y1, x2, y2 = A
        a = (x2 - x1) * (y2 - y1)
        x1, y1, x2, y2 = B
        b = (x2 - x1) * (y2 - y1)
        ret = a + b - intersect(A, B)
        return ret

    def intersect(A, B):
        x1 = max(A[0], B[0])
        y1 = max(A[1], B[1])
        x2 = min(A[2], B[2])
        y2 = min(A[3], B[3])
        return (x2 - x1) * (y2 - y1)

    ret = np.array([max(0, intersect(box, b) / union(box, b)) for b in other_boxes])
    return ret
if __name__ == '__main__':
    initcv(startcv, MyApp)
    process()
Sample result: