Computer Vision (CV) Learning Outline

Artificial Intelligence 2025-12-25 14:35:16

Computer Vision (CV) Learning Outline (Excel-ready version, covering recent techniques and a full resource list)

Learning stage	Core module	Knowledge points (including 2024-2025 techniques)	Learning websites (with URLs)	GitHub projects	Video courses (with URLs)
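Foundation stage (2-3 months)	Math and programming fundamentals	1. Core math: linear algebra (matrix operations, eigendecomposition, image transforms), probability (Bayes' theorem, Gaussian distribution), calculus (gradients and gradient descent) 2. Role: underpins the derivation and optimization of CV algorithms	1. 3Blue1Brown (https://www.3blue1brown.com/) 2. Khan Academy (https://www.khanacademy.org/) 3. MIT OpenCourseWare (https://ocw.mit.edu/)	1. https://github.com/kenjihiranabe/The-Art-of-Linear-Algebra 2. https://github.com/ashishpatel26/Mathematics-for-Machine-Learning	1. Bilibili, Essence of Linear Algebra (https://www.bilibili.com/video/BV1ys411472E/) 2. MIT linear algebra open course (https://www.youtube.com/playlist?list=PLUl4u3cNGP63gFHB6xb-kVBiQHYe_4hSi) 3. China University MOOC, probability and statistics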
		1. Python core: basic syntax, functions, object-oriented programming, file I/O 2. CV relevance: handling image data efficiently and calling algorithm libraries	1. Python official documentation (https://www.python.org/doc/) 2. Runoob Python tutorial (https://www.runoob.com/python/python-tutorial.html) 3. W3Schools (https://www.w3schools.com/python/)	1. https://github.com/TheAlgorithms/Python 2. https://github.com/jakevdp/PythonDataScienceHandbook	1. Bilibili, Atguigu Python basics (https://www.bilibili.com/video/BV1eW411t7rd/) 2. imooc, Python from beginner to advanced (https://www.imooc.com/course/list?c=python) 3. Coursera, Python for Everybody (https://www.coursera.org/specializations/python)
		1. Essential CV libraries: NumPy (n-dimensional array operations), Matplotlib/Seaborn (visualization), scikit-image (basic image processing) 2. Hands-on essentials: reading images, pixel operations, plotting histograms	1. NumPy documentation (https://numpy.org/doc/) 2. Matplotlib (https://matplotlib.org/stable/index.html) 3. scikit-image documentation (https://scikit-image.org/docs/stable/)	1. https://github.com/numpy/numpy/tree/main/examples 2. https://github.com/matplotlib/matplotlib 3. https://github.com/scikit-image/scikit-image	1. Bilibili, companion walkthrough of the Python Data Science Handbook 2. imooc, NumPy in practice (https://www.imooc.com/course/list?c=data&keyword=NumPy)
		1. Development tools: Git/GitHub (version control), Linux command line (environment deployment), Anaconda (environment management) 2. CV engineering: managing multiple model versions, cross-platform deployment	1. GitHub Learning Lab (https://lab.github.com/) 2. Linuxidc (https://www.linuxidc.com/) 3. Anaconda documentation (https://docs.anaconda.com/)	1. https://github.com/git-guides 2. https://github.com/justjavac/free-programming-books-zh_CN#linux	1. Bilibili, Git for absolute beginners (https://www.bilibili.com/video/BV1FE411P7B3/) 2. Linux command line hands-on tutorial (https://www.bilibili.com/video/BV1tK4y1C7Bp/)
	Introduction to CV	1. Core concepts: definition, history (from traditional CV to deep-learning CV), application areas (security, healthcare, autonomous driving) 2. Technical schools: hand-crafted features vs. end-to-end deep learning	1. Jiqizhixin (https://www.jiqizhixin.com/) 2. Xinzhiyuan (https://www.zhidx.com/) 3. CS231n (https://cs231n.stanford.edu/)	1. https://github.com/owainlewis/awesome-artificial-intelligence#computer-vision 2. https://github.com/jbhuang0604/awesome-computer-vision	1. Bilibili, a brief history of computer vision (https://www.bilibili.com/search?keyword=%E8%AE%A1%E7%AE%97%E6%9C%BA%E8%A7%86%E8%A7%89%E5%8F%91%E5%B1%95%E7%AE%80%E5%8F%B2) 2. Stanford CS231n 2025 introduction (https://www.youtube.com/playlist?list=PLoROMvodv4rPzL967YSU5s9pS0w0LwL6j)
		1. Image basics: pixels, resolution, color spaces (RGB / grayscale / HSV), image formats (JPG/PNG/BMP) 2. Preprocessing: denoising, scaling, cropping, rotation	1. OpenCV (https://opencv.org/) 2. Pillow documentation (https://pillow.readthedocs.io/)	1. https://github.com/opencv/opencv/tree/master/samples/python 2. https://github.com/python-pillow/Pillow	1. Bilibili, OpenCV for absolute beginners (https://www.bilibili.com/video/BV1PY411e7J6/) 2. imooc, image preprocessing in practice
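Traditional CV stage (2-3 months)	Hand-crafted feature extraction	1. Classic descriptors: SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features, a faster SIFT-style descriptor), ORB (binary features, free and open source) 2. Applications: image matching, object localization, panorama stitching	1. OpenCV feature extraction documentation (https://docs.opencv.org/4.x/da/df5/tutorial_py_sift_intro.html) 2. Zhihu CV topic (https://www.zhihu.com/topic/19551241)	1. https://github.com/opencv/opencv_contrib/tree/master/modules/xfeatures2d 2. https://github.com/alexanderskulikov/siftgpu	1. Bilibili, SIFT/ORB algorithms explained 2. Coursera, traditional CV specialization (https://www.coursera.org/specializations/computer-vision)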
		1. Edge detection: Canny (multi-stage edge extraction), Sobel (gradient-based edges) 2. Contour analysis: contour extraction, area computation, shape matching	1. OpenCV edge detection documentation (https://docs.opencv.org/4.x/da/d22/tutorial_py_canny.html) 2. Capital of Statistics, CV tag (https://cosx.org/tag/computer-vision/)	1. https://github.com/opencv/opencv/tree/master/samples/python/tutorial_code/ImgTrans/CannyDetector_Demo 2. https://github.com/topics/contour-detection	1. Bilibili, Canny edge detection in practice (https://www.bilibili.com/video/BV1qK4y1N7VQ/) 2. NetEase Cloud Classroom, traditional CV feature extraction (https://study.163.com/course/introduction.htm?courseId=1209487818)
	Traditional object detection and segmentation	1. Object detection: Haar cascade classifiers (face detection), HOG+SVM (pedestrian detection) 2. Limitation: poor robustness in complex scenes	1. OpenCV Haar cascade documentation (https://docs.opencv.org/4.x/d2/d99/tutorial_js_face_detection.html) 2. HOG paper by Dalal and Triggs, CVPR 2005 (https://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf)	1. https://github.com/opencv/opencv/tree/master/data/haarcascades 2. https://github.com/arnaud-m/hogsvm	1. Bilibili, Haar face detection in practice (https://www.bilibili.com/video/BV1sb411C7Vh/) 2. YouTube, HOG+SVM tutorial (https://www.youtube.com/watch?v=5sGnmkxU79o)
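		1. Image segmentation: thresholding (binarization), region growing, watershed algorithm 2. Applications: lesion segmentation in medical images, industrial defect detection	1. OpenCV segmentation documentation (https://docs.opencv.org/4.x/d3/db4/tutorial_py_watershed.html) 2. medRxiv, medical imaging preprints (https://www.medrxiv.org/topic/category/medical-imaging)	1. https://github.com/opencv/opencv/tree/master/samples/python/tutorial_code/ImgTrans/Watershed_Demo 2. https://github.com/topics/medical-image-segmentation	1. Bilibili, watershed algorithm in practice 2. imooc, traditional image segmentation (https://www.imooc.com/course/list?c=data&keyword=%E5%9B%BE%E5%83%8F%E5%88%86%E5%89%B2)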
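Core deep-learning CV stage (3-4 months)	CNN fundamentals and classic models	1. CNN building blocks: convolutional layers (feature extraction), pooling layers (downsampling), activation functions (ReLU/Sigmoid), fully connected layers (classification) 2. Classic models: LeNet-5 (handwritten digit recognition), AlexNet (a milestone for deep learning in CV), VGG (stacked small convolution kernels)	1. PyTorch Conv2d documentation (https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) 2. TensorFlow CNN tutorial (https://www.tensorflow.org/tutorials/images/cnn)	1. https://github.com/pytorch/vision/tree/main/torchvision/models 2. https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py	1. Bilibili, CNN principles visualized (https://www.bilibili.com/video/BV17K4y1V7bR/) 2. Stanford CS231n CNN lectures (https://www.youtube.com/playlist?list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv)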
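		1. Lightweight and high-performance models: ResNet (residual connections that ease the training of very deep networks), MobileNet (depthwise separable convolutions), EfficientNet (compound scaling) 2. 2025 updates: MobileNetV4 (higher accuracy with lower latency), EfficientNetV3 (adapted to edge devices)	1. MobileNet reference on arXiv (https://arxiv.org/abs/2404.19028) 2. EfficientNet repository (https://github.com/google/automl/tree/master/efficientnet)	1. https://github.com/pytorch/vision/tree/main/torchvision/models/mobilenetv4.py 2. https://github.com/google/automl/tree/master/efficientnetv3	1. Bilibili, ResNet residual connections explained (https://www.bilibili.com/video/BV1T7411T7wa/) 2. imooc, lightweight CNNs in practice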
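	Deep-learning object detection	1. One-stage detectors: YOLO family (YOLOv8/YOLOv9, current as of 2024; real-time detection), SSD (multi-scale detection) 2. Two-stage detectors: Faster R-CNN (high-accuracy detection), Mask R-CNN (detection plus segmentation)	1. Ultralytics YOLO documentation (https://docs.ultralytics.com/) 2. Faster R-CNN paper (https://arxiv.org/abs/1506.01497)	1. https://github.com/ultralytics/ultralytics (YOLOv8/v9) 2. https://github.com/facebookresearch/maskrcnn-benchmark	1. Bilibili, YOLOv9 hands-on tutorial 2. Coursera, object detection course (https://www.coursera.org/learn/object-detection)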
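		1. 2025 frontier: DETR (Transformer-based detection without anchor boxes), YOLO-World (open-vocabulary detection driven by text prompts, with no per-category annotation needed for new classes) 2. Technical advances: cross-category detection, zero-shot generalization	1. DETR repository (https://github.com/facebookresearch/detr) 2. YOLO-World paper (https://arxiv.org/abs/2401.18460)	1. https://github.com/facebookresearch/detr 2. https://github.com/AILab-CVC/YOLO-World	1. Bilibili, DETR principles and practice 2. YouTube, YOLO-World tutorial (https://www.youtube.com/watch?v=0X20w8VnUx4)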
	Image segmentation and semantic understanding	1. Semantic segmentation: U-Net (a widely used baseline in medical imaging), DeepLabv3+/v4 (2024 version; atrous convolution plus attention) 2. Instance segmentation: Mask R-CNN, YOLACT (real-time instance segmentation)	1. U-Net paper (https://arxiv.org/abs/1505.04597) 2. DeepLab repository (https://github.com/tensorflow/models/tree/master/research/deeplab)	1. https://github.com/milesial/Pytorch-UNet 2. https://github.com/tensorflow/models/tree/master/research/deeplab	1. Bilibili, U-Net medical image segmentation in practice (https://www.bilibili.com/video/BV13a411p7Ej/) 2. imooc, DeepLabv4 in practice (https://www.imooc.com/course/list?c=data&keyword=DeepLabv4)
		1. Panoptic segmentation: Panoptic FPN (semantic plus instance segmentation) 2. Video segmentation: ViViT-UNet (temporal semantic segmentation of video)	1. Panoptic FPN paper (https://arxiv.org/abs/1901.02446) 2. ViViT-UNet reference (https://arxiv.org/abs/2402.03300)	1. https://github.com/facebookresearch/detectron2/tree/main/projects/PanopticFPN 2. https://github.com/topics/video-semantic-segmentation	1. Bilibili, panoptic segmentation in practice 2. Zhejiang University video segmentation course (https://www.icourse163.org/course/ZJU-1003377027)
Frontier CV stage (3-4 months)	Multimodal CV	1. Core techniques: CLIP (image-text cross-modal matching), BLIP-2 (multimodal generation), Flamingo (large vision-language model) 2. Latest as of 2025: Qwen-VL (Tongyi Qianwen multimodal model), GPT-4V (image understanding)	1. CLIP (https://openai.com/research/clip) 2. BLIP-2 paper (https://arxiv.org/abs/2301.12597)	1. https://github.com/openai/CLIP 2. https://github.com/salesforce/LAVIS/tree/main/projects/blip2	1. Bilibili, CLIP principles and practice 2. YouTube, GPT-4V image understanding tutorial (https://www.youtube.com/watch?v=Z782P1J7840)
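		1. Applications: image-text retrieval, visual question answering (VQA), image generation (text-to-image) 2. Challenges: cross-modal alignment, generalization on long-tailed data	1. LAVIS multimodal library (https://github.com/salesforce/LAVIS) 2. VQA challenge (https://visualqa.org/)	1. https://github.com/facebookresearch/FLAVA 2. https://github.com/allenai/allennlp/tree/master/allennlp/models/vision_and_language	1. Bilibili, visual question answering (VQA) in practice (https://www.bilibili.com/search?keyword=VQA%20%E8%A7%86%E8%A7%89%E9%97%AE%E7%AD%94) 2. imooc, multimodal CV in practice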
	Transformers in CV	1. Vision Transformer (ViT): splits an image into patches and feeds them to a Transformer; scaled up to 22 billion parameters in ViT-22B 2. Transformer-inspired CNNs: ConvNeXt (a pure convolutional network modernized with Transformer-style design choices, high performance)	1. ViT repository (https://github.com/google-research/vision_transformer) 2. ConvNeXt paper (https://arxiv.org/abs/2201.03545)	1. https://github.com/google-research/vision_transformer 2. https://github.com/facebookresearch/ConvNeXt	1. Bilibili, ViT principles visualized (https://www.bilibili.com/video/BV1aT411T75o/) 2. Stanford CS231n lecture on Transformers for CV (https://www.youtube.com/watch?v=TrdevFK_am4)
		1. Video Transformers: ViViT (video classification), TimeSformer (temporal attention) 2. Applications: action recognition, video summarization, action detection	1. ViViT paper (https://arxiv.org/abs/2103.15691) 2. TimeSformer repository (https://github.com/facebookresearch/TimeSformer)	1. https://github.com/google-research/vision_transformer/blob/main/vit_models/vivit.py 2. https://github.com/facebookresearch/TimeSformer	1. Bilibili, video Transformers in practice (https://www.bilibili.com/search?keyword=%E8%A7%86%E9%A2%91Transformer%20%E5%AE%9E%E6%88%98) 2. Coursera, video understanding course (https://www.coursera.org/learn/video-understanding)
	Edge AI and lightweight deployment	1. Model compression: quantization (INT8/INT4), pruning, knowledge distillation (teacher-student models) 2. 2025 tools: TensorRT (NVIDIA), ONNX Runtime (cross-platform), Ascend CANN (Huawei)	1. TensorRT (https://developer.nvidia.com/tensorrt) 2. ONNX Runtime documentation (https://onnxruntime.ai/docs/)	1. https://github.com/NVIDIA/TensorRT 2. https://github.com/microsoft/onnxruntime	1. Bilibili, TensorRT model deployment in practice (https://www.bilibili.com/video/BV1pK4115776/) 2. Huawei Ascend CV deployment tutorial (https://www.huawei.com/cn/ascend/training)
		1. Edge device targets: phones (Snapdragon AI Engine), embedded boards (Raspberry Pi / Jetson Nano), drones 2. Hands-on essentials: balancing low latency, low power, and high accuracy	1. Qualcomm Snapdragon AI (https://www.qualcomm.com/technologies/ai) 2. Jetson Nano documentation (https://developer.nvidia.com/embedded/jetson-nano-developer-kit)	1. https://github.com/Qualcomm-AI-research/AI-Edge-Examples 2. https://github.com/NVIDIA-AI-IOT/jetson-inference	1. Bilibili, Raspberry Pi CV in practice (https://www.bilibili.com/search?keyword=%E6%A0%91%E8%8E%93%E6%B4%BE%20%E8%AE%A1%E7%AE%97%E6%9C%BA%E8%A7%86%E8%A7%89) 2. YouTube, Jetson Nano object detection (https://www.youtube.com/watch?v=h56M5iUVgGs)
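Engineering and project practice stage (2-3 months)	CV projects	1. Basic projects: face detection and recognition, license plate recognition, image denoising 2. Advanced projects: object detection for autonomous driving (KITTI dataset), lesion segmentation in medical images (LIDC dataset)	1. KITTI dataset (http://www.cvlibs.net/datasets/kitti/) 2. LIDC dataset (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=1966254)	1. https://github.com/ageitgey/face_recognition (face recognition) 2. https://github.com/ultralytics/ultralytics/tree/main/examples/kitti (autonomous driving)	1. Bilibili, face detection and recognition in practice (https://www.bilibili.com/video/BV1bt411o7cV/) 2. imooc, medical image segmentation project (https://www.imooc.com/course/list?c=data&keyword=%E5%8C%BB%E5%AD%A6%E5%BD%B1%E5%83%8F%E5%88%86%E5%89%B2)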
		1. Frontier projects: multimodal visual question answering (VQA), open-vocabulary object detection (YOLO-World), AIGC image editing (Stable Diffusion + CV)	1. Hugging Face Spaces (https://huggingface.co/spaces) 2. Stable Diffusion (https://stability.ai/stable-diffusion)	1. https://github.com/huggingface/transformers/tree/main/examples/pytorch/vision-and-language 2. https://github.com/Stability-AI/generative-models	1. Bilibili, Stable Diffusion image editing in practice (https://www.bilibili.com/search?keyword=StableDiffusion%20%E5%9B%BE%E5%83%8F%E7%BC%96%E8%BE%91) 2. Coursera, CV large models in practice (https://www.coursera.org/learn/cv-large-models)
	CV engineering skills	1. Data engineering: dataset annotation (LabelImg/LabelMe), data augmentation (Albumentations), dataset splitting 2. Model training: distributed training (PyTorch Distributed), hyperparameter tuning (Optuna)	1. LabelImg (https://github.com/HumanSignal/labelImg) 2. Albumentations documentation (https://albumentations.ai/docs/)	1. https://github.com/HumanSignal/labelImg 2. https://github.com/optuna/optuna	1. Bilibili, LabelImg annotation in practice (https://www.bilibili.com/video/BV1sb411i7aT/) 2. imooc, distributed training in practice (https://www.imooc.com/course/list?c=data&keyword=%E5%88%86%E5%B8%83%E5%BC%8F%E8%AE%AD%E7%BB%83)
		1. Model deployment: Docker containerization, RESTful API development, cloud deployment (AWS/GCP) 2. Monitoring and maintenance: model performance monitoring, version iteration, troubleshooting	1. Docker (https://www.docker.com/) 2. AWS MLOps documentation (https://aws.amazon.com/cn/machine-learning/mlops/)	1. https://github.com/docker/awesome-compose 2. https://github.com/awslabs/amazon-sagemaker-examples	1. Bilibili, Docker CV deployment in practice (https://www.bilibili.com/video/BV1kv411q7Qc/) 2. AWS CV model deployment tutorial (https://www.youtube.com/watch?v=5q88JgXUAY4)
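
The foundation-stage rows list pixel operations and histograms as hands-on essentials, and the traditional-CV rows list threshold segmentation (binarization). As a first exercise, these can be sketched with NumPy alone. This is a minimal sketch under stated assumptions, not a reference implementation: the helper names (`to_grayscale`, `histogram`, `threshold`, `make_demo_image`) are hypothetical, the image is synthetic rather than read from disk, and the grayscale weights are the common ITU-R BT.601 luma coefficients.

```python
import numpy as np


def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 uint8 RGB image to uint8 grayscale (BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights  # weighted sum over the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def histogram(gray: np.ndarray) -> np.ndarray:
    """Count how many pixels take each of the 256 possible intensity values."""
    return np.bincount(gray.ravel(), minlength=256)


def threshold(gray: np.ndarray, t: int) -> np.ndarray:
    """Binarize: pixels strictly above t become 255, all others become 0."""
    return np.where(gray > t, 255, 0).astype(np.uint8)


def make_demo_image(size: int = 64) -> np.ndarray:
    """Hypothetical test image: dark background with a bright centred square."""
    img = np.full((size, size, 3), 30, dtype=np.uint8)
    q = size // 4
    img[q:3 * q, q:3 * q] = 220  # pixel operation: overwrite a region by slicing
    return img


if __name__ == "__main__":
    image = make_demo_image()
    gray = to_grayscale(image)
    hist = histogram(gray)
    mask = threshold(gray, t=128)
    print("gray levels present:", np.flatnonzero(hist))
    print("foreground pixels:", int((mask == 255).sum()))
```

In practice the synthetic image would be replaced by one loaded with a library from the table, such as Pillow, scikit-image, or OpenCV, and the fixed threshold by a data-driven one such as Otsu's method.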