FDU千寻: Moz1 Garment-Folding Solution Design and Competition Write-up

Leonardo (team lead) 2026-01-21 10:06:04

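Our core goal in the Moz1 garment-folding task was not to solve generalization to "arbitrary garments" in one shot, but to first solve folding itself well enough. We defined "well" in strictly engineering terms: start from a single garment, set generalization aside for now, and achieve the highest possible success rate in the real world. With that goal, we started from a single garment and climbed step by step, rather than designing a highly complex unified solution from the outset.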

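For the solution design, we aligned with the competition's three levels (a single T-shirt, multiple garment types, garments in arbitrary configurations) and built capabilities level by level, with VLA (Vision-Language-Action) based folding as the core base model. Early on we focused on controllable tasks: we collected real-world data and folding demonstrations for T-shirts and long-sleeved garments in an unfolded state, building a closed folding-data loop that gave later model training and on-robot validation a stable foundation. In parallel, we explored generating and modeling 3D garment meshes and introduced wrinkle detection to characterize garment state, which gives a clearer structural signal for judging whether a garment is foldable.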

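As the experiments progressed, we found that the main failures in garment folding often do not occur in the folding motion itself. They stem from an uncontrolled initial garment state, severe self-occlusion, or excessive wrinkles. We therefore considered an affordance-based garment flatten approach: starting from a garment in an arbitrary configuration, identify the manipulable regions and stretch the garment into a standard, foldable state. Within the limited competition period, however, we have so far completed only preliminary validation of garment wrinkle detection and of folding a single flattened garment. In future work we hope to validate the garment flatten trigger logic.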
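Under simple assumptions, a flatten trigger driven by wrinkle detection could look like the sketch below. This is not the team's implementation: the height-map input, the `wrinkle_score` heuristic, the `next_skill` helper, and the 5 mm threshold are all hypothetical choices made for illustration.

```python
from typing import Sequence


def wrinkle_score(height_map: Sequence[Sequence[float]]) -> float:
    """Mean absolute height difference (metres) between neighbouring cells.

    A flat garment gives a score near 0; wrinkles raise the score.
    """
    rows, cols = len(height_map), len(height_map[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                total += abs(height_map[r][c + 1] - height_map[r][c])
                count += 1
            if r + 1 < rows:  # vertical neighbour
                total += abs(height_map[r + 1][c] - height_map[r][c])
                count += 1
    return total / count if count else 0.0


def next_skill(height_map: Sequence[Sequence[float]], threshold: float = 0.005) -> str:
    """Pick 'flatten' when the garment looks too wrinkled, else 'fold'.

    The threshold is an assumed value and would need tuning on real data.
    """
    return "flatten" if wrinkle_score(height_map) > threshold else "fold"


# Toy inputs: a perfectly flat patch and a checkerboard of 2 cm bumps.
flat = [[0.0] * 4 for _ in range(4)]
crumpled = [[0.02 * ((r + c) % 2) for c in range(4)] for r in range(4)]
print(next_skill(flat), next_skill(crumpled))
```

A real system would more likely compute such a score from a depth image or a learned wrinkle detector, but the decision structure (score, threshold, skill choice) stays the same.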

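On the implementation side, we collected data for short-sleeved T-shirts in three colors, trained and tested an XVLA folding model, and combined it with GR00T N1.5 and Motus to improve action generation and control stability. Under the current setup, the system achieves a high success rate on the single-garment task, which also confirms that the route of "first converge on one reliable sub-problem" is viable.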
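When a success rate comes from a limited number of real-robot trials, an interval estimate can be more informative than the raw ratio. The sketch below computes a Wilson score interval. The trial counts in the usage line are made up for illustration and are not results from this post.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial success rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical example: 18 successful folds out of 20 attempts.
low, high = wilson_interval(18, 20)
print(f"success rate 90%, 95% interval: {low:.2f} to {high:.2f}")
```

With only a few dozen trials the interval stays wide, so comparing two policies on raw success rate alone can be misleading.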

We fine-tuned the Motus model on T-shirt folding, tested it, and open-sourced the checkpoints: https://huggingface.co/Star-UU-Wang/motus_moz1
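One plausible way to fetch the released checkpoints is the `snapshot_download` function from the `huggingface_hub` package, sketched below. It assumes `huggingface_hub` is installed. The helper names and the `local_dir` default are hypothetical, and loading the weights into Motus is left out because the repository's file layout is not described here.

```python
from urllib.parse import urlparse

CHECKPOINT_URL = "https://huggingface.co/Star-UU-Wang/motus_moz1"


def repo_id_from_url(url: str) -> str:
    """Extract the 'owner/name' repo id from a Hugging Face model URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model repo URL: {url}")
    return "/".join(parts[:2])


def download_checkpoint(url: str = CHECKPOINT_URL, local_dir: str = "./motus_moz1") -> str:
    """Download every file in the repo and return the local folder path."""
    # Imported lazily so the URL helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(url), local_dir=local_dir)
```

Calling `download_checkpoint()` with the defaults would place the files under `./motus_moz1`; the download itself requires network access.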

Below are some of the papers and technical approaches we collected and discussed during our research:
Methods specific to cloth manipulation

| Project | Paper | Code | Venue/Year | Data | Sim/Real | Overview | Category |
|---|---|---|---|---|---|---|---|
| BiFold: Bimanual Cloth Folding with Language Guidance | https://arxiv.org/pdf/2505.07600 | https://github.com/Barbany/bifold | ICRA 2025 | VR-Folding | Sim | Language-guided folding using ViT, CLOTH3D rendering | VLA |
| General-purpose Clothes Manipulation with Semantic Keypoints | | | ICRA 2025 | SoftGym, CLOTH3D | Sim | Keypoints + LLM planning + action primitives (folding included) | VLA |
| APS-Net | https://arxiv.org/abs/2506.22769 | | arXiv 2025 | Collected demos | Real | Standardization + folding pipeline with reward shaping | VA |
| FoldNet | https://arxiv.org/pdf/2505.09109 | | arXiv 2025 | Synthetic demonstrations | Sim | Keypoint-driven folding policy from templates | VA |
| MetaFold | | | 2025.03 | MetaFold Dataset, DiffClothAI | Sim | LLM + CVAE for trajectory generation between keypoints | VLA |
| SSFold | https://arxiv.org/abs/2411.02608 | | arXiv 2024 | Human demo (sim to real) | Real | Graph dynamics + folding generalization | VA |
| UniFolding | | | CoRL 2023 | | Real | Sample-efficient, scalable folding | VA |
| Foldsformer | https://arxiv.org/abs/2301.03003 | | arXiv 2023 | | Sim → Real | Space-time transformer for multi-step folding | VLA |
| Dual-arm Hem Folding | | | ROBOMECH 2023 | | Real | Four-step hem folding with real clothes | VA |
| SpeedFolding | https://arxiv.org/abs/2208.10552 | | IROS 2022 | Collected from 4300+ actions | Real | Efficient dual-arm folding pipeline | VA |
| Keypoints from Synthetic Data for Cloth Folding | https://arxiv.org/abs/2205.06714 | | ICRA Workshop 2022 | Synthetic | Real | CNN keypoint detector for towel folding | VA |
| 1hr Real RL Fabric Folding | https://proceedings.mlr.press/v155/lee21a/lee21a.pdf | | CoRL 2020 (PMLR v155) | | Real | Self-supervised RL in 1hr for goal-conditioned folding | RL |
| Gravity-Based Robotic Cloth Folding | | | 2010 | | Real | Classic geometric g-fold algorithm | VA |
| FabricFolding: Learning Efficient Fabric Folding without Expert Demonstrations | https://arxiv.org/abs/2303.06587 | | Robotica 2024 | Fabric Keypoint Dataset (~1800 RGB-D images) | Real | Dual-stage: unfold with hybrid actions, then fold using keypoint heuristic; 88–92% success; no expert demos | |
| GarmNet: Improving Global with Local Perception for Robotic Laundry Folding | https://arxiv.org/abs/1907.00408 | | 2019 | RGB-D | Real | Landmark detection and global context model for folding | VA |
| Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation | https://arxiv.org/abs/2403.00213 | | 2025.03 | | Sim | Diffusion model predicts cloth dynamics; includes folding tasks; generative state estimator | VA |
| Dynamic Cloth Folding Using Curriculum Learning | https://www.researchgate.net/publication/378427985 | | 2023 | | Sim | Trains folding agents using curriculum RL; handles long-sleeved garments in simulation | RL |
| π₀: A Vision-Language-Action Flow Model for General Robot Control | https://arxiv.org/abs/2410.24164 | | 2024.10 | OXE + 7 robots | Real | Flow-matching + PaliGemma VLM; folding included in task suite | VLA |
| π₀.₅: Open-world Generalization for Vision-Language-Action Robots | https://arxiv.org/abs/2504.16054 | | 2025.05 | Multi-robot + web data | Real | Successor to π₀; general-purpose VLA model including household tasks | VLA |
| Learning Visual Feedback Control for Dynamic Cloth Folding (IROS 2022) | https://arxiv.org/abs/2109.04771 | https://github.com/hietalajulius/dynamic-cloth-folding | 2021.09 | Sim + Real (Franka, D435) | Real | RL-based visual feedback control; trained in sim, transferred to real; dynamic square cloth folding | RL |
| AdaFold: Adapting Folding Trajectories via Feedback-loop Manipulation | https://arxiv.org/abs/2403.06210 | https://github.com/albiLo17/Adafold | 2024.03 | Sim + Real | Real | MPC controller adapts cloth folding; uses semantic features & point cloud feedback; generalizes to new cloths | VA |

Finally, we look forward to meeting all the teams in Dongguan. Embodied intelligence is only possible with you!

7 replies
7k1k (team member) 01-23 17:18
Thanks for sharing

Adnachiel03 01-22 20:14
Thanks for sharing

fanna123123 (team member) 01-22 20:07
Awesome

江淼98 (team member) 01-22 20:05
Thanks for sharing

thucyx (team member) 01-22 17:22
Very helpful!

fanna123123 (team member) 01-22 15:24
Awesome

2301_80744354 (teaching assistant) 01-21 10:23
Thanks for sharing!
