TL;DR:
zeyuanyin's homework submission is excellent. If you get stuck doing the assignment yourself, you can learn from the well-written code of classmates who have already submitted, e.g.:
比如:https://github.com/zeyuanyin/OpenMMLabCamp/blob/main/homework-4/README.md
Since I was busy the past few days, I got to homework 4 rather late, only to find that a classmate in our class had already submitted it on GitHub with a very detailed write-up, which everyone can refer to: https://github.com/zeyuanyin/OpenMMLabCamp/blob/main/homework-4/README.md~
Here is my own brief summary of the homework 4 workflow:
1. First, download the following files and place them in the locations the config file expects:
https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20230130-mmseg/Dubai/__init__.py
https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20230130-mmseg/Dubai/DubaiDataset.py
The assignment: use the MMSegmentation library, write a config file, train the PSPNet semantic segmentation model, and submit evaluation metrics on the test set.
Some filenames in the Watermelon87_Semantic_Seg_Mask dataset carry extra suffixes and need to be cleaned up, for example:
Watermelon87_Semantic_Seg_Mask/img_dir/train/21746.1.jpg -> 21746.jpg
/Watermelon87_Semantic_Seg_Mask/img_dir/val/01bd15599c606aa801201794e1fa30.jpg@1280w_1l_2o_100sh.jpg -> 01bd15599c606aa801201794e1fa30.jpg
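A minimal rename sketch for the cleanup above, assuming the rule implied by the two examples (keep everything before the first dot and re-append the `.jpg` extension); the directory paths are placeholders, so adjust them to your own layout:

```python
import os

def clean_name(name: str) -> str:
    """Strip extra suffixes: keep the part before the first dot, re-append .jpg."""
    return name.split('.')[0] + '.jpg'

def rename_dir(dir_path: str) -> None:
    """Rename every file in dir_path in place using clean_name."""
    for name in os.listdir(dir_path):
        cleaned = clean_name(name)
        if cleaned != name:
            os.rename(os.path.join(dir_path, name),
                      os.path.join(dir_path, cleaned))

if __name__ == '__main__':
    # Hypothetical dataset layout; only runs if the directories exist.
    for sub in ('img_dir/train', 'img_dir/val', 'ann_dir/train', 'ann_dir/val'):
        d = os.path.join('data/Watermelon87_Semantic_Seg_Mask', sub)
        if os.path.isdir(d):
            rename_dir(d)
```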
Because my config file lives at a different path, and I modify the config parameters in a different way, my implementation differs slightly from zeyuanyin's (https://github.com/zeyuanyin/OpenMMLabCamp/blob/main/homework-4/code/modify_config.py); either one works as a reference.
My modified config:
_base_ = [
'../_base_/models/pspnet_r50-d8.py',
'../_base_/default_runtime.py', '../_base_/schedules/schedule_40k.py'
]
norm_cfg = dict(type='SyncBN', requires_grad=True)
crop_size = (256, 256)  # input image crop size; adjust for your own dataset
data_preprocessor = dict(size=crop_size)
model = dict(data_preprocessor=data_preprocessor)
dataset_type = 'DubaiDataset'  # dataset class name
data_root = 'data/Watermelon87_Semantic_Seg_Mask'  # dataset path, relative to the mmsegmentation root
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(
type='RandomResize',
scale=(2048, 1024),
ratio_range=(0.5, 2.0),
keep_ratio=True),
dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
dict(type='RandomFlip', prob=0.5),
dict(type='PhotoMetricDistortion'),
dict(type='PackSegInputs')
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', scale=(2048, 1024), keep_ratio=True),
# add loading annotation after ``Resize`` because ground truth
# does not need to do resize data transform
dict(type='LoadAnnotations'),
dict(type='PackSegInputs')
]
img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
tta_pipeline = [
dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
dict(
type='TestTimeAug',
transforms=[
[
dict(type='Resize', scale_factor=r, keep_ratio=True)
for r in img_ratios
],
[
dict(type='RandomFlip', prob=0., direction='horizontal'),
dict(type='RandomFlip', prob=1., direction='horizontal')
], [dict(type='LoadAnnotations')], [dict(type='PackSegInputs')]
])
]
train_dataloader = dict(
batch_size=6,
num_workers=2,
persistent_workers=True,
sampler=dict(type='InfiniteSampler', shuffle=True),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_prefix=dict(
img_path='img_dir/train', seg_map_path='ann_dir/train'),
pipeline=train_pipeline))
val_dataloader = dict(
batch_size=1,
num_workers=4,
persistent_workers=True,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_prefix=dict(
img_path='img_dir/val', seg_map_path='ann_dir/val'),
pipeline=test_pipeline))
test_dataloader = val_dataloader
val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU'])
test_evaluator = val_evaluator
# training schedule for 40k
train_cfg = dict(type='IterBasedTrainLoop', max_iters=40000, val_interval=200)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=200, log_metric_by_epoch=False),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(type='CheckpointHook', by_epoch=False, interval=200),
sampler_seed=dict(type='DistSamplerSeedHook'),
visualization=dict(type='SegVisualizationHook'))
model = dict(
type='EncoderDecoder',
data_preprocessor=data_preprocessor,
pretrained='open-mmlab://resnet50_v1c',
backbone=dict(
type='ResNetV1c',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
dilations=(1, 1, 2, 4),
strides=(1, 2, 1, 1),
norm_cfg=norm_cfg,
norm_eval=False,
style='pytorch',
contract_dilation=True),
decode_head=dict(
type='PSPHead',
in_channels=2048,
in_index=3,
channels=512,
pool_scales=(1, 2, 3, 6),
dropout_ratio=0.1,
        num_classes=6,  # number of classes is 6
norm_cfg=norm_cfg,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
auxiliary_head=dict(
type='FCNHead',
in_channels=1024,
in_index=2,
channels=256,
num_convs=1,
concat_input=False,
dropout_ratio=0.1,
num_classes=6,
norm_cfg=norm_cfg,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
# model training and testing settings
train_cfg=dict(),
test_cfg=dict(mode='whole'))
The main changes are the batch size and crop_size.
If you are using DubaiDataset.py, you can modify the class definition at line 9 of that file; other parameters, such as the checkpoint, can also be changed as needed.
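For orientation, a custom dataset file like DubaiDataset.py typically just registers a BaseSegDataset subclass whose METAINFO lists the class names and palette. A hedged sketch of that shape follows; the class names, palette values, and suffix below are placeholders, so use the ones from the downloaded file:

```python
# Sketch of the registration in DubaiDataset.py; classes/palette here
# are illustrative placeholders, not the real values.
from mmseg.datasets import BaseSegDataset
from mmseg.registry import DATASETS


@DATASETS.register_module()
class DubaiDataset(BaseSegDataset):
    METAINFO = dict(
        classes=('cls0', 'cls1', 'cls2', 'cls3', 'cls4', 'cls5'),
        palette=[[226, 169, 41], [132, 41, 246], [110, 193, 228],
                 [60, 16, 152], [254, 221, 58], [155, 155, 155]])

    def __init__(self, seg_map_suffix='.png', **kwargs):
        super().__init__(seg_map_suffix=seg_map_suffix, **kwargs)
```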
python tools/train.py configs/pspnet/pspnet_r50-d8_4xb2-40k_Watermelon.py
Prediction:
For this part I referred to zeyuanyin's implementation. Since visualization does not need the Runner, I commented out one line, and I changed a few file paths to match my own files; there are no other differences. For running prediction on the watermelon video, you can also refer directly to zeyuanyin's implementation, so I won't copy it here again~
from mmengine import Config
import matplotlib.pyplot as plt
import mmcv
from mmengine.runner import Runner
from mmseg.utils import register_all_modules
from mmseg.apis import init_model, inference_model, show_result_pyplot

cfg = Config.fromfile('configs/pspnet/pspnet_r50-d8_4xb2-40k_Watermelon.py')
register_all_modules(init_default_scope=False)
# runner = Runner.from_cfg(cfg)  # not needed for inference only
checkpoint_path = 'iter_18000.pth'
model = init_model(cfg, checkpoint_path, 'cuda:0')
img_path = 'test.png'
img = mmcv.imread(img_path)
result = inference_model(model, img)
# result.pred_sem_seg.data has shape (1, H, W); take the (H, W) class-index map
pred_mask = result.pred_sem_seg.data[0].cpu().numpy()
plt.imshow(pred_mask)
save_path = img_path.replace('.png', '_pred.png')
plt.savefig(save_path, bbox_inches='tight')
print('save_path: ', save_path)
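plt.imshow renders the raw class-index mask with matplotlib's default colormap. If you would rather color the mask with the dataset's own palette, you can index a palette array with the mask. A small sketch, where the palette values are illustrative assumptions (use the palette actually defined in DubaiDataset.py):

```python
import numpy as np

# Hypothetical 6-class palette, one RGB triple per class index.
PALETTE = np.array([
    [226, 169, 41],   # class 0
    [132, 41, 246],   # class 1
    [110, 193, 228],  # class 2
    [60, 16, 152],    # class 3
    [254, 221, 58],   # class 4
    [155, 155, 155],  # class 5
], dtype=np.uint8)

def colorize(mask: np.ndarray) -> np.ndarray:
    """Map an (H, W) array of class indices to an (H, W, 3) RGB image."""
    return PALETTE[mask]

# Example: a 2x3 mask covering all six classes.
demo = colorize(np.array([[0, 1, 2], [3, 4, 5]]))
```

You can then pass the colorized array to plt.imshow (or mmcv.imwrite) instead of the raw index mask.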
To sum up: besides working out the homework on your own or following the TA's walkthrough, you can also study the homework code of strong classmates, compare it with your own, and pick up some tuning experience~ shout-out to zeyuanyin~