Main Problems Encountered When Training Mask R-CNN

2021-10-20 18:31:20 Author: Internet

Content is from another author. Original link: https://blog.csdn.net/weixin_45209827/article/details/115963340

1. Dataset
Training failed with: TypeError: Argument 'bb' has incorrect type (expected numpy.ndarray, got list)

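Some people online say this is caused by an outdated coco package, or that the number of data points must be even. I checked my pycocotools version, which was already 2.0.0, so the version was not the problem. I then inspected the JSON file and found that the code I was using can only train on annotations drawn as polygons; every other annotation type triggers this error. After I re-labeled the dataset, the problem was solved.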
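One way to find the offending labels before training is to scan the annotation file. The sketch below assumes a COCO-style JSON with an `annotations` list whose entries carry `id` and `segmentation` fields; the function names and the file name in the usage note are made up for illustration. It flags every annotation whose segmentation is not a list of polygons with at least three points (six numbers) and an even number of coordinates.

```python
import json


def find_non_polygon_annotations(coco):
    """Return ids of annotations that are not plain polygon segmentations.

    `coco` is assumed to be a dict loaded from a COCO-style JSON file.
    """
    bad_ids = []
    for ann in coco.get("annotations", []):
        seg = ann.get("segmentation")
        # Polygons are stored as a non-empty list of flat [x1, y1, x2, y2, ...] lists.
        is_polygon_list = isinstance(seg, list) and len(seg) > 0 and all(
            isinstance(poly, list)
            and len(poly) >= 6       # at least three points
            and len(poly) % 2 == 0   # x and y come in pairs
            for poly in seg
        )
        if not is_polygon_list:
            bad_ids.append(ann.get("id"))
    return bad_ids


def find_in_file(path):
    """Load a JSON annotation file and report suspicious annotation ids."""
    with open(path, "r", encoding="utf-8") as f:
        return find_non_polygon_annotations(json.load(f))
```

Calling `find_in_file("annotations.json")` (a hypothetical path) returns the ids to re-label. Two-point rectangles, RLE dictionaries, and empty segmentations are all reported.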
Also:
The error ERROR:root:Error processing image {'annotations': [ 'bbox': […] may likewise come from the dataset, or from incorrect class settings, which then need to be corrected.

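2. Error12: cannot allocate memory
This error appeared during training. Memory had probably reached its limit, although the server itself kept running.
Check the memory usage. For reference, see: Error12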
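For a quick look at memory from Python, here is a minimal sketch that assumes a Linux server exposing `/proc/meminfo`. The function names are my own, and the parser only reads the leading number of each field.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_fraction(info):
    """Fraction of RAM the kernel reports as available, or None if unknown."""
    total = info.get("MemTotal")
    available = info.get("MemAvailable")
    if not total or available is None:
        return None
    return available / total


def read_meminfo(path="/proc/meminfo"):
    """Read memory statistics on Linux; returns {} where the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return parse_meminfo(f.read())
```

Printing `available_fraction(read_meminfo())` before and during training shows how close the machine is to the limit.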

3. RuntimeError: It looks like you are subclassing Model and you forgot to call super(YourClass, self).__init__().
This error occurred when training with multiple GPUs.
In the ParallelModel class in parallel_model.py, add a call to the base initializer at the start of __init__, replacing the YourClass placeholder from the message with the real class name: super(ParallelModel, self).__init__()

    def __init__(self, keras_model, gpu_count):
        """
        Class constructor.
        :param keras_model: The Keras model to parallelize
        :param gpu_count: number of GPUs; when it is greater than 1, this
            class is used to enable multi-GPU training
        """
        super(ParallelModel, self).__init__()  # added line
        self.inner_model = keras_model
        self.gpu_count = gpu_count
        merged_outputs = self.make_parallel()
        super(ParallelModel, self).__init__(inputs=self.inner_model.inputs,
                                            outputs=merged_outputs)
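To see why the order of the calls matters, here is a toy illustration. It assumes the base class rejects attribute assignment until its initializer has run, which is what the error message implies. `ToyModel`, `BrokenParallel` and `FixedParallel` are made-up stand-ins, not the real Keras classes.

```python
class ToyModel:
    """Toy stand-in for a Keras-like base class (not the real keras.Model)."""

    def __init__(self, inputs=None, outputs=None):
        # Bypass our own __setattr__ check while initializing.
        object.__setattr__(self, "_initialized", True)
        self.inputs = inputs
        self.outputs = outputs

    def __setattr__(self, name, value):
        # Refuse attribute assignment until the base initializer has run.
        if not self.__dict__.get("_initialized", False):
            raise RuntimeError("forgot to call super().__init__()")
        object.__setattr__(self, name, value)


class BrokenParallel(ToyModel):
    def __init__(self, inner):
        self.inner_model = inner  # assigned before the base initializer: fails
        super().__init__(inputs="x", outputs="y")


class FixedParallel(ToyModel):
    def __init__(self, inner):
        super().__init__()        # initialize the base class first
        self.inner_model = inner  # now the assignment is accepted
        super().__init__(inputs="x", outputs="y")
```

Constructing `BrokenParallel("m")` raises RuntimeError, while `FixedParallel("m")` succeeds. That mirrors the fix above, where an empty base-initializer call comes before the first attribute assignment.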

4. Stuck at epoch 1
Set workers to 1, or check whether the images are too large.

        self.mask_model.keras_model.fit_generator(generator=train_generator,
                                                  initial_epoch=self.epoch,
                                                  epochs=epochs,
                                                  steps_per_epoch=cfg.TRAIN.STEPS_PER_EPOCH,
                                                  callbacks=callbacks,
                                                  validation_data=val_generator,
                                                  validation_steps=cfg.TRAIN.VALIDATION_STEPS,
                                                  max_queue_size=100,
                                                  workers=workers,
                                                  use_multiprocessing=True,
                                                  )
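A small sketch of keeping the data-loading settings in one place, so that switching to a single worker is a one-line change. The helper name is hypothetical. Turning off `use_multiprocessing` and shrinking the queue in the conservative case are my own assumptions, not part of the original fix, and the non-conservative numbers are illustrative only.

```python
def data_loading_kwargs(conservative=True):
    """Build the data-loading arguments passed to fit_generator.

    conservative=True gives the single-worker setup suggested above;
    the other values are illustrative, not tuned.
    """
    if conservative:
        return {"workers": 1, "use_multiprocessing": False, "max_queue_size": 10}
    return {"workers": 4, "use_multiprocessing": True, "max_queue_size": 100}


# Hypothetical call site (model and generators are defined elsewhere):
# model.fit_generator(generator=train_generator, epochs=epochs,
#                     **data_loading_kwargs(conservative=True))
```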

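5. Program hangs
Training stalls at some epoch: the program reports no error and is still running, but training does not advance. If changing workers does not help, check the GPU utilization.

I found that the GPU had stopped doing any work. Switching Keras to version 2.1.6 resolved the hang.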
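GPU utilization can also be polled from a script instead of watching it by hand. This sketch assumes an NVIDIA GPU with `nvidia-smi` on the PATH; the function names are my own. A utilization that stays at 0 while the training process is alive matches the hang described above.

```python
import shutil
import subprocess

QUERY = ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV rows into a list of dicts (one per GPU)."""
    gpus = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 4:
            continue  # skip anything that is not a full data row
        try:
            index, util, used, total = (int(f) for f in fields)
        except ValueError:
            continue  # skip rows with non-numeric values such as [N/A]
        gpus.append({"index": index, "util_percent": util,
                     "mem_used_mib": used, "mem_total_mib": total})
    return gpus


def query_gpus():
    """Run nvidia-smi if it is installed; return [] otherwise."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```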

6. FutureWarning: Input image dtype is bool

Error message:

E:\Anaconda3\install\envs\wind_202104\lib\site-packages\skimage\transform\_warps.py:830: FutureWarning: Input image dtype is bool. Interpolation is not defined with bool data type. Please set order to 0 or explicitely cast input image to another data type. Starting from version 0.19 a ValueError will be raised instead of this warning.
  order = _validate_interpolation_order(image.dtype, order)

Fix:

Pin the package version: scikit-image==0.16.2

7. Pay attention to the versions of other libraries, such as numpy. Versions that are too new or too old both cause errors.
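A short sketch for checking installed versions against the pins mentioned in this post. The distribution names in `EXPECTED` are assumptions about how the packages are registered in your environment, and the function names are my own; extend the table with whatever versions your code base needs.

```python
from importlib import metadata

# Pins mentioned in this post; adjust to whatever your code base needs.
EXPECTED = {
    "Keras": "2.1.6",
    "scikit-image": "0.16.2",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Running `find_mismatches(EXPECTED)` at the top of a training script gives an early hint when the environment has drifted.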

Source: https://www.cnblogs.com/leafchen/p/15430149.html