其他分享
首页 > 其他分享> > nnUNet 训练自己的数据

nnUNet 训练自己的数据

作者:互联网

环境配置

1.环境配置

conda create -n nnUNet python=3.7

conda activate nnUNet

切换下载源


#查看当前conda配置
conda config --show channels
 
#设置通道
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
 
#设置搜索是显示通道地址
conda config --set show_channel_urls yes
 
conda install pytorch torchvision cudatoolkit=10.0  # 删除安装命令最后的 -c pytorch,才会采用清华源安装。

前往下载服务器最适合的CUDA版本

PyTorch

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

2. 安装hiddenlayer

pip install --upgrade git+https://github.com/nanohanno/hiddenlayer.git@bugfix/get_trace_graph#egg=hiddenlayer

3. 下载安装nnUNet

新建nnUNetFrame文件夹,将nnUNet拉到本地

git clone https://github.com/MIC-DKFZ/nnUNet.git
cd nnUNet
pip install -e .

检查torch CUDA是否可用,返回True代表可用

python

import torch
print(torch.cuda.is_available())

4. 设置nnUNet默认文件路径进入bashrc文件 vim ~/.bashrc

export nnUNet_raw_data_base="/home/hj/nnUNetFrame/DATASET/nnUNet_raw"
export nnUNet_preprocessed="/home/hj/nnUNetFrame/DATASET/nnUNet_preprocessed"
export RESULTS_FOLDER="/home/hj/nnUNetFrame/DATASET/nnUNet_trained_models"

source ~/.bashrc 保存并生效

训练自己的模型

1. 构建自己的数据集

在nnUNetFrame文件夹下新建DATASET文件夹以及nnUNet_raw文件夹,按照需要新建自己的任务数据集,命名规则需要使用 Task××_, ××是个两位数.然后再对应的文件夹下放入自己的数据,ImagesTr和labelsTr中必须要有数据!

所有的数据需要按照要求转成固定nii.gz的格式.如下图

 2. 生成自己的dataset.json

import os
from batchgenerators.utilities.file_and_folder_operations import save_json, subfiles
from typing import Tuple
import numpy as np

def get_identifiers_from_splitted_files(folder: str):
    uniques = np.unique([i[:-7] for i in subfiles(folder, suffix='.nii.gz', join=False)])
    return uniques

def generate_dataset_json(output_file: str, imagesTr_dir: str, imagesTs_dir: str, modalities: Tuple,
                          labels: dict, dataset_name: str, license: str = "Hebut AI", dataset_description: str = "",
                          dataset_reference="oai-zib", dataset_release='11/2021'):
    """
    :param output_file: This needs to be the full path to the dataset.json you intend to write, so
    output_file='DATASET_PATH/dataset.json' where the folder DATASET_PATH points to is the one with the
    imagesTr and labelsTr subfolders
    :param imagesTr_dir: path to the imagesTr folder of that dataset
    :param imagesTs_dir: path to the imagesTs folder of that dataset. Can be None
    :param modalities: tuple of strings with modality names. must be in the same order as the images (first entry
    corresponds to _0000.nii.gz, etc). Example: ('T1', 'T2', 'FLAIR').
    :param labels: dict with int->str (key->value) mapping the label IDs to label names. Note that 0 is always
    supposed to be background! Example: {0: 'background', 1: 'edema', 2: 'enhancing tumor'}
    :param dataset_name: The name of the dataset. Can be anything you want
    :param license:
    :param dataset_description:
    :param dataset_reference: website of the dataset, if available
    :param dataset_release:
    :return:
    """
    train_identifiers = get_identifiers_from_splitted_files(imagesTr_dir)

    if imagesTs_dir is not None:
        test_identifiers = get_identifiers_from_splitted_files(imagesTs_dir)
    else:
        test_identifiers = []

    json_dict = {}
    json_dict['name'] = "Breast"
    json_dict['description'] = "Breast"
    json_dict['tensorImageSize'] = "3D"
    json_dict['reference'] = dataset_reference
    json_dict['licence'] = license
    json_dict['release'] = dataset_release
    json_dict['modality'] = {"0": "CT"}
    json_dict['labels'] = {
        "0": "background",
        "1": "Breast"
    }

    json_dict['numTraining'] = len(train_identifiers)
    json_dict['numTest'] = len(test_identifiers)
    json_dict['training'] = [
        {'image': "./imagesTr/%s.nii.gz" % i, "label": "./labelsTr/%s.nii.gz" % i} for i
        in
        train_identifiers]
    json_dict['test'] = ["./imagesTs/%s.nii.gz" % i for i in test_identifiers]

    output_file += "dataset.json"
    if not output_file.endswith("dataset.json"):
        print("WARNING: output file name is not dataset.json! This may be intentional or not. You decide. "
              "Proceeding anyways...")
    save_json(json_dict, os.path.join(output_file))


if __name__ == "__main__":


    
    output_file = r'/home/hj/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task58_Breast/'
    imagesTr_dir = r'/home/hj/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task58_Breast/imagesTr'
    imagesTs_dir = r'/home/hj/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task58_Breast/imagesTs'
    labelsTr = r'/home/hj/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task58_Breast/labelsTr'
    
    modalities = '"0": "CT"'
    labels = {
        "0": "background",
        "1": "Breast"
    }

    get_identifiers_from_splitted_files(output_file)
    generate_dataset_json(output_file,
                          imagesTr_dir,
                          imagesTs_dir,
                          labelsTr,
                          modalities,
                          labels
                          )


3.转换自己的数据集 

nnUNet_convert_decathlon_task -i /home/hj/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task58_Breast

4. 预处理数据

nnUNet_plan_and_preprocess -t 58

5.开始训练

nnUNet_train 3d_fullres nnUNetTrainerV2 58 4

nnUNet_train 3d_fullres nnUNetTrainerV2 58 4 -c 中间断了继续训练

自己的数据预测推理

nnUNet_predict -i /home/你的主机用户名/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task058_Breast/imagesTs/ -o /home/你的主机用户名/nnUNetFrame/DATASET/nnUNet_raw/nnUNet_raw_data/Task058_Breast/inferTs -t 8 -m 3d_fullres -f 4

 

标签:训练,nnUNet,dataset,raw,json,dict,conda,数据
来源: https://blog.csdn.net/qq_36720719/article/details/121821479