
A small pitfall with the STL container vector and move constructors

2021-05-02 00:00:03 Author: Internet

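I have recently been writing a thread class. Since I expected to store objects of this class in an STL container later (on reflection, storing smart pointers may be the better choice; see the analysis later in this post), I explicitly declared the copy constructor and the copy assignment operator as deleted when designing the class, and then hand-wrote the move constructor and the move assignment operator, roughly as follows: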

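// The thread class probably has no need for dynamic polymorphism,
// so static polymorphism is implemented with CRTP.
#include <chrono>
#include <functional>
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
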
template<class Derived>
class ThreadBase
{
protected:
    std::unique_ptr<std::thread> thread_;
    std::string threadName_;
    bool isRunning_;

    Derived* cast()
    {
        return static_cast<Derived*>(this);
    }
    // The const overload must return a pointer to const Derived.
    const Derived* cast() const
    {
        return static_cast<const Derived*>(this);
    }
    ThreadBase(std::string name)
        : threadName_(name)
        , isRunning_(false)
    {}
    ThreadBase(const ThreadBase &) = delete;
    ThreadBase &operator=(const ThreadBase &) = delete;
    ThreadBase(ThreadBase&& rhs)
        : thread_(std::move(rhs.thread_))
        , threadName_(std::move(rhs.threadName_))
        , isRunning_(rhs.isRunning_)
    {
        std::cout << "thread base moved\n";
    }
    ThreadBase &operator=(ThreadBase&& rhs)
    {
        std::cout << "thread base move assigned\n";
        thread_ = std::move(rhs.thread_);
        threadName_ = std::move(rhs.threadName_);
        isRunning_ = rhs.isRunning_;
        return *this;
    }
    ~ThreadBase()
    {
        if (!thread_)
        {
            std::cout << "thread_ null ptr\n";
            return ;
        }
        if (thread_->joinable())
        {
            thread_->join();
        }
        else
        {
            std::cout << "thread not joinable\n";
        }
    }
    void routine()
    {
        cast()->routine();
    }
public:
    friend Derived; // lets the derived class use the base constructors; the base itself cannot be instantiated
    void start()
    {
        isRunning_ = true;
        thread_.reset(new std::thread(std::bind(&ThreadBase<Derived>::routine, this)));
    }
    void stop()
    {
        isRunning_ = false;
    }
};

std::mutex globalMutex; // for the test only: keeps the output lines from interleaving

class D : public ThreadBase<D>
{
public:
    friend ThreadBase<D>; // lets the base class call the derived class's non-public methods
    D(const std::string& name)
        : ThreadBase<D>(name)
    {}
    D(D&&) = default;

protected:
    void routine()
    {
        while (isRunning_)
        {
            {
                std::lock_guard<std::mutex> lg(globalMutex);
                std::cout << threadName_ << " is working 1s...\n";
            }
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
    }
};

The test code is as follows:

int main(int argc, const char** argv) 
{
    std::vector<D> vThreads;
    for (int i = 0; i < 5; ++i)
    {
        std::string name = "thread[$]";
        name[name.find('$')] = static_cast<char>('0' + i);
        //vThreads.push_back(D(name));
        vThreads.emplace_back(name);
        std::cout << "emplace end\n";
    }
    std::cout << "vector end\n";
    for (auto& th : vThreads)
    {
        th.start();
    }
    std::cout << "start end\n";
    std::cout << "main thread sleeping for 5 seconds\n";
    std::this_thread::sleep_for(std::chrono::seconds(5));
    for (auto& th : vThreads)
    {
        th.stop();
    }
    std::cout << "stop end\n";
    return 0;
}

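After compiling and running it, I was surprised to find that on each insertion the move constructor was called one extra time for every element inserted before. (The counts below include one move of a temporary per insertion, which matches the commented-out push_back(D(name)) line; with emplace_back(name) the new element is constructed in place, so only the relocations of existing elements would be printed.)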

$ ./a.out
thread base moved
thread_ null ptr
emplace end
thread base moved
thread base moved
thread_ null ptr
thread_ null ptr
emplace end
thread base moved
thread base moved
thread base moved
thread_ null ptr
thread_ null ptr
thread_ null ptr
emplace end
thread base moved
thread_ null ptr
emplace end
thread base moved
thread base moved
thread base moved
thread base moved
thread base moved
thread_ null ptr
thread_ null ptr
thread_ null ptr
thread_ null ptr
thread_ null ptr
emplace end

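On the fifth insertion in the for loop, the screen prints "thread base moved" five times, followed by five lines of destructor output.
This result puzzled me. In principle each insertion should move-construct only one thread object, yet the output says otherwise. So what went wrong?
Stepping through every move constructor call in gdb, I found that whenever the container already held at least one element, something odd showed up in the call stack: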
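namely a realloc step inside vector. Could it be that the container's initial capacity was too small, and growing it triggered additional moves?
I looked for related questions on StackOverflow and found that this is indeed the case. Because the vector's initial capacity is insufficient, growing it copies or moves the existing elements. So I called vector<T>::reserve before the for loop to reserve room for 5 thread objects, then compiled and ran again. The result was normal: each insertion called the move constructor only once, as expected.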
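The effect of reserve can be reproduced without any threads. The following is a minimal sketch, assuming a hypothetical move-only type `Tracked` and a helper `countMoves` (neither is part of the original code) that count move constructions while a vector is filled with emplace_back.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the thread class: move-only, counts move constructions.
struct Tracked
{
    static int moves;
    std::string name;

    explicit Tracked(std::string n) : name(std::move(n)) {}
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
    Tracked(Tracked&& rhs) noexcept : name(std::move(rhs.name)) { ++moves; }
};
int Tracked::moves = 0;

// Inserts `count` elements with emplace_back and returns how many move
// constructions happened along the way.
int countMoves(std::size_t count, bool reserveFirst)
{
    Tracked::moves = 0;
    std::vector<Tracked> v;
    if (reserveFirst)
    {
        v.reserve(count); // one allocation up front, so no reallocation later
    }
    for (std::size_t i = 0; i < count; ++i)
    {
        // Constructed in place from the string argument, no temporary Tracked.
        v.emplace_back("thread[" + std::to_string(i) + "]");
    }
    return Tracked::moves;
}
```

With `countMoves(5, true)` no element is ever moved, because emplace_back constructs in place and the capacity is already sufficient. With `countMoves(5, false)` the count is greater than zero; the exact number depends on the growth factor of the standard library in use. The push_back(D(name)) form would add one move of the temporary per insertion on top of that.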

Summary and reflections

Growth triggered by insufficient container capacity can have serious consequences:

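Take vector as an example. If push_back or emplace_back finds the capacity insufficient and triggers growth, every stored object is copied or moved into the new storage; this is what is called "reallocation". The elements are moved when the move constructor is noexcept or the type cannot be copied, and copied otherwise.
When the container holds a very large number of objects, the cost of this relocation becomes very high.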
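Whether reallocation copies or moves can be checked directly. The sketch below assumes a hypothetical copyable type `Probe` whose move constructor is noexcept or not depending on a template parameter, and a helper `relocateOnce` that forces exactly one reallocation; both names are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical probe type: copyable, and movable with a configurable noexcept.
template <bool NoexceptMove>
struct Probe
{
    static int copies;
    static int moves;

    Probe() = default;
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) noexcept(NoexceptMove) { ++moves; }
};
template <bool N> int Probe<N>::copies = 0;
template <bool N> int Probe<N>::moves = 0;

struct Counts
{
    int copies;
    int moves;
    std::size_t relocated; // number of elements present when the vector grew
};

// Fills a vector up to its capacity, then inserts one more element so that
// exactly one reallocation happens, and reports how the old elements travelled.
template <bool NoexceptMove>
Counts relocateOnce()
{
    using P = Probe<NoexceptMove>;
    std::vector<P> v;
    v.emplace_back();
    while (v.size() < v.capacity())
    {
        v.emplace_back(); // still fits, nothing is relocated
    }
    const std::size_t before = v.size();
    P::copies = 0;
    P::moves = 0;
    v.emplace_back(); // exceeds the capacity and forces a reallocation
    return Counts{P::copies, P::moves, before};
}
```

`relocateOnce<true>()` reports only moves, while `relocateOnce<false>()` reports only copies, since vector prefers the copy constructor when a move might throw. The thread class in this post is not copyable, so it is moved even though its move constructor is not declared noexcept; marking it noexcept would still be the safer habit.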
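Growth during push_back and emplace_back is not the only source of cost. Inserting or erasing an element anywhere in a vector other than at the end shifts O(n) elements, and that shifting also causes many copies or moves.
Thinking about it further, among the sequence containers only vector seems to have the reallocation problem; the others should not need to grow in this way. A deque, for instance, does not copy large numbers of elements when it grows, although inserting or erasing in the middle of a deque still shifts many elements and incurs copy/move cost.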
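The deque behaviour described above can be sketched as follows. `Item`, `DequeCosts` and `measureDeque` are hypothetical names introduced for this illustration; `Item` counts both move constructions and move assignments.

```cpp
#include <deque>

// Hypothetical element type that counts move constructions and move assignments.
struct Item
{
    static int moveOps;
    int value;

    explicit Item(int v) : value(v) {}
    Item(Item&& rhs) noexcept : value(rhs.value) { ++moveOps; }
    Item& operator=(Item&& rhs) noexcept
    {
        value = rhs.value;
        ++moveOps;
        return *this;
    }
};
int Item::moveOps = 0;

struct DequeCosts
{
    int onGrowth;       // moves caused by appending `count` elements
    int onMiddleInsert; // moves caused by one insertion in the middle
};

DequeCosts measureDeque(int count)
{
    std::deque<Item> d;
    Item::moveOps = 0;
    for (int i = 0; i < count; ++i)
    {
        d.emplace_back(i); // appending never relocates existing elements
    }
    const int onGrowth = Item::moveOps;

    Item::moveOps = 0;
    d.emplace(d.begin() + count / 2, -1); // shifts part of the sequence
    return DequeCosts{onGrowth, Item::moveOps};
}
```

For `measureDeque(1000)`, `onGrowth` is zero and `onMiddleInsert` is greater than zero. How many elements are shifted by the middle insertion is up to the implementation.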

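Therefore, when using vector, it is best to call reserve in advance and set aside enough space, so that growth does not trigger copies. If the object provides a move constructor, the cost is comparatively smaller.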
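On the other hand, storing the objects themselves in a vector does not look ideal. As I said at the beginning, if smart pointers manage the objects, the vector only needs to hold the smart pointers rather than the bare objects.
When the capacity runs out, growth then copies or moves only the smart pointers (with std::unique_ptr it is a move, since unique_ptr cannot be copied).
Especially when the managed objects are large, letting smart pointers own them and collecting those pointers in a container is cheaper.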
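A minimal sketch of the smart-pointer variant, assuming a hypothetical payload type `Worker` in place of the thread class and a hypothetical helper `workersStayPut`: the vector grows without reserve, yet the `Worker` objects are never moved and keep their addresses.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical payload type standing in for the thread class; counts moves.
struct Worker
{
    static int moves;
    std::string name;

    explicit Worker(std::string n) : name(std::move(n)) {}
    Worker(Worker&& rhs) noexcept : name(std::move(rhs.name)) { ++moves; }
};
int Worker::moves = 0;

// Grows a vector of unique_ptr<Worker> without reserve() and reports whether
// every Worker is still at the address it was created at and was never moved.
bool workersStayPut(std::size_t count)
{
    Worker::moves = 0;
    std::vector<std::unique_ptr<Worker>> workers;
    std::vector<const Worker*> created;
    for (std::size_t i = 0; i < count; ++i)
    {
        workers.push_back(
            std::unique_ptr<Worker>(new Worker("thread[" + std::to_string(i) + "]")));
        created.push_back(workers.back().get());
    }
    for (std::size_t i = 0; i < count; ++i)
    {
        if (workers[i].get() != created[i])
        {
            return false; // would mean a Worker changed address
        }
    }
    return Worker::moves == 0; // only the pointers were relocated
}
```

Stable addresses matter for the thread class in particular: start() binds the running thread to `this`, so an object that is relocated after start() would leave the thread with a dangling pointer. Holding the objects through std::unique_ptr avoids that even if the vector grows later.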
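Finally, if the bare objects really must be stored, I think std::list is the better choice. It is also a sequence container, but it is a linked list: there is no notion of a fixed capacity, and adding an element never relocates the existing ones. When the number of objects to manage is not known in advance, std::list avoids the relocation cost, at the price of one allocation per node. However, std::list does not provide random access through operator[], so some trade-offs have to be made for the problem at hand.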
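The std::list behaviour can be checked the same way. This is a sketch under the assumption of a hypothetical move-only type `Job` and a hypothetical helper `listNeverRelocates`; it appends elements and verifies that none of them is moved or changes address.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical move-only element type; counts move constructions.
struct Job
{
    static int moves;
    std::string name;

    explicit Job(std::string n) : name(std::move(n)) {}
    Job(Job&& rhs) noexcept : name(std::move(rhs.name)) { ++moves; }
};
int Job::moves = 0;

// Appends `count` elements to a std::list and reports whether every element
// stayed at its original address without a single move construction.
bool listNeverRelocates(std::size_t count)
{
    Job::moves = 0;
    std::list<Job> jobs;
    std::vector<const Job*> created;
    for (std::size_t i = 0; i < count; ++i)
    {
        jobs.emplace_back("job" + std::to_string(i)); // each node is allocated separately
        created.push_back(&jobs.back());
    }
    std::size_t i = 0;
    for (const Job& job : jobs)
    {
        if (&job != created[i++])
        {
            return false;
        }
    }
    return Job::moves == 0;
}
```

The elements can only be reached by iteration here, which is the trade-off mentioned above: stable nodes and no reallocation, but no operator[].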

Source: https://blog.csdn.net/wbvalid/article/details/116333853