
Extracting Chinese Characters from Text

2022-04-14 14:00:31 Author: Internet


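    def extract_chinese_character(self, fields):
        r"""
        Extract the Chinese characters from a text.

        [\u4E00-\u9FFF]+ matches runs of simplified and traditional characters;
        the anchored form ^[\u4E00-\u9FFF]+$ matches a string made up only of them.
        Requires `import re` and a module-level `logger`.
        """
        try:
            text = fields.get('text')
            style = fields.get('style')
            # Wider alternative: r'[\u2E80-\u9FFF]+' (also covers CJK radicals, punctuation, kana).
            data = re.findall(r'[\u4E00-\u9FFF]+', text)
            if int(style):
                # Non-zero style: return the list of matched runs.
                res = data
            else:
                # style == 0: join all runs into a single string.
                res = ''.join(data)
            result = {'Code': 2000, 'Msg': 'Success', 'Data': res}
        except Exception as e:
            # Store the message, not the exception object, so the result stays serializable.
            result = {'Code': 2001, 'Msg': 'fail', 'Data': str(e)}
            logger.error(f"Failed to extract Chinese characters from text! --{e}")
        return result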
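
To see the two output modes side by side, here is a minimal standalone sketch of the same `re.findall` technique. The names `extract_chinese`, `as_list`, and `HAN_RUN` are hypothetical and not part of the original method. The sketch assumes only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is needed, so characters in extension blocks such as U+3400 to U+4DBF would be skipped.

```python
import re

# Assumption: the basic CJK Unified Ideographs block, the same range used above.
HAN_RUN = re.compile(r'[\u4E00-\u9FFF]+')


def extract_chinese(text, as_list=True):
    """Return runs of Chinese characters as a list, or joined into one string."""
    runs = HAN_RUN.findall(text)
    return runs if as_list else ''.join(runs)


sample = 'Hello 世界, 你好 2022!'
print(extract_chinese(sample))         # ['世界', '你好']
print(extract_chinese(sample, False))  # 世界你好
```

Compiling the pattern once avoids repeating the character range in both branches, which is where the original method duplicates it.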

Source: https://www.cnblogs.com/QiaoPengjun/p/16144200.html