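Batch-Translating English Keywords into Other Less Common Languages
Author: Internet
Method 1: Use Selenium to drive Google Translate and translate keywords automatically
1. Open Google Translate with Selenium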
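from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
kw_text = '我爱你'
# sl=auto detects the source language; tl=en sets the target language
driver.get('https://translate.google.cn/#view=home&op=translate&sl=auto&tl=en&text=' + kw_text)
time.sleep(3)  # give the page time to render the translation
# find_element_by_css_selector was removed in newer Selenium 4 releases
ele = driver.find_element(By.CSS_SELECTOR, 'span[jsname="W297wb"]')
print(ele.text)
driver.quit()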
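The snippet above appends the keyword to the URL unencoded, so characters such as & or # in a keyword could be read as part of the URL itself. A minimal sketch of percent-encoding the text first with the standard library's urllib.parse.quote follows. The helper name build_translate_url is hypothetical, and the URL pattern is simply the one used above.

```python
from urllib.parse import quote

BASE_URL = 'https://translate.google.cn/#view=home&op=translate'

def build_translate_url(text, target='en', source='auto'):
    """Return a Google Translate URL with the text percent-encoded."""
    # safe='' also encodes '/', so the text cannot be mistaken for URL syntax
    encoded = quote(text, safe='')
    return '%s&sl=%s&tl=%s&text=%s' % (BASE_URL, source, target, encoded)

print(build_translate_url('a b&c', target='vi'))
```

The result can be passed straight to driver.get().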
2. Translate in batches in a loop
from selenium import webdriver
from selenium.webdriver.common.by import By
from urllib.parse import quote
import time
import random

with open('en.txt', encoding='utf-8') as f:
    # drop trailing newlines and empty lines
    lines = [line.strip() for line in f if line.strip()]

driver = webdriver.Chrome()
# driver.maximize_window()
n = 0
every_time_trans_nums = 100  # keywords sent per request
kw_num = len(lines)
while n < kw_num:
    keywords = lines[n:n + every_time_trans_nums]
    # one keyword per line; quote() encodes spaces as %20 and newlines as %0A
    kw_text = quote('\n'.join(keywords))
    try:
        # tl can be changed to es, en, id, fr, etc.
        driver.get('https://translate.google.cn/#view=home&op=translate&sl=auto&tl=vi&text=' + kw_text)
        time.sleep(3)
        ele = driver.find_element(By.CSS_SELECTOR, 'span[jsname="W297wb"]')
        print(ele.text)
    except Exception as e:
        print(e)
        time.sleep(5)
    else:
        with open('vn.txt', 'a', encoding='utf-8') as f:
            f.write(ele.text + '\n')
    finally:
        # advance by the batch size so batches neither overlap nor stall
        n += every_time_trans_nums
        if n % 500 == 0:
            print('Translated [%s] keywords' % n)
        time.sleep(random.random())
driver.quit()
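The batching step in the loop above can be pulled out into small helpers that are easy to check without opening a browser. This is only a sketch: the names chunked and batch_to_query are hypothetical, and the newline-separated format is the one the loop already relies on.

```python
from urllib.parse import quote

def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def batch_to_query(keywords):
    """Join non-empty keywords with newlines and percent-encode the result."""
    cleaned = [kw.strip() for kw in keywords if kw.strip()]
    return quote('\n'.join(cleaned), safe='')

lines = ['red shoes\n', 'blue hat\n', '\n', 'green bag\n']
for batch in chunked(lines, 2):
    print(batch_to_query(batch))
```

Because range() steps by the batch size, the last batch is simply shorter and no keyword is sent twice.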
Method 2: Use the googletrans library
1. Install googletrans
pip install googletrans
2. Translate keywords in a loop
from googletrans import Translator

translator = Translator()
lange = 'en'  # target language code
with open('Google搜索字词包含sensor.txt', encoding='utf-8') as f:
    lines = f.readlines()
for line in lines:
    try:
        result = translator.translate(line.strip(), dest=lange)
        with open(lange + '_Google搜索字词包含sensor.txt', 'a', encoding='utf-8') as f:
            f.write(result.text + '\n')
    except Exception as e:
        # record keywords that failed so they can be retried later
        with open(lange + '_Google搜索字词包含sensor_error.txt', 'a', encoding='utf-8') as f:
            f.write(line.strip() + '\n')
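The loop above writes a keyword to the error file after the first failure, so temporary network errors end up there as well. One option is to retry a few times before giving up. The following is a sketch under that assumption: with_retries and its parameters are hypothetical, and translate_fn stands for any callable, for example lambda s: translator.translate(s, dest=lange).text.

```python
import time

def with_retries(translate_fn, text, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call translate_fn(text), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return translate_fn(text)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller log the keyword
            sleep(base_delay * 2 ** attempt)

# Stand-in for a real translator: fails twice, then succeeds.
calls = {'n': 0}

def flaky(text):
    calls['n'] += 1
    if calls['n'] < 3:
        raise ConnectionError('temporary failure')
    return text.upper()

print(with_retries(flaky, 'sensor', sleep=lambda seconds: None))
```

The existing except branch can stay as it is; it then only sees keywords that failed on every attempt.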
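Method 3: Use the pygtrans library (requires a network connection outside mainland China)
1. Install pygtrans
pip install pygtrans
2. Translate keywords in a loop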
# coding=utf8
from pygtrans import Translate
import sys
import os

client = Translate()
# The target language is the last two characters of the script's directory, e.g. .../vi
lange = os.path.dirname(os.path.abspath(sys.argv[0]))[-2:]
print('Translating into language: ' + lange)
keywords = 'dingqinghua.txt'
with open(keywords, encoding='utf-8') as f:
    lines = f.readlines()
for line in lines:
    try:
        text = client.translate(line.strip(), target=lange)
        with open(lange + '_' + keywords, 'a', encoding='utf-8') as f:
            f.write(text.translatedText + '\n')
    except Exception as e:
        # record keywords that failed so they can be retried later
        with open('error_' + lange + '_' + keywords, 'a', encoding='utf-8') as f:
            f.write(line.strip() + '\n')
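The script above takes the target language from the last two characters of the directory it lives in. A sketch of the same idea as a small function using pathlib follows. The name language_from_script_path is hypothetical, and unlike the script it returns the whole directory name, on the assumption that longer codes such as zh-CN should also work.

```python
from pathlib import Path

def language_from_script_path(script_path):
    """Return the name of the directory that contains the script."""
    # resolve() makes relative paths absolute, so 'run.py' alone also works
    return Path(script_path).resolve().parent.name

print(language_from_script_path('/work/translate/vi/run.py'))
```

In the script itself this would be called with sys.argv[0] or __file__.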
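After testing, the three methods have the following pros and cons
Method 1: The fastest, but it requires installing Selenium and some technical skill. Google can also detect that a script is doing the translating; the results are literal translations and differ somewhat from what manual translation returns. It is usable if the quality requirements are not high.
Method 2: Requires a network connection outside mainland China. If that connection drops, the result is a plain machine translation, which is not what I want, and after running for a long time the translation stops altogether. It does not meet my requirements.
Method 3: This library appears to have been written by a developer in China. It calls the Google.cn interface and translates Chinese into English, whereas our requirement is to translate English into other less common languages, so part of the source code has to be modified. It also requires a network connection outside mainland China. I added a proxy directly in the source code, modified as follows.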
proxies = {'http': '127.0.0.1:10809', 'https': '127.0.0.1:10809'}

    # __init__ of the client class in the pygtrans source; only the proxies default changes
    def __init__(
            self,
            target: str = 'en',
            source: str = 'auto',
            _format='html',
            user_agent: str = None,
            domain: str = 'com',
            proxies: Dict = proxies
    ):
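A change made inside the installed library is lost the next time the package is upgraded. Since the __init__ signature above already takes a proxies argument, the same mapping can presumably be passed when the client is created instead. A sketch: build_proxies is a hypothetical helper, the host and port are the ones from the snippet above, and the commented-out Translate(proxies=...) call is an assumption based on that signature rather than something verified here.

```python
def build_proxies(host='127.0.0.1', port=10809, scheme='http'):
    """Return a proxies mapping, in the form the requests library uses."""
    address = '%s://%s:%d' % (scheme, host, port)
    return {'http': address, 'https': address}

proxies = build_proxies()
print(proxies)

# Assumed usage, based on the __init__ signature shown above:
# from pygtrans import Translate
# client = Translate(proxies=proxies)
```

Keeping the proxy in the calling script also makes it easy to switch ports without touching the library.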
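Summary
I currently use Method 3. Each target language gets its own directory named after its language code, with the script placed inside it. Running the script is all that is needed: it automatically uses the directory name as the target language.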
Source: https://blog.csdn.net/cll_869241/article/details/122167183