Can deeplx support entering multiple APIs at once? #114

Open
heheda123123 opened this issue Apr 19, 2024 · 8 comments

Comments

@heheda123123

https://linux.do/t/topic/60749
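As the post mentions, Immersive Translate lets you enter multiple comma-separated APIs in the API field,
but Immersive Translate's implementation may have some problems.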

http://59.110.34.163:85/translate,https://101.132.242.99/translate,https://89.208.240.50/translate,http://67.61.193.42:51004/translate,https://154.18.161.26/translate,https://152.67.197.197/translate

During translation these APIs are polled in rotation; if the first attempt fails, the next API is automatically tried once more.
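
A minimal sketch of the behavior described above, assuming a non-empty comma-separated endpoint string. The names `parse_endpoints`, `translate_with_failover`, and `fake_send`, and the `example` hosts, are hypothetical and only illustrate the idea; this is not Immersive Translate's or deeplx's actual code.

```python
from itertools import cycle
from typing import Callable, Iterator, List, Optional


def parse_endpoints(raw: str) -> List[str]:
    """Split a comma-separated endpoint string, dropping blanks and duplicates."""
    seen: List[str] = []
    for part in raw.split(","):
        url = part.strip()
        if url and url not in seen:
            seen.append(url)
    return seen


def translate_with_failover(
    rotation: Iterator[str], send: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Use the next endpoint in the rotation; on failure, retry once with the one after it."""
    for _ in range(2):  # first attempt plus one retry
        url = next(rotation)
        try:
            result = send(url)
        except Exception:
            result = None
        if result is not None:
            return result
    return None


# Hypothetical endpoints and a fake sender, for illustration only.
endpoints = parse_endpoints("http://a.example/translate, http://b.example/translate,")
rotation = cycle(endpoints)
fake_send = lambda url: None if "a.example" in url else "ok from " + url
print(translate_with_failover(rotation, fake_send))
```

Because the rotation is shared across calls, consecutive translations start from different endpoints, which spreads the load instead of always hitting the first URL.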

@heheda123123
Author

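I hacked together a deeplx proxy. It would be best if this behavior could be built in; otherwise a single API often stops responding when queries come in too fast.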
https://linux.do/t/topic/61505/3

@fishjar
Owner

fishjar commented Apr 20, 2024

Round-robin over the URLs is doable, no big problem. Retrying on failure is also possible, but the endpoint would probably need to return a non-200 status code on failure, which makes it easier to handle.
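
A minimal sketch of why the status code matters, assuming the endpoint replies with a JSON body that carries a `code` field (as the proxy script later in this thread checks). The helper name `is_success` and the sample bodies are hypothetical.

```python
import json


def is_success(status_code: int, body: str) -> bool:
    """Count a reply as successful only if the HTTP status and the JSON `code` are both 200."""
    if status_code != 200:
        return False  # cheap check: no need to parse the body at all
    try:
        data = json.loads(body)
    except ValueError:
        return False  # an HTTP 200 with a non-JSON body is still a failure
    return isinstance(data, dict) and data.get("code") == 200


print(is_success(200, '{"code": 200, "data": "translated text"}'))
print(is_success(200, '{"code": 429}'))
print(is_success(503, ""))
```

If an endpoint always answers with HTTP 200 and reports errors only in the body, the caller has to parse every reply before deciding whether to retry; a non-200 status lets it decide from the status line alone.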

@fishjar
Owner

fishjar commented Apr 20, 2024

I don't have a linux.do account, so I can't see the content of the link you posted.

@heheda123123
Author

Usage:
1. Put the collected URLs in `urls.txt`, one per line, like this:
https://api.deeplx.org/
https://deeplx.papercar.top/
https://deepl.dlwlrma.xyz/

2. Run the proxy:
python xx.py
Dependencies:
pip install gevent flask requests

3. Set the endpoint in Immersive Translate to:
http://127.0.0.1:5000/translate

How the code works:
1. At startup, it checks which APIs are valid.
2. For each translation query, the proxy picks an API at random; if that fails, it switches to another until it gets a result (up to 10 attempts).

 
import random

import gevent
from gevent.pool import Pool
from gevent import monkey
from gevent.pywsgi import WSGIServer

monkey.patch_all()

import requests

requests.packages.urllib3.disable_warnings(
    requests.packages.urllib3.exceptions.InsecureRequestWarning
)
from flask import Flask, request
import json

app = Flask(__name__)

valid_urls = []


def check_url_availability(url):
    # Probe one endpoint with a known phrase; keep it only if the expected translation comes back.
    try:
        headers = {"Content-Type": "application/json"}
        payload = {
            "text": "Hello, world!",
            "source_lang": "EN",
            "target_lang": "ZH"
        }
        response = requests.post(url, verify=False, timeout=5, headers=headers,
                                 data=json.dumps(payload))
        if "你好,世界" in response.text:
            valid_urls.append(url)
    except Exception as e:
        print('%s: %s' % (url, type(e).__name__))


def get_valid_urls():
    with open("urls.txt", "r") as f:
        urls = f.read().splitlines()

    # Strip whitespace and any trailing slash, append the /translate path, and de-duplicate.
    urls = list({u.strip().rstrip("/") + "/translate" for u in urls if u.strip()})
    p = Pool(50)
    jobs = [p.spawn(check_url_availability, _url) for _url in urls]

    gevent.joinall(jobs)


get_valid_urls()
print("available urls count: {}".format(len(valid_urls)))


def get_translate_data(text, source_lang, target_lang):
    headers = {"Content-Type": "application/json"}
    payload = {
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang
    }
    # Try up to 10 randomly chosen endpoints and return the first successful reply.
    for _ in range(10):
        if not valid_urls:
            break
        url = random.choice(valid_urls)
        try:
            response = requests.post(url, verify=False, timeout=5, headers=headers,
                                     data=json.dumps(payload))
            if response.json()["code"] == 200:
                return response.text
        except Exception as e:
            print('%s: %s' % (url, type(e).__name__))
    # Every attempt failed: return an explicit error instead of None, which Flask rejects.
    return "all upstream attempts failed", 503


@app.route('/translate', methods=['POST'])
def translate():  # put application's code here
    data = json.loads(request.get_data())
    text = data['text']
    source_lang = data['source_lang']
    target_lang = data['target_lang']
    return get_translate_data(text, source_lang, target_lang)


if __name__ == '__main__':
    http_server = WSGIServer(("127.0.0.1", 5000), app)
    http_server.serve_forever()

@heheda123123
Author

heheda123123 commented Apr 21, 2024

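I updated the code. Previously it queried up to ten times until it got a translation. Now it starts 3 tasks that query concurrently, and the result is taken as soon as one of them returns.
This makes translation much faster.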

import random

import gevent
from gevent.pool import Pool
from gevent import monkey
from gevent.pywsgi import WSGIServer

monkey.patch_all()

import requests

requests.packages.urllib3.disable_warnings(
    requests.packages.urllib3.exceptions.InsecureRequestWarning
)
from flask import Flask, request
import json

app = Flask(__name__)

valid_urls = []


def check_url_availability(url):
    # Probe one endpoint with a known phrase; keep it only if the expected translation comes back.
    try:
        headers = {"Content-Type": "application/json"}
        payload = {
            "text": "Hello, world!",
            "source_lang": "EN",
            "target_lang": "ZH"
        }
        response = requests.post(url, verify=False, timeout=5, headers=headers,
                                 data=json.dumps(payload))
        if "你好,世界" in response.text:
            valid_urls.append(url)
    except Exception as e:
        print('%s: %s' % (url, type(e).__name__))


def get_valid_urls():
    with open(R"urls.txt", "r") as f:
        urls = f.read().splitlines()

    # Strip whitespace and any trailing slash, append the /translate path, and de-duplicate.
    urls = list({u.strip().rstrip("/") + "/translate" for u in urls if u.strip()})
    p = Pool(200)
    jobs = [p.spawn(check_url_availability, _url) for _url in urls]

    gevent.joinall(jobs)


get_valid_urls()
print("available urls count: {}".format(len(valid_urls)))

def single_translate(text, source_lang, target_lang):
    # One worker: try up to 10 random endpoints; return None if every attempt fails.
    headers = {"Content-Type": "application/json"}
    payload = {
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang
    }
    for _ in range(10):
        if not valid_urls:
            return None
        url = random.choice(valid_urls)
        try:
            response = requests.post(url, verify=False, timeout=5, headers=headers,
                                     data=json.dumps(payload))
            if response.json()["code"] == 200:
                return response.text
        except Exception as e:
            print('%s: %s' % (url, type(e).__name__))
    return None

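def get_translate_data(text, source_lang, target_lang):
    # Race three workers and return the first successful (non-None) result.
    tasks = [gevent.spawn(single_translate, text, source_lang, target_lang) for _ in range(3)]
    try:
        for finished in gevent.iwait(tasks):
            if finished.value is not None:
                return finished.value
    finally:
        gevent.killall(tasks)
    # All workers failed: return an explicit error instead of None, which Flask rejects.
    return "all upstream attempts failed", 503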


@app.route('/translate', methods=['POST'])
def translate():  # put application's code here
    data = json.loads(request.get_data())
    text = data['text']
    source_lang = data['source_lang']
    target_lang = data['target_lang']
    return get_translate_data(text, source_lang, target_lang)


if __name__ == '__main__':
    http_server = WSGIServer(("127.0.0.1", 5000), app)
    http_server.serve_forever()

@fishjar
Owner

fishjar commented Apr 21, 2024

The downside of concurrency is that it makes the translation endpoints hit their rate limits more easily.
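
To make the trade-off concrete, here is a rough count of upstream requests per translation, assuming the two proxy versions in this thread: up to 10 attempts per worker, with 1 worker in the first version and 3 concurrent workers in the second. The function name is made up for illustration, and the best case assumes every spawned worker has already sent its first request.

```python
from typing import Tuple


def upstream_requests(workers: int, attempts_per_worker: int) -> Tuple[int, int]:
    """Return (best case, worst case) upstream requests for one translation."""
    best = workers  # each worker sends at least one request
    worst = workers * attempts_per_worker  # every attempt by every worker fails
    return best, worst


sequential = upstream_requests(workers=1, attempts_per_worker=10)
concurrent = upstream_requests(workers=3, attempts_per_worker=10)
print("sequential:", sequential)  # (1, 10)
print("concurrent:", concurrent)  # (3, 30)
```

Even when everything succeeds on the first try, the concurrent version sends roughly three times as many upstream requests as the sequential one.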

@fishjar
Owner

fishjar commented Apr 21, 2024

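v1.8.8 will support round-robin over multiple deeplx URLs, but the retry count is hard-coded to 3: if three consecutive URLs return errors, the translation fails.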
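
A rough sketch of what such a fixed-retry round-robin could look like, assuming a `send` callable that raises on any error. It is not the project's actual code; `translate_round_robin`, `MAX_ATTEMPTS`, and the sample URLs are hypothetical names.

```python
from typing import Callable, Optional, Sequence

MAX_ATTEMPTS = 3  # hard-coded retry count, mirroring the description above


def translate_round_robin(
    urls: Sequence[str], send: Callable[[str], str], start: int = 0
) -> str:
    """Try up to MAX_ATTEMPTS consecutive URLs beginning at `start`; raise if all fail."""
    last_error: Optional[Exception] = None
    for offset in range(MAX_ATTEMPTS):
        url = urls[(start + offset) % len(urls)]  # wrap around the list
        try:
            return send(url)
        except Exception as exc:
            last_error = exc
    raise RuntimeError("translation failed after %d attempts" % MAX_ATTEMPTS) from last_error


# Fake sender for illustration: only the third URL answers.
def flaky_send(url: str) -> str:
    if url.endswith("/c"):
        return "ok from " + url
    raise IOError("endpoint unavailable")


print(translate_round_robin(["x/a", "x/b", "x/c", "x/d"], flaky_send))
```

With a fixed limit of 3, a healthy fourth URL is never reached if the first three all fail, which is the limitation the comment above points out.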

@heheda123123
Author

heheda123123 commented Apr 21, 2024

That is exactly where deeplx shines: there are many public nodes available. I casually collected some and found 269 that work.
Plenty to go around (: In Immersive Translate I set 20 requests per second and 3 paragraphs. Combined with my proxy above, translation is fast too.
image

2 participants