model_load_balancing_service.py 25KB

10 months ago
Introduce Plugins (#13836) Signed-off-by: yihong0618 <zouzou0208@gmail.com> Signed-off-by: -LAN- <laipz8200@outlook.com> Signed-off-by: xhe <xw897002528@gmail.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: takatost <takatost@gmail.com> Co-authored-by: kurokobo <kuro664@gmail.com> Co-authored-by: Novice Lee <novicelee@NoviPro.local> Co-authored-by: zxhlyh <jasonapring2015@outlook.com> Co-authored-by: AkaraChen <akarachen@outlook.com> Co-authored-by: Yi <yxiaoisme@gmail.com> Co-authored-by: Joel <iamjoel007@gmail.com> Co-authored-by: JzoNg <jzongcode@gmail.com> Co-authored-by: twwu <twwu@dify.ai> Co-authored-by: Hiroshi Fujita <fujita-h@users.noreply.github.com> Co-authored-by: AkaraChen <85140972+AkaraChen@users.noreply.github.com> Co-authored-by: NFish <douxc512@gmail.com> Co-authored-by: Wu Tianwei <30284043+WTW0313@users.noreply.github.com> Co-authored-by: 非法操作 <hjlarry@163.com> Co-authored-by: Novice <857526207@qq.com> Co-authored-by: Hiroki Nagai <82458324+nagaihiroki-git@users.noreply.github.com> Co-authored-by: Gen Sato <52241300+halogen22@users.noreply.github.com> Co-authored-by: eux <euxuuu@gmail.com> Co-authored-by: huangzhuo1949 <167434202+huangzhuo1949@users.noreply.github.com> Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com> Co-authored-by: lotsik <lotsik@mail.ru> Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com> Co-authored-by: nite-knite <nkCoding@gmail.com> Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: gakkiyomi <gakkiyomi@aliyun.com> Co-authored-by: CN-P5 <heibai2006@gmail.com> Co-authored-by: CN-P5 <heibai2006@qq.com> Co-authored-by: Chuehnone <1897025+chuehnone@users.noreply.github.com> Co-authored-by: yihong <zouzou0208@gmail.com> Co-authored-by: Kevin9703 <51311316+Kevin9703@users.noreply.github.com> Co-authored-by: -LAN- <laipz8200@outlook.com> Co-authored-by: Boris 
Feld <lothiraldan@gmail.com> Co-authored-by: mbo <himabo@gmail.com> Co-authored-by: mabo <mabo@aeyes.ai> Co-authored-by: Warren Chen <warren.chen830@gmail.com> Co-authored-by: JzoNgKVO <27049666+JzoNgKVO@users.noreply.github.com> Co-authored-by: jiandanfeng <chenjh3@wangsu.com> Co-authored-by: zhu-an <70234959+xhdd123321@users.noreply.github.com> Co-authored-by: zhaoqingyu.1075 <zhaoqingyu.1075@bytedance.com> Co-authored-by: 海狸大師 <86974027+yenslife@users.noreply.github.com> Co-authored-by: Xu Song <xusong.vip@gmail.com> Co-authored-by: rayshaw001 <396301947@163.com> Co-authored-by: Ding Jiatong <dingjiatong@gmail.com> Co-authored-by: Bowen Liang <liangbowen@gf.com.cn> Co-authored-by: JasonVV <jasonwangiii@outlook.com> Co-authored-by: le0zh <newlight@qq.com> Co-authored-by: zhuxinliang <zhuxinliang@didiglobal.com> Co-authored-by: k-zaku <zaku99@outlook.jp> Co-authored-by: luckylhb90 <luckylhb90@gmail.com> Co-authored-by: hobo.l <hobo.l@binance.com> Co-authored-by: jiangbo721 <365065261@qq.com> Co-authored-by: 刘江波 <jiangbo721@163.com> Co-authored-by: Shun Miyazawa <34241526+miya@users.noreply.github.com> Co-authored-by: EricPan <30651140+Egfly@users.noreply.github.com> Co-authored-by: crazywoola <427733928@qq.com> Co-authored-by: sino <sino2322@gmail.com> Co-authored-by: Jhvcc <37662342+Jhvcc@users.noreply.github.com> Co-authored-by: lowell <lowell.hu@zkteco.in> Co-authored-by: Boris Polonsky <BorisPolonsky@users.noreply.github.com> Co-authored-by: Ademílson Tonato <ademilsonft@outlook.com> Co-authored-by: Ademílson Tonato <ademilson.tonato@refurbed.com> Co-authored-by: IWAI, Masaharu <iwaim.sub@gmail.com> Co-authored-by: Yueh-Po Peng (Yabi) <94939112+y10ab1@users.noreply.github.com> Co-authored-by: Jason <ggbbddjm@gmail.com> Co-authored-by: Xin Zhang <sjhpzx@gmail.com> Co-authored-by: yjc980121 <3898524+yjc980121@users.noreply.github.com> Co-authored-by: heyszt <36215648+hieheihei@users.noreply.github.com> Co-authored-by: Abdullah AlOsaimi <osaimiacc@gmail.com> 
Co-authored-by: Abdullah AlOsaimi <189027247+osaimi@users.noreply.github.com> Co-authored-by: Yingchun Lai <laiyingchun@apache.org> Co-authored-by: Hash Brown <hi@xzd.me> Co-authored-by: zuodongxu <192560071+zuodongxu@users.noreply.github.com> Co-authored-by: Masashi Tomooka <tmokmss@users.noreply.github.com> Co-authored-by: aplio <ryo.091219@gmail.com> Co-authored-by: Obada Khalili <54270856+obadakhalili@users.noreply.github.com> Co-authored-by: Nam Vu <zuzoovn@gmail.com> Co-authored-by: Kei YAMAZAKI <1715090+kei-yamazaki@users.noreply.github.com> Co-authored-by: TechnoHouse <13776377+deephbz@users.noreply.github.com> Co-authored-by: Riddhimaan-Senapati <114703025+Riddhimaan-Senapati@users.noreply.github.com> Co-authored-by: MaFee921 <31881301+2284730142@users.noreply.github.com> Co-authored-by: te-chan <t-nakanome@sakura-is.co.jp> Co-authored-by: HQidea <HQidea@users.noreply.github.com> Co-authored-by: Joshbly <36315710+Joshbly@users.noreply.github.com> Co-authored-by: xhe <xw897002528@gmail.com> Co-authored-by: weiwenyan-dev <154779315+weiwenyan-dev@users.noreply.github.com> Co-authored-by: ex_wenyan.wei <ex_wenyan.wei@tcl.com> Co-authored-by: engchina <12236799+engchina@users.noreply.github.com> Co-authored-by: engchina <atjapan2015@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: 呆萌闷油瓶 <253605712@qq.com> Co-authored-by: Kemal <kemalmeler@outlook.com> Co-authored-by: Lazy_Frog <4590648+lazyFrogLOL@users.noreply.github.com> Co-authored-by: Yi Xiao <54782454+YIXIAO0@users.noreply.github.com> Co-authored-by: Steven sun <98230804+Tuyohai@users.noreply.github.com> Co-authored-by: steven <sunzwj@digitalchina.com> Co-authored-by: Kalo Chin <91766386+fdb02983rhy@users.noreply.github.com> Co-authored-by: Katy Tao <34019945+KatyTao@users.noreply.github.com> Co-authored-by: depy <42985524+h4ckdepy@users.noreply.github.com> Co-authored-by: 胡春东 <gycm520@gmail.com> Co-authored-by: Junjie.M <118170653@qq.com> 
Co-authored-by: MuYu <mr.muzea@gmail.com> Co-authored-by: Naoki Takashima <39912547+takatea@users.noreply.github.com> Co-authored-by: Summer-Gu <37869445+gubinjie@users.noreply.github.com> Co-authored-by: Fei He <droxer.he@gmail.com> Co-authored-by: ybalbert001 <120714773+ybalbert001@users.noreply.github.com> Co-authored-by: Yuanbo Li <ybalbert@amazon.com> Co-authored-by: douxc <7553076+douxc@users.noreply.github.com> Co-authored-by: liuzhenghua <1090179900@qq.com> Co-authored-by: Wu Jiayang <62842862+Wu-Jiayang@users.noreply.github.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: kimjion <45935338+kimjion@users.noreply.github.com> Co-authored-by: AugNSo <song.tiankai@icloud.com> Co-authored-by: llinvokerl <38915183+llinvokerl@users.noreply.github.com> Co-authored-by: liusurong.lsr <liusurong.lsr@alibaba-inc.com> Co-authored-by: Vasu Negi <vasu-negi@users.noreply.github.com> Co-authored-by: Hundredwz <1808096180@qq.com> Co-authored-by: Xiyuan Chen <52963600+GareArc@users.noreply.github.com>
import json
import logging
from json import JSONDecodeError
from typing import Optional, Union

from constants import HIDDEN_VALUE
from core.entities.provider_configuration import ProviderConfiguration
from core.helper import encrypter
from core.helper.model_provider_cache import ProviderCredentialsCache, ProviderCredentialsCacheType
from core.model_manager import LBModelManager
from core.model_runtime.entities.model_entities import ModelType
from core.model_runtime.entities.provider_entities import (
    ModelCredentialSchema,
    ProviderCredentialSchema,
)
from core.model_runtime.model_providers.model_provider_factory import ModelProviderFactory
from core.provider_manager import ProviderManager
from extensions.ext_database import db
from libs.datetime_utils import naive_utc_now
from models.provider import LoadBalancingModelConfig, ProviderCredential, ProviderModelCredential

logger = logging.getLogger(__name__)


class ModelLoadBalancingService:
    def __init__(self) -> None:
        self.provider_manager = ProviderManager()

    def enable_model_load_balancing(self, tenant_id: str, provider: str, model: str, model_type: str) -> None:
        """
        enable model load balancing.

        :param tenant_id: workspace id
        :param provider: provider name
        :param model: model name
        :param model_type: model type
        :return:
        """
        # Get all provider configurations of the current workspace
        provider_configurations = self.provider_manager.get_configurations(tenant_id)

        # Get provider configuration
        provider_configuration = provider_configurations.get(provider)
        if not provider_configuration:
            raise ValueError(f"Provider {provider} does not exist.")

        # Enable model load balancing
        provider_configuration.enable_model_load_balancing(model=model, model_type=ModelType.value_of(model_type))

    def disable_model_load_balancing(self, tenant_id: str, provider: str, model: str, model_type: str) -> None:
        """
        disable model load balancing.

        :param tenant_id: workspace id
        :param provider: provider name
        :param model: model name
        :param model_type: model type
        :return:
        """
        # Get all provider configurations of the current workspace
        provider_configurations = self.provider_manager.get_configurations(tenant_id)

        # Get provider configuration
        provider_configuration = provider_configurations.get(provider)
        if not provider_configuration:
            raise ValueError(f"Provider {provider} does not exist.")

        # disable model load balancing
        provider_configuration.disable_model_load_balancing(model=model, model_type=ModelType.value_of(model_type))

    def get_load_balancing_configs(
        self, tenant_id: str, provider: str, model: str, model_type: str
    ) -> tuple[bool, list[dict]]:
        """
        Get load balancing configurations.

        :param tenant_id: workspace id
        :param provider: provider name
        :param model: model name
        :param model_type: model type
        :return: whether load balancing is enabled, and the list of config entries
        """
        # Get all provider configurations of the current workspace
        provider_configurations = self.provider_manager.get_configurations(tenant_id)

        # Get provider configuration
        provider_configuration = provider_configurations.get(provider)
        if not provider_configuration:
            raise ValueError(f"Provider {provider} does not exist.")

        # Convert model type to ModelType
        model_type_enum = ModelType.value_of(model_type)

        # Get provider model setting
        provider_model_setting = provider_configuration.get_provider_model_setting(
            model_type=model_type_enum,
            model=model,
        )

        is_load_balancing_enabled = False
        if provider_model_setting and provider_model_setting.load_balancing_enabled:
            is_load_balancing_enabled = True

        # Get load balancing configurations
        load_balancing_configs = (
            db.session.query(LoadBalancingModelConfig)
            .where(
                LoadBalancingModelConfig.tenant_id == tenant_id,
                LoadBalancingModelConfig.provider_name == provider_configuration.provider.provider,
                LoadBalancingModelConfig.model_type == model_type_enum.to_origin_model_type(),
                LoadBalancingModelConfig.model_name == model,
            )
            .order_by(LoadBalancingModelConfig.created_at)
            .all()
        )

        if provider_configuration.custom_configuration.provider:
            # Check whether the inherit configuration exists.
            # "__inherit__" stands for the provider's or model's own custom credentials.
            inherit_config_exists = False
            for load_balancing_config in load_balancing_configs:
                if load_balancing_config.name == "__inherit__":
                    inherit_config_exists = True
                    break

            if not inherit_config_exists:
                # Initialize the inherit configuration
                inherit_config = self._init_inherit_config(tenant_id, provider, model, model_type_enum)

                # Prepend the inherit configuration
                load_balancing_configs.insert(0, inherit_config)
            else:
                # Move the inherit configuration to the front
                for i, load_balancing_config in enumerate(load_balancing_configs[:]):
                    if load_balancing_config.name == "__inherit__":
                        inherit_config = load_balancing_configs.pop(i)
                        load_balancing_configs.insert(0, inherit_config)

        # Get credential form schemas from model credential schema or provider credential schema
        credential_schemas = self._get_credential_schema(provider_configuration)

        # Get decoding rsa key and cipher for decrypting credentials
        decoding_rsa_key, decoding_cipher_rsa = encrypter.get_decrypt_decoding(tenant_id)

        # Fetch cooldown status and ttl for each config
        datas = []
        for load_balancing_config in load_balancing_configs:
            in_cooldown, ttl = LBModelManager.get_config_in_cooldown_and_ttl(
                tenant_id=tenant_id,
                provider=provider,
                model=model,
                model_type=model_type_enum,
                config_id=load_balancing_config.id,
            )

            try:
                if load_balancing_config.encrypted_config:
                    credentials = json.loads(load_balancing_config.encrypted_config)
                else:
                    credentials = {}
            except JSONDecodeError:
                credentials = {}

            # Get provider credential secret variables
            credential_secret_variables = provider_configuration.extract_secret_variables(
                credential_schemas.credential_form_schemas
            )

            # Decrypt credentials; a value that fails to decrypt is left as stored
            for variable in credential_secret_variables:
                if variable in credentials:
                    try:
                        credentials[variable] = encrypter.decrypt_token_with_decoding(
                            credentials.get(variable), decoding_rsa_key, decoding_cipher_rsa
                        )
                    except ValueError:
                        pass

            # Obfuscate credentials
            credentials = provider_configuration.obfuscated_credentials(
                credentials=credentials, credential_form_schemas=credential_schemas.credential_form_schemas
            )

            datas.append(
                {
                    "id": load_balancing_config.id,
                    "name": load_balancing_config.name,
                    "credentials": credentials,
                    "credential_id": load_balancing_config.credential_id,
                    "enabled": load_balancing_config.enabled,
                    "in_cooldown": in_cooldown,
                    "ttl": ttl,
                }
            )

        return is_load_balancing_enabled, datas
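
# Illustrative sketch only, not part of this service: the "move the inherit
# configuration to the first" step above, shown on plain dicts instead of ORM
# rows. `move_inherit_first` and `INHERIT_NAME` are hypothetical names; this
# version builds a new list rather than popping and inserting in place.

```python
INHERIT_NAME = "__inherit__"


def move_inherit_first(configs: list[dict]) -> list[dict]:
    """Return a new list with any ``__inherit__`` entry placed first.

    The relative order of all other entries is preserved.
    """
    inherit = [c for c in configs if c.get("name") == INHERIT_NAME]
    others = [c for c in configs if c.get("name") != INHERIT_NAME]
    return inherit + others


ordered = move_inherit_first(
    [{"name": "key-a"}, {"name": "__inherit__"}, {"name": "key-b"}]
)
```

# After the call, `ordered` starts with the "__inherit__" entry, followed by
# "key-a" and "key-b" in their original order.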

    def get_load_balancing_config(
        self, tenant_id: str, provider: str, model: str, model_type: str, config_id: str
    ) -> Optional[dict]:
        """
        Get load balancing configuration.

        :param tenant_id: workspace id
        :param provider: provider name
        :param model: model name
        :param model_type: model type
        :param config_id: load balancing config id
        :return:
        """
        # Get all provider configurations of the current workspace
        provider_configurations = self.provider_manager.get_configurations(tenant_id)

        # Get provider configuration
        provider_configuration = provider_configurations.get(provider)
        if not provider_configuration:
            raise ValueError(f"Provider {provider} does not exist.")

        # Convert model type to ModelType
        model_type_enum = ModelType.value_of(model_type)

        # Get load balancing configurations
        load_balancing_model_config = (
            db.session.query(LoadBalancingModelConfig)
            .where(
                LoadBalancingModelConfig.tenant_id == tenant_id,
                LoadBalancingModelConfig.provider_name == provider_configuration.provider.provider,
                LoadBalancingModelConfig.model_type == model_type_enum.to_origin_model_type(),
                LoadBalancingModelConfig.model_name == model,
                LoadBalancingModelConfig.id == config_id,
            )
            .first()
        )

        if not load_balancing_model_config:
            return None

        try:
            if load_balancing_model_config.encrypted_config:
                credentials = json.loads(load_balancing_model_config.encrypted_config)
            else:
                credentials = {}
        except JSONDecodeError:
            credentials = {}

        # Get credential form schemas from model credential schema or provider credential schema
        credential_schemas = self._get_credential_schema(provider_configuration)

        # Obfuscate credentials
        credentials = provider_configuration.obfuscated_credentials(
            credentials=credentials, credential_form_schemas=credential_schemas.credential_form_schemas
        )

        return {
            "id": load_balancing_model_config.id,
            "name": load_balancing_model_config.name,
            "credentials": credentials,
            "enabled": load_balancing_model_config.enabled,
        }
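
# Illustrative sketch only, not part of this service: the tolerant parsing of
# `encrypted_config` used by both getters above, pulled into one function.
# `parse_stored_credentials` is a hypothetical helper name. An empty or
# malformed value yields an empty dict instead of raising.

```python
import json
from json import JSONDecodeError
from typing import Optional


def parse_stored_credentials(encrypted_config: Optional[str]) -> dict:
    """Decode the stored JSON string, falling back to ``{}`` on bad input."""
    if not encrypted_config:
        return {}
    try:
        return json.loads(encrypted_config)
    except JSONDecodeError:
        return {}


good = parse_stored_credentials('{"api_key": "abc"}')
bad = parse_stored_credentials("{not json")
missing = parse_stored_credentials(None)
```

# Like the original try/except, this does not check that the decoded value is
# actually a dict; a caller that needs that guarantee would have to add it.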

    def _init_inherit_config(
        self, tenant_id: str, provider: str, model: str, model_type: ModelType
    ) -> LoadBalancingModelConfig:
        """
        Initialize the inherit configuration.

        :param tenant_id: workspace id
        :param provider: provider name
        :param model: model name
        :param model_type: model type
        :return:
        """
        # Initialize the inherit configuration
        inherit_config = LoadBalancingModelConfig(
            tenant_id=tenant_id,
            provider_name=provider,
            model_type=model_type.to_origin_model_type(),
            model_name=model,
            name="__inherit__",
        )
        db.session.add(inherit_config)
        db.session.commit()

        return inherit_config

    def update_load_balancing_configs(
        self, tenant_id: str, provider: str, model: str, model_type: str, configs: list[dict], config_from: str
    ) -> None:
        """
        Update load balancing configurations.

        :param tenant_id: workspace id
        :param provider: provider name
        :param model: model name
        :param model_type: model type
        :param configs: load balancing configs
        :param config_from: predefined-model or custom-model
        :return:
        """
        # Get all provider configurations of the current workspace
        provider_configurations = self.provider_manager.get_configurations(tenant_id)

        # Get provider configuration
        provider_configuration = provider_configurations.get(provider)
        if not provider_configuration:
            raise ValueError(f"Provider {provider} does not exist.")

        # Convert model type to ModelType
        model_type_enum = ModelType.value_of(model_type)

        if not isinstance(configs, list):
            raise ValueError("Invalid load balancing configs")

        current_load_balancing_configs = (
            db.session.query(LoadBalancingModelConfig)
            .where(
                LoadBalancingModelConfig.tenant_id == tenant_id,
                LoadBalancingModelConfig.provider_name == provider_configuration.provider.provider,
                LoadBalancingModelConfig.model_type == model_type_enum.to_origin_model_type(),
                LoadBalancingModelConfig.model_name == model,
            )
            .all()
        )

        # id as key, config as value
        current_load_balancing_configs_dict = {config.id: config for config in current_load_balancing_configs}
        updated_config_ids = set()

        for config in configs:
            if not isinstance(config, dict):
                raise ValueError("Invalid load balancing config")

            config_id = config.get("id")
            name = config.get("name")
            credentials = config.get("credentials")
            credential_id = config.get("credential_id")
            enabled = config.get("enabled")

            if credential_id:
                # Look up the referenced credential; its name overrides the submitted name
                credential_record: ProviderCredential | ProviderModelCredential | None = None
                if config_from == "predefined-model":
                    credential_record = (
                        db.session.query(ProviderCredential)
                        .filter_by(
                            id=credential_id,
                            tenant_id=tenant_id,
                            provider_name=provider_configuration.provider.provider,
                        )
                        .first()
                    )
                else:
                    credential_record = (
                        db.session.query(ProviderModelCredential)
                        .filter_by(
                            id=credential_id,
                            tenant_id=tenant_id,
                            provider_name=provider_configuration.provider.provider,
                            model_name=model,
                            model_type=model_type_enum.to_origin_model_type(),
                        )
                        .first()
                    )
                if not credential_record:
                    raise ValueError(f"Provider credential with id {credential_id} not found")
                name = credential_record.credential_name

            if not name:
                raise ValueError("Invalid load balancing config name")

            if enabled is None:
                raise ValueError("Invalid load balancing config enabled")

            # Check whether the config already exists
            if config_id:
                config_id = str(config_id)

                if config_id not in current_load_balancing_configs_dict:
                    raise ValueError(f"Invalid load balancing config id: {config_id}")

                updated_config_ids.add(config_id)

                load_balancing_config = current_load_balancing_configs_dict[config_id]

                if credentials:
                    if not isinstance(credentials, dict):
                        raise ValueError("Invalid load balancing config credentials")

                    # validate custom provider config
                    credentials = self._custom_credentials_validate(
                        tenant_id=tenant_id,
                        provider_configuration=provider_configuration,
                        model_type=model_type_enum,
                        model=model,
                        credentials=credentials,
                        load_balancing_model_config=load_balancing_config,
                        validate=False,
                    )

                    # update load balancing config
                    load_balancing_config.encrypted_config = json.dumps(credentials)

                load_balancing_config.name = name
                load_balancing_config.enabled = enabled
                load_balancing_config.updated_at = naive_utc_now()
                db.session.commit()

                self._clear_credentials_cache(tenant_id, config_id)
            else:
                # create load balancing config
                if name in {"__inherit__", "__delete__"}:
                    raise ValueError("Invalid load balancing config name")

                if credential_id:
                    credential_source = "provider" if config_from == "predefined-model" else "custom_model"
                    assert credential_record is not None
                    load_balancing_model_config = LoadBalancingModelConfig(
                        tenant_id=tenant_id,
                        provider_name=provider_configuration.provider.provider,
                        model_type=model_type_enum.to_origin_model_type(),
                        model_name=model,
                        name=credential_record.credential_name,
                        encrypted_config=credential_record.encrypted_config,
                        credential_id=credential_id,
                        credential_source_type=credential_source,
                    )
                else:
                    if not credentials:
                        raise ValueError("Invalid load balancing config credentials")

                    if not isinstance(credentials, dict):
                        raise ValueError("Invalid load balancing config credentials")

                    # validate custom provider config
                    credentials = self._custom_credentials_validate(
                        tenant_id=tenant_id,
                        provider_configuration=provider_configuration,
                        model_type=model_type_enum,
                        model=model,
                        credentials=credentials,
                        validate=False,
                    )

                    # create load balancing config
                    load_balancing_model_config = LoadBalancingModelConfig(
                        tenant_id=tenant_id,
                        provider_name=provider_configuration.provider.provider,
                        model_type=model_type_enum.to_origin_model_type(),
                        model_name=model,
                        name=name,
                        encrypted_config=json.dumps(credentials),
                    )

                db.session.add(load_balancing_model_config)
                db.session.commit()

        # Existing configs that were not submitted are treated as deleted
        deleted_config_ids = set(current_load_balancing_configs_dict.keys()) - updated_config_ids
        for config_id in deleted_config_ids:
            db.session.delete(current_load_balancing_configs_dict[config_id])
            db.session.commit()

            self._clear_credentials_cache(tenant_id, config_id)

    def validate_load_balancing_credentials(
        self,
        tenant_id: str,
        provider: str,
        model: str,
        model_type: str,
        credentials: dict,
        config_id: Optional[str] = None,
    ) -> None:
        """
        Validate load balancing credentials.
        :param tenant_id: workspace id
        :param provider: provider name
        :param model_type: model type
        :param model: model name
        :param credentials: credentials
        :param config_id: load balancing config id
        :return:
        """
        # Get all provider configurations of the current workspace
        provider_configurations = self.provider_manager.get_configurations(tenant_id)

        # Get provider configuration
        provider_configuration = provider_configurations.get(provider)
        if not provider_configuration:
            raise ValueError(f"Provider {provider} does not exist.")

        # Convert model type to ModelType
        model_type_enum = ModelType.value_of(model_type)

        load_balancing_model_config = None
        if config_id:
            # Get load balancing config
            load_balancing_model_config = (
                db.session.query(LoadBalancingModelConfig)
                .where(
                    LoadBalancingModelConfig.tenant_id == tenant_id,
                    LoadBalancingModelConfig.provider_name == provider,
                    LoadBalancingModelConfig.model_type == model_type_enum.to_origin_model_type(),
                    LoadBalancingModelConfig.model_name == model,
                    LoadBalancingModelConfig.id == config_id,
                )
                .first()
            )

            if not load_balancing_model_config:
                raise ValueError(f"Load balancing config {config_id} does not exist.")

        # Validate custom provider config
        self._custom_credentials_validate(
            tenant_id=tenant_id,
            provider_configuration=provider_configuration,
            model_type=model_type_enum,
            model=model,
            credentials=credentials,
            load_balancing_model_config=load_balancing_model_config,
        )

    def _custom_credentials_validate(
        self,
        tenant_id: str,
        provider_configuration: ProviderConfiguration,
        model_type: ModelType,
        model: str,
        credentials: dict,
        load_balancing_model_config: Optional[LoadBalancingModelConfig] = None,
        validate: bool = True,
    ) -> dict:
        """
        Validate custom credentials.
        :param tenant_id: workspace id
        :param provider_configuration: provider configuration
        :param model_type: model type
        :param model: model name
        :param credentials: credentials
        :param load_balancing_model_config: existing load balancing model config, if one is being updated
        :param validate: whether to validate the credentials against the provider
        :return: credentials with secret fields encrypted
        """
        # Get credential form schemas from model credential schema or provider credential schema
        credential_schemas = self._get_credential_schema(provider_configuration)

        # Get provider credential secret variables
        provider_credential_secret_variables = provider_configuration.extract_secret_variables(
            credential_schemas.credential_form_schemas
        )

        if load_balancing_model_config:
            try:
                # Load the stored credentials; fall back to an empty dict if missing or malformed
                if load_balancing_model_config.encrypted_config:
                    original_credentials = json.loads(load_balancing_model_config.encrypted_config)
                else:
                    original_credentials = {}
            except JSONDecodeError:
                original_credentials = {}

            # Restore secret values that the client sent back masked
            for key, value in credentials.items():
                if key in provider_credential_secret_variables:
                    # A secret submitted as HIDDEN_VALUE means "unchanged": reuse the stored value, decrypted
                    if value == HIDDEN_VALUE and key in original_credentials:
                        credentials[key] = encrypter.decrypt_token(tenant_id, original_credentials[key])

        if validate:
            model_provider_factory = ModelProviderFactory(tenant_id)
            if isinstance(credential_schemas, ModelCredentialSchema):
                credentials = model_provider_factory.model_credentials_validate(
                    provider=provider_configuration.provider.provider,
                    model_type=model_type,
                    model=model,
                    credentials=credentials,
                )
            else:
                credentials = model_provider_factory.provider_credentials_validate(
                    provider=provider_configuration.provider.provider, credentials=credentials
                )

        # Encrypt secret values before they are returned for storage
        for key, value in credentials.items():
            if key in provider_credential_secret_variables:
                credentials[key] = encrypter.encrypt_token(tenant_id, value)

        return credentials

    def _get_credential_schema(
        self, provider_configuration: ProviderConfiguration
    ) -> Union[ModelCredentialSchema, ProviderCredentialSchema]:
        """Get form schemas."""
        if provider_configuration.provider.model_credential_schema:
            return provider_configuration.provider.model_credential_schema
        elif provider_configuration.provider.provider_credential_schema:
            return provider_configuration.provider.provider_credential_schema
        else:
            raise ValueError("No credential schema found")

    def _clear_credentials_cache(self, tenant_id: str, config_id: str) -> None:
        """
        Clear credentials cache.
        :param tenant_id: workspace id
        :param config_id: load balancing config id
        :return:
        """
        provider_model_credentials_cache = ProviderCredentialsCache(
            tenant_id=tenant_id, identity_id=config_id, cache_type=ProviderCredentialsCacheType.LOAD_BALANCING_MODEL
        )

        provider_model_credentials_cache.delete()