ADR-007: Use of external LLMs

Date:

2024-09-26

Status:

Accepted

Context:

Our interview with the AI expert led to new insights concerning LLMs and to some important assumptions: A20, A21, A24, A25.

The following AI use cases need to be supported:

We discussed three propositions:

  • A: Use external AI (LLM) services as offered for example by Azure, Google, OpenAI.
    • +: Little development cost
    • +: Easy adaptation to the latest trends and models.
    • +: Tendency to become cheaper
    • +: Very fast time to market
    • -: Expensive.
    • -: Many API limitations such as rate limits, delays, quotas, and short model lifecycles
    • -: Data privacy regulations need to be considered
  • B: Use a self-controlled open-source LLM in a container running in the cloud
    • +: Specialized hardware can be used as a service this way.
    • +: A lot of control over fine-tuning, rates, and the runtime.
    • +: Fewer data privacy regulations need to be considered.
    • -: Current LLMs will soon be surpassed due to the fast-evolving market. (A20)
    • -: Requires more development, configuration, maintenance, and know-how, and therefore costs more.
    • -: High infrastructure cost due to resource-heavy computing
  • C: Use a self-controlled open-source LLM on-prem
    • +: Data privacy
    • +: Full control over LLM
    • -: Specialized hardware must be procured and additional staff hired, which delays the project timeline.
    • -: High electricity cost
    • -: High procurement and staffing costs

Decision:

  • Proposition A: We use external LLMs.
  • To gain additional fault tolerance and interoperability, we use an event-driven design for every LLM client in the system.
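
As an illustration of how an event-driven LLM client can add fault tolerance, here is a minimal in-process sketch. It is not the project's implementation: the names `LLMRequest`, `LLMResult`, `LLMWorker`, and `call_provider` are hypothetical, and a real deployment would likely replace the in-memory queues with a message broker. The worker consumes request events, retries failed provider calls up to a limit, and moves exhausted requests to a dead-letter list instead of losing them.

```python
import queue
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LLMRequest:
    """Hypothetical request event published by any component needing an LLM."""
    request_id: str
    prompt: str
    attempts: int = 0


@dataclass
class LLMResult:
    """Hypothetical result event published once the provider has answered."""
    request_id: str
    text: str


class LLMWorker:
    """Consumes request events, calls the provider, publishes result events.

    `call_provider` is an assumed adapter around whichever external LLM
    service is used; swapping providers only means swapping this callable.
    """

    def __init__(self, call_provider: Callable[[str], str], max_attempts: int = 3):
        self.call_provider = call_provider
        self.max_attempts = max_attempts
        self.requests: "queue.Queue[LLMRequest]" = queue.Queue()
        self.results: "queue.Queue[LLMResult]" = queue.Queue()
        self.dead_letters: List[LLMRequest] = []

    def submit(self, request: LLMRequest) -> None:
        """Publish a request event; the caller does not wait for the answer."""
        self.requests.put(request)

    def drain(self) -> None:
        """Process queued requests until the request queue is empty."""
        while not self.requests.empty():
            request = self.requests.get()
            request.attempts += 1
            try:
                text = self.call_provider(request.prompt)
            except Exception:
                if request.attempts < self.max_attempts:
                    # Re-queue so a transient provider failure is retried.
                    self.requests.put(request)
                else:
                    # Give up, but keep the event for later inspection.
                    self.dead_letters.append(request)
                continue
            self.results.put(LLMResult(request.request_id, text))
```

Because producers only see the request and result events, the provider behind `call_provider` can change without touching them, which is the interoperability benefit the decision refers to.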

Consequences:

  • AI development can start without delay, giving a faster time to market (TTM).
  • The request rate must be tracked to avoid exceeding provider rate limits (currently not solved, part of our Known Limitations).
  • Cost is more predictable.
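
Rate tracking is listed above as unsolved. One possible approach, shown only as a sketch and not as the project's solution, is a client-side sliding-window counter. The class name `SlidingWindowRateTracker`, its parameters, and the example limit are hypothetical; actual limits depend on the chosen provider and plan.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowRateTracker:
    """Tracks call timestamps and refuses calls that would exceed the limit.

    Assumption: the provider limit can be expressed as `max_calls` per
    `window_seconds`. The clock is injectable so the logic can be tested.
    """

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.calls: Deque[float] = deque()

    def _evict_expired(self, now: float) -> None:
        # Drop timestamps that have left the sliding window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the current window."""
        now = self.clock()
        self._evict_expired(now)
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

    def seconds_until_free(self) -> float:
        """Return how long to wait before the next call would be allowed."""
        now = self.clock()
        self._evict_expired(now)
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self.calls[0])
```

An LLM client could call `try_acquire()` before each provider request and, when it returns `False`, delay the request event by `seconds_until_free()` rather than sending it and receiving a rate-limit error.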

Strengthened characteristics:

  • Cost (especially due to tendency to become cheaper)
  • Feasibility (less hardware, staff, and talent required; ability to use the latest LLMs)
  • Interoperability (due to the chosen design)
  • Fault-tolerance (due to the chosen design)
  • Availability (cloud providers' SLAs)

Weakened characteristics:

  • Feasibility (privacy regulations need to be checked)
  • Responsiveness (user-facing AIs)
  • Observability (fewer metrics about the external system)
  • Deployability (risk of LLMs reaching end-of-life (EOL); high effort for testing and migrating to new LLMs)