2024-09-26
Accepted
Our interview with the AI expert led to new insights concerning LLMs and surfaced the important assumptions A24, A25, A21, and A20.
The following AI use cases need to be supported:
- A: Resume-Tips
- B: Automatic filling of employee data during registration
- C: Transformation of resume to story
- D: Extraction of features from stories and open role
We discussed three propositions:
- A: Use external AI (LLM) services, as offered for example by Azure, Google, or OpenAI.
- +: Little development cost
- +: Easy adaptation to the latest trends and models.
- +: Tendency to become cheaper
- +: Very fast time to market
- -: Expensive at current usage-based prices
- -: Many API limitations, such as rate limits, latency, quotas, and short model lifecycles (see the interface sketch after this list)
- -: Data-privacy regulations need to be considered, since data is sent to an external provider
- B: Use a self-hosted open-source LLM in a container running in the cloud
- +: Specialized hardware can be used as a service this way.
- +: A lot of control over fine-tuning, rates, and the runtime.
- +: Fewer data-privacy regulations to consider, since the data stays under our control
- -: Any LLM we choose will soon be surpassed in the fast-evolving market (A20)
- -: Requires more development, configuration, maintenance, and know-how, and therefore higher cost
- -: High infrastructure cost due to resource-heavy computing
- C: Use a self-hosted open-source LLM on-premises
- +: Data privacy
- +: Full control over LLM
- -: Specialized hardware must be procured and additional staff hired, delaying the project timeline
- -: High electricity cost
- -: High procurement and additional staff cost
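The short model lifecycles noted for Proposition A (and a possible later provider switch) can be contained behind a single client interface. Below is a minimal sketch under assumed names: `LlmClient`, `OpenAiClient`, and the model string are illustrative and not part of this record; the concrete call uses the current OpenAI Python SDK.

```python
from abc import ABC, abstractmethod


class LlmClient(ABC):
    """Provider-agnostic interface; each concrete adapter wraps one external service."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's completion for the given prompt."""


class OpenAiClient(LlmClient):
    """Adapter for OpenAI's chat completions API; the model name is a placeholder."""

    def __init__(self, model: str = "gpt-4o-mini") -> None:
        from openai import OpenAI  # reads OPENAI_API_KEY from the environment
        self._client = OpenAI()
        self._model = model

    def complete(self, prompt: str) -> str:
        response = self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content or ""
```

Swapping providers (or migrating to a successor model) then only means writing one new adapter; the rest of the system depends on `LlmClient` alone.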
- We choose Proposition A: use external LLM services.
- We need additional fault tolerance and interoperability, so we use an event-driven design for every LLM client in the system (see the event-driven client sketch at the end of this record).
- AI development can start without delay, giving a faster time to market.
- We need to track our request rate so we do not exceed provider rate limits (currently not solved; part of our Known Limitations; see the rate-limiter sketch at the end of this record).
- Cost is more predictable.
- Affected quality attributes:
- +: Cost (especially given the tendency of external services to become cheaper)
- +: Feasibility (less hardware, staff, and talent required; ability to use the latest LLMs)
- +: Interoperability (due to the chosen event-driven design)
- +: Fault tolerance (due to the chosen event-driven design)
- +: Availability (backed by the cloud providers' SLAs)
- -: Feasibility (privacy regulations still need to be checked)
- -: Responsiveness (user-facing AI features depend on the latency of external calls)
- -: Observability (fewer metrics available about the external system)
- -: Deployability (due to the risk of LLMs reaching end-of-life; high effort to test and migrate to new LLMs)
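For reference, a minimal single-process sketch of the event-driven LLM client design chosen above. All names here (`LlmRequest`, the `publish_*` helpers) are illustrative, and the in-memory `queue.Queue` stands in for whatever message broker the system actually uses; the fault tolerance comes from re-enqueueing failed calls instead of failing the user flow.

```python
import queue
from dataclasses import dataclass


@dataclass
class LlmRequest:
    """Event describing one LLM task, e.g. use case C (resume to story)."""
    request_id: str
    prompt: str
    attempts: int = 0


MAX_ATTEMPTS = 3
events: queue.Queue = queue.Queue()  # stand-in for a real message broker


def publish_result(request_id: str, text: str) -> None:
    print(f"result for {request_id}: {text[:60]}")


def publish_failure(request_id: str) -> None:
    print(f"giving up on {request_id} after {MAX_ATTEMPTS} attempts")


def worker(call_llm) -> None:
    """Consume request events until a None sentinel; retry failed LLM calls."""
    while (event := events.get()) is not None:
        try:
            publish_result(event.request_id, call_llm(event.prompt))
        except Exception:
            event.attempts += 1
            if event.attempts < MAX_ATTEMPTS:
                events.put(event)  # re-enqueue: the user flow never blocks on a failure
            else:
                publish_failure(event.request_id)


# Producers (registration, resume tips, ...) only enqueue events:
events.put(LlmRequest("req-1", "Rewrite this resume as a story: ..."))
events.put(None)  # sentinel so the sketch terminates
worker(lambda prompt: "a story")  # any LlmClient.complete can be plugged in here
```

Because producers only emit events and workers only consume them, any client that honors the event contract can be swapped in, which is where the interoperability claimed above comes from.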
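The open rate-tracking limitation could be closed with a client-side token bucket in front of each provider. This is a hedged sketch: the rate and capacity values are placeholders, and the real quota has to come from the provider's contract.

```python
import time


class TokenBucket:
    """Client-side request budget: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; on False, the caller should delay or re-enqueue."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Placeholder numbers: roughly 60 requests per minute, with small bursts allowed.
bucket = TokenBucket(rate=1.0, capacity=10.0)
if not bucket.try_acquire():
    pass  # keep the event in the queue and retry after a short delay
```

This composes with the event-driven worker above: a request that cannot acquire a token simply stays in the queue and is retried after a short delay.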