
TACO Technology Selection Info Page


A single place for information shared with the TACO Technology Selection Review Committee. This draws from existing work in this wiki as well as from our working Google Drive documents, available at https://bit.ly/DLSSgdrive.

See our post-work-cycle technology selection review notes here: https://github.com/sul-dlss-labs/taco/wiki/TACO-Technology-Selection-Post-WC-Review

Info about the project & prototype

See our 2-pager or our TACO Prototype Info Doc (written by the SDR3 Design Working Group in Phase 1).

Note that this is a PROTOTYPE within the SDR3 Design Process, a process that already has feedback sessions, larger group review, and transparency built in. This prototype work cycle is part of the SDR3 Roadmap and is deliberately cushioned against the past tendency in DLSS to push prototype work directly into production.

What criteria did we use?

  • Goals
    • API-centric
    • Support for Swagger API & Data MAP spec-driven technology (a minimal spec sketch follows this list)
    • Smaller, less complex components
    • Improved performance over SDR2 (i.e., scalability)
    • High availability
    • Cloud deployability
    • Ease of setup/maintenance
    • Easy for developers to learn/understand
  • Cloud first but Cloud neutral, i.e., all technologies can “gracefully degrade” to other Cloud solutions or local solutions:
    • Deployment -- Docker => Docker containers are portable by design and run anywhere a container runtime is available
    • AWS ECS (Elastic Container Service) => any system or VM that can run Docker containers
    • Swagger 2.0 => the same specification works in either place
    • Go + go-swagger framework => equivalent spec-driven libraries exist in other languages as well; the important artifact is the Swagger 2.0 specification itself
    • AWS DynamoDB => CouchDB or even Postgres
    • AWS S3 => file system
    • Functional Reactive Programming (FRP) backend for processing => same approach in either place
    • AWS Kinesis => Kafka
  • Ease of integration with the rest of the systems
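
To make “spec-driven” concrete: with Swagger 2.0 the API contract is written first, and server stubs, client code, and documentation are all generated from it, so the same specification travels with us whether we deploy on AWS or elsewhere. The fragment below is a hypothetical, minimal sketch -- the /resource/{id} path and Resource definition are illustrative, not TACO’s actual specification:

```yaml
swagger: "2.0"
info:
  title: TACO Management API (illustrative sketch)
  version: "0.1.0"
paths:
  /resource/{id}:
    get:
      summary: Retrieve a single resource by identifier
      parameters:
        - name: id
          in: path
          required: true
          type: string
      responses:
        "200":
          description: The requested resource
          schema:
            $ref: "#/definitions/Resource"
        "404":
          description: Resource not found
definitions:
  Resource:
    type: object
    properties:
      id:
        type: string
      label:
        type: string
```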

What technologies have we selected?

  • Deployment -- Docker
  • AWS ECS (Elastic Container Service)
  • Swagger 2.0 (REST API Specification)
  • Go (golang) + go-swagger framework
    • Working design meeting notes can be found on our GitHub repo’s wiki.
    • Summary:
      • The overarching guiding principle for deployment is serverless: cloud-focused but also cloud-neutral. This greatly informed the decision to use Docker containers as the deployment architecture (for API code).
      • Efficient Docker container deployment is best achieved with small, executable binaries (as opposed to platforms that require an operating system and server).
      • Among compiled languages, Go is the first choice for small, efficient API containers (see the sketch after this list).
      • As this is a prototype work cycle, our focus is also to enable a polylingual solution -- also achieved through Docker containers.
      • Additional API goals of rapid development and delivery, Swagger API specification, continuous deployment, and parallelization are all well supported in Go.
  • AWS DynamoDB
  • AWS S3
  • Functional Reactive Programming (FRP) backend for processing
  • AWS Kinesis
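
To ground the “small, executable binary” rationale above, here is a minimal, self-contained sketch of a Go HTTP API service. The /v1/resource/ route and resource type are hypothetical illustrations; in the actual prototype, handlers are generated by go-swagger from the Swagger 2.0 specification rather than hand-wired like this:

```go
// A minimal sketch of a Go HTTP API service. Built with CGO_ENABLED=0,
// a program like this compiles to a single static binary that can run in
// a minimal ("scratch") Docker image -- the property that keeps Go
// containers small and fast to deploy.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// resource is a hypothetical, illustrative payload type.
type resource struct {
	ID    string `json:"id"`
	Label string `json:"label"`
}

func main() {
	// In the prototype, routes and handlers like this would be generated
	// by go-swagger from the Swagger 2.0 spec; this hand-wired version
	// only illustrates the shape and size of the resulting service.
	http.HandleFunc("/v1/resource/", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(resource{ID: "example", Label: "An example resource"})
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```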

What other options were considered (& why were they dismissed?)

  • Swagger 2.0
    • Hand-crafted documentation of an API -- harder to keep the documentation in sync with the implementation, and the documentation is never a priority.
    • A Rails application where the specification lives in the controllers/code instead of a separate specification. We have seen in our preservation work that this is not ideal either.
  • Go + go-swagger framework -- server code is generated from the API specification
    • Ruby + Rails -- Go can be an order of magnitude faster than RoR. RoR is more difficult to deploy (it requires a host OS and has runtime dependencies on packages). RoR is not supported by AWS.
      • Ruby on Rails is better suited to frontend work. It is very complex because its main use is creating HTML+JavaScript applications, whereas SDR3 will only be an API.
    • Java + Swagger -- Java is not a developer-friendly language. It effectively requires an IDE and carries a lot of overhead per unit of work.
      • Java is a much less popular language in our domain, a trend worth tracking.
      • Go is more similar to Ruby than Java is.
    • Elixir + Phoenix framework -- functional programming might be hard to learn, and Elixir/Erlang is not supported by AWS.
    • Lambda -- too coupled to AWS, with potentially significant and difficult-to-predict costs (required API Gateway calls are charged in addition to Lambda/bandwidth/storage/etc.).
  • Note on Fedora 4 API vs TACO API
    • We are mostly focused on utility, decoupling from the Linked Data Platform (i.e., the publication of linked data) at this level of our stack.
    • The TACO API aims to be much simpler than the Fedora API, making it easier to consume.
    • Reduced API calls (up to 50% fewer if including ACLs, FileSets & ORE proxy ordering), leading to increased performance.
    • Potential community benefit by producing/demonstrating a modular strategy for the Fedora API instead of a single, non-scalable server/application?
    • This is especially true for our Processing Framework nodes, which will interact with Kafka / Kinesis but will probably be written in Ruby due to library considerations.
  • AWS DynamoDB
    • Cassandra -- more difficult to set up/manage
    • RDS/PostgreSQL -- anecdotally more downtime, higher cost
    • Redis -- more difficult to set up/manage; higher cost to operate
    • Riak -- more difficult to set up/manage; higher cost to operate
    • MongoDB -- more difficult to set up/manage; higher cost to operate
    • CouchDB -- more difficult to set up/manage; higher cost to operate
    • Fedora 4 -- too slow when using complex structural metadata (as SUL has). Doesn’t scale (can’t be distributed). Coupled to a particular data model. LDP is complex. Higher cost to operate.
  • AWS S3
    • EBS -- more difficult to set up/manage; higher cost
  • AWS Kinesis
    • Kafka -- higher cost
    • Faktory -- not mature
    • RabbitMQ -- requires more maintenance, higher cost.
    • ActiveMQ -- high complexity to provide distribution, more setup/maintenance, higher costs.
  • Processing
    • Job queues -- prescriptive and more difficult to change; the dependency is controlled in the core rather than exposed through an API (see the FRP sketch after this list)
  • Containers/deployment
    • Kubernetes -- more complicated, we are less familiar with it, and our DLSS teams have a history of using AWS directly (e.g., Preservation, Hyku / Hybox, DLME, and the Terraform work to support this)
    • ElasticBeanstalk -- tightly coupled to AWS, and custom configuration management is complex and unsustainable (e.g., NGINX libraries for Shibboleth).
    • Serverless / AWS SAM Local -- orchestration issues.
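
The Processing note above contrasts job queues with the FRP approach; the difference is easiest to see in code. The sketch below is a hypothetical, minimal illustration: processing nodes subscribe to a stream of events (Kinesis on AWS, Kafka elsewhere) and react to each record as it arrives, rather than pulling work from a centrally controlled queue. A plain Go channel stands in for the real stream consumer so the example is self-contained:

```go
// A minimal, hypothetical sketch of FRP-style stream processing in Go.
// A real node would consume from AWS Kinesis (or Kafka); here a channel
// stands in for the stream so the example is self-contained.
package main

import "fmt"

// event is an illustrative stand-in for a Kinesis/Kafka record.
type event struct {
	ResourceID string
	Action     string
}

// process reacts to each event as it arrives on the stream and emits
// derived events downstream -- no central queue hands out work.
func process(in <-chan event) <-chan event {
	out := make(chan event)
	go func() {
		defer close(out)
		for e := range in {
			fmt.Printf("processing %s for resource %s\n", e.Action, e.ResourceID)
			out <- event{ResourceID: e.ResourceID, Action: e.Action + ".done"}
		}
	}()
	return out
}

func main() {
	stream := make(chan event)
	done := process(stream)

	go func() {
		stream <- event{ResourceID: "abc123", Action: "deposit"}
		close(stream)
	}()

	for e := range done {
		fmt.Printf("downstream saw %s\n", e.Action)
	}
}
```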

What effects do we see this having on the rest of DLSS?

  • Increased flexibility. By using an API-centric design philosophy, components can be replaced piecemeal rather than requiring epic migrations (e.g. dor-services).
  • Selecting Go as the implementation language for the Management API will not require any other component in DLSS to use Go. The factor that will make many components need to be rewritten is the decision to move off of Fedora 3.
  • Developers are able to use skills beyond Ruby -- we’ve hired a number of DLSS developers whose primary skills are not in Ruby (the language, naturally, is not a great fit for every project or need). Being intelligently polylingual as a department lets our developers make better use of their own skills and encourages learning new technologies (like AI). It also positions DLSS for better cloud deployment options, where Ruby support is often lacking.
  • APIs allow new consumers to be created organically, with goals we may not yet have considered, such as analysis using machine learning or artificial intelligence.
  • Moving to the cloud. Greater focus on managing services and less focus on managing machines.
  • Improved productivity. Our “customers” will see their objects move through the system faster.
  • This is not a final list, only the list for this work cycle. It is part of the SDR3 Design process, which has built-in reviews + feedback mechanisms.