diff --git a/.circleci/config.yml b/.circleci/config.yml index e8886cd9b5..23bd8011e4 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -87,7 +87,7 @@ jobs: command: | mkdir -p .circleci && cd .circleci fetched=false - for i in $(seq 1 6); do + for i in $(seq 1 5); do echo "" res=$(curl -fsS https://api.github.com/repos/arangodb/docs-hugo/contents/.circleci?ref=$CIRCLE_SHA1) || curlStatus=$? if [[ -z "${curlStatus:-}" ]]; then @@ -103,7 +103,7 @@ jobs: fi unset curlStatus unset jqStatus - sleep 10 + sleep 60 done if [[ "$fetched" = false ]]; then echo "Failed to fetch download URLs" diff --git a/.circleci/generate_config.py b/.circleci/generate_config.py index 4d12e8194d..8548703ae6 100644 --- a/.circleci/generate_config.py +++ b/.circleci/generate_config.py @@ -18,7 +18,7 @@ ## Load versions versions = yaml.safe_load(open("versions.yaml", "r")) -versions = sorted(versions, key=lambda d: d['name']) +versions = sorted(versions["/arangodb/"], key=lambda d: d['name']) print(f"Loaded versions {versions}") diff --git a/README.md b/README.md index 0466fbd353..530c4c1c7e 100644 --- a/README.md +++ b/README.md @@ -367,8 +367,8 @@ Inner shortcode Tags let you display badges, usually below a headline. This is mainly used for pointing out if a feature is only available in the -ArangoDB Platform, the ArangoGraph Insights Platform, or both. -See [Environment remarks](#environment-remarks) for details. +GenAI Suite, the Data Platform, the Arango Managed Platform (AMP), or multiple +of them. See [Environment remarks](#environment-remarks) for details. It is also used for [Edition remarks](#edition-remarks) in content before version 3.12.5. @@ -570,7 +570,7 @@ The following shortcodes also exist but are rarely used: - _DB-Server_, not ~~dbserver~~, ~~db-server~~, ~~DBserver~~ (unless it is a code value) - _Coordinator_ (uppercase C) - _Agent_, _Agency_ (uppercase A) - - _ArangoGraph Insights Platform_ and _ArangoGraph_ for short, but not + - _Arango Managed Platform (AMP)_ and _ArangoGraph_ for short, but not ~~Oasis~~, ~~ArangoDB Oasis~~, or ~~ArangoDB Cloud~~ - _Deployment mode_ (single server, cluster, etc.), not ~~deployment type~~ @@ -586,7 +586,7 @@ For external links, use standard Markdown. Clicking these links automatically opens them in a new tab: ```markdown -[ArangoGraph Insights Platform](https://dashboard.arangodb.cloud) +[Arango Managed Platform (AMP)](https://dashboard.arangodb.cloud) ``` For internal links, use relative paths to the Markdown files. Always link to @@ -674,25 +674,24 @@ deprecated features in the same manner with `Deprecated in: ...`. ### Environment remarks Pages and sections about features that are only available in certain environments -such as the ArangoDB Platform, the ArangoGraph Insight Platform, or the -ArangoDB Shell should indicate where they are available using the `tag` shortcode. +such as in ArangoDB Shell should indicate where they are available using the +`tag` shortcode. -In the unified Platform and ArangoGraph but not in the Core: +Features exclusive to the Data Platform, GenAI Data Platform, +Arango Managed Platform (AMP), and ArangoDB generally don't need to be tagged +because they are in dedicated parts of the documentation. However, if there are +subsections with different procedures, each can be tagged accordingly. 
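+
+For example, a subsection that applies to more than one environment can list
+multiple tags in a single call (a hypothetical combination of the tag names
+used on this page):
+
+```markdown
+{{< tag "GenAI Data Platform" "AMP" >}}
+```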
-```markdown -{{< tag "ArangoDB Platform" "ArangoGraph" >}} -``` - -In the unified Platform only: +In the GenAI Data Platform only: ```markdown -{{< tag "ArangoDB Platform" >}} +{{< tag "GenAI Data Platform" >}} ``` -In ArangoGraph only: +In the Arango Managed Platform only: ```markdown -{{< tag "ArangoGraph" >}} +{{< tag "AMP" >}} ``` In the ArangoDB Shell but not the server-side JavaScript API: @@ -719,7 +718,7 @@ Enterprise Edition features should indicate that the Enterprise Edition is required using a tag. Use the following include in the general case: ```markdown -{{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}} +{{< tag "ArangoDB Enterprise Edition" "AMP" >}} ``` ### Add lead paragraphs diff --git a/site/config/_default/config.yaml b/site/config/_default/config.yaml index 67c2321b61..25e4930333 100644 --- a/site/config/_default/config.yaml +++ b/site/config/_default/config.yaml @@ -21,12 +21,18 @@ module: # Version folders can be ignored temporarily for faster local builds # of a single version (here: 3.12) -# - excludeFiles: -# - 3.10/* -# - 3.11/* -# - 3.13/* -# source: content -# target: content + - source: content + target: content + excludeFiles: +# - arangodb/3.10/* +# - arangodb/3.11/* +# - arangodb/3.13/* + + - source: content/arangodb/3.12 + target: content/arangodb/stable + + - source: content/arangodb/3.13 + target: content/arangodb/devel markup: highlight: diff --git a/site/content/3.10/_index.md b/site/content/3.10/_index.md deleted file mode 100644 index dee4818818..0000000000 --- a/site/content/3.10/_index.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: Recommended Resources -menuTitle: '3.10' -weight: 0 -layout: default ---- -{{< cloudbanner >}} - -{{< cards >}} - -{{% card title="What is ArangoDB?" link="about-arangodb/" %}} -Get to know graphs, ArangoDB's use cases and features. -{{% /card %}} - -{{% card title="Get started" link="get-started/" %}} -Learn about ArangoDB's core concepts, how to interact with the database system, -and get a server instance up and running. -{{% /card %}} - -{{% card title="ArangoGraph Insights Platform" link="arangograph/" %}} -Try out ArangoDB's fully-managed cloud offering for a faster time to value. -{{% /card %}} - -{{% card title="AQL" link="aql/" %}} -ArangoDB's Query Language AQL lets you use graphs, JSON documents, and search -via a single, composable query language. -{{% /card %}} - -{{% card title="Data Science" link="data-science/" %}} -Discover the graph analytics and machine learning features of ArangoDB. -{{% /card %}} - -{{% card title="Deploy" link="deploy/" %}} -Find the right deployment mode and set up your ArangoDB instance. -{{% /card %}} - -{{% card title="Develop" link="develop/" %}} -See the in-depth feature and API documentation to start developing applications -with ArangoDB as your backend. -{{% /card %}} - -{{< /cards >}} diff --git a/site/content/3.10/about-arangodb/_index.md b/site/content/3.10/about-arangodb/_index.md deleted file mode 100644 index 9b96a70c37..0000000000 --- a/site/content/3.10/about-arangodb/_index.md +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: What is ArangoDB? 
-menuTitle: About ArangoDB -weight: 5 -description: >- - ArangoDB is a scalable graph database system to drive value from connected - data, faster -aliases: - - introduction - - introduction/about-arangodb ---- -![ArangoDB Overview Diagram](../../images/arangodb-overview-diagram.png) - -ArangoDB combines the analytical power of native graphs with an integrated -search engine, JSON support, and a variety of data access patterns via a single, -composable query language. - -ArangoDB is available in an open-source and a commercial [edition](features/_index.md). -You can use it for on-premises deployments, as well as a fully managed -cloud service, the [ArangoGraph Insights Platform](../arangograph/_index.md). - -## What are Graphs? - -Graphs are information networks comprised of nodes and relations. - -![Node - Relation - Node](../../images/data-model-graph-relation-abstract.png) - -A social network is a common example of a graph. People are represented by nodes -and their friendships by relations. - -![Mary - is friend of - John](../../images/data-model-graph-relation-concrete.png) - -Nodes are also called vertices (singular: vertex), and relations are edges that -connect vertices. -A vertex typically represents a specific entity (a person, a book, a sensor -reading, etc.) and an edge defines how one entity relates to another. - -![Mary - bought - Book, is friend of - John](../../images/data-model-graph-relations.png) - -This paradigm of storing data feels natural because it closely matches the -cognitive model of humans. It is an expressive data model that allows you to -represent many problem domains and solve them with semantic queries and graph -analytics. - -## Beyond Graphs - -Not everything is a graph use case. ArangoDB lets you equally work with -structured, semi-structured, and unstructured data in the form of schema-free -JSON objects, without having to connect these objects to form a graph. - -![Person Mary, Book ArangoDB](../../images/data-model-document.png) - - - -Depending on your needs, you may mix graphs and unconnected data. -ArangoDB is designed from the ground up to support multiple data models with a -single, composable query language. - -```aql -FOR book IN Books - FILTER book.title == "ArangoDB" - FOR person IN 2..2 INBOUND book Sales, OUTBOUND People - RETURN person.name -``` - -ArangoDB also comes with an integrated search engine for information retrieval, -such as full-text search with relevance ranking. - -ArangoDB is written in C++ for high performance and built to work at scale, in -the cloud or on-premises. - - diff --git a/site/content/3.10/about-arangodb/features/_index.md b/site/content/3.10/about-arangodb/features/_index.md deleted file mode 100644 index d33b990636..0000000000 --- a/site/content/3.10/about-arangodb/features/_index.md +++ /dev/null @@ -1,127 +0,0 @@ ---- -title: Features and Capabilities -menuTitle: Features -weight: 20 -description: >- - ArangoDB is a graph database with a powerful set of features for data management and analytics, - supported by a rich ecosystem of integrations and drivers -aliases: - - ../introduction/features ---- -## On-premises versus Cloud - -### Fully managed cloud service - -The fully managed multi-cloud -[ArangoGraph Insights Platform](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic) -is the easiest and fastest way to get started. 
It runs the Enterprise Edition -of ArangoDB, lets you deploy clusters with just a few clicks, and is operated -by a dedicated team of ArangoDB engineers day and night. You can choose from a -variety of support plans to meet your needs. - -- Supports many cloud deployment regions across the main cloud providers - (AWS, Azure, GCP) -- High availability featuring multi-region zone clusters, managed backups, - and zero-downtime upgrades -- Integrated monitoring, alerting, and log management -- Highly secure with encryption at transit and at rest -- Includes elastic scalability for all deployment models (OneShard and Sharded clusters) - -To learn more, go to the [ArangoGraph documentation](../../arangograph/_index.md). - -### Self-managed in the cloud - -ArangoDB can be self-deployed on AWS or other cloud platforms, too. However, when -using a self-managed deployment, you take full control of managing the resources -needed to run it in the cloud. This involves tasks such as configuring, -provisioning, and monitoring the system. For more details, see -[self-deploying ArangoDB in the cloud](../../deploy/in-the-cloud.md). - -ArangoDB supports Kubernetes through its official -[Kubernetes Operator](../../deploy/kubernetes.md) that allows you to easily -deploy and manage clusters within a Kubernetes environment. - -### On-premises - -Running ArangoDB on-premises means that ArangoDB is installed locally, on your -organization's computers and servers, and involves managing all the necessary -resources within the organization's environment, rather than using external -services. - -You can install ArangoDB locally by downloading and running the -[official packages](https://arangodb.com/download/) or run it using -[Docker images](../../operations/installation/docker.md). - -You can deploy it on-premises as a -[single server](../../deploy/single-instance/_index.md) -or as a [cluster](../../deploy/cluster/_index.md) -comprised of multiple nodes with synchronous replication and automatic failover -for high availability and resilience. For the highest level of data safety, -you can additionally set up off-site replication for your entire cluster -([Datacenter-to-Datacenter Replication](../../deploy/arangosync/_index.md)). - -ArangoDB also integrates with Kubernetes, offering a -[Kubernetes Operator](../../deploy/kubernetes.md) that lets you deploy in your -Kubernetes cluster. - -## ArangoDB Editions - -### Community Edition - -ArangoDB is freely available in a **Community Edition** under the Apache 2.0 -open-source license. It is a fully-featured version without time or size -restrictions and includes cluster support. - -- Open source under a permissive license -- One database core for all graph, document, key-value, and search needs -- A single composable query language for all data models -- Extensible through microservices with custom REST APIs and user-definable - query functions -- Cluster deployments for high availability and resilience - -See all [Community Edition Features](community-edition.md). - -### Enterprise Edition - -ArangoDB is also available in a commercial version, called the -**Enterprise Edition**. It includes additional features for performance and -security, such as for scaling graphs and managing your data safely. 
- -- Includes all Community Edition features -- Performance options to smartly shard and replicate graphs and datasets for - optimal data locality -- Multi-tenant deployment option for the transactional guarantees and - performance of a single server -- Enhanced data security with on-disk and backup encryption, key rotation, - audit logging, and LDAP authentication -- Incremental backups without downtime and off-site replication - -See all [Enterprise Edition Features](enterprise-edition.md). - -### Differences between the Editions - -| Community Edition | Enterprise Edition | -|-------------------|--------------------| -| Apache 2.0 License | Commercial License | -| Sharding using consistent hashing on the default or custom shard keys | In addition, **smart sharding** for improved data locality | -| Only hash-based graph sharding | **SmartGraphs** to intelligently shard large graph datasets and **EnterpriseGraphs** with an automatic sharding key selection | -| Only regular collection replication without data locality optimizations | **SatelliteCollections** to replicate collections on all cluster nodes and data locality optimizations for queries | -| No optimizations when querying sharded graphs and replicated collections together | **SmartGraphs using SatelliteCollections** to enable more local execution of graph queries | -| Only regular graph replication without local execution optimizations | **SatelliteGraphs** to execute graph traversals locally on a cluster node | -| Collections can be sharded alike but joins do not utilize co-location | **SmartJoins** for co-located joins in a cluster using identically sharded collections | -| Graph traversals without parallel execution | **Parallel execution of traversal queries** with many start vertices | -| Graph traversals always load full documents | **Traversal projections** optimize the data loading of AQL traversal queries if only a few document attributes are accessed | -| Iterative graph processing (Pregel) for single servers | **Pregel graph processing for clusters** and single servers | -| Inverted indexes and Views without support for search highlighting and nested search | **Search highlighting** for getting the substring positions of matches and **nested search** for matching arrays with all the conditions met by a single object | -| Only standard Jaccard index calculation | **Jaccard similarity approximation** with MinHash for entity resolution, such as for finding duplicate records, based on how many common elements they have |{{% comment %}} Experimental feature -| No fastText model support | Classification of text tokens and finding similar tokens using supervised **fastText word embedding models** | -{{% /comment %}} -| Only regular cluster deployments | **OneShard** deployment option to store all collections of a database on a single cluster node, to combine the performance of a single server and ACID semantics with a fault-tolerant cluster setup | -| ACID transactions for multi-document / multi-collection queries on single servers, for single document operations in clusters, and for multi-document queries in clusters for collections with a single shard | In addition, ACID transactions for multi-collection queries using the OneShard feature | -| Always read from leader shards in clusters | Optionally allow dirty reads to **read from followers** to scale reads | -| TLS key and certificate rotation | In addition, **key rotation for JWT secrets** and **server name indication** (SNI) | -| Built-in user management and 
authentication | Additional **LDAP authentication** option | -| Only server logs | **Audit log** of server interactions | -| No on-disk encryption | **Encryption at Rest** with hardware-accelerated on-disk encryption and key rotation | -| Only regular backups | **Datacenter-to-Datacenter Replication** for disaster recovery | -| Only unencrypted backups and basic data masking for backups | **Hot Backups**, **encrypted backups**, and **enhanced data masking** for backups | diff --git a/site/content/3.10/about-arangodb/use-cases.md b/site/content/3.10/about-arangodb/use-cases.md deleted file mode 100644 index fab9e86a90..0000000000 --- a/site/content/3.10/about-arangodb/use-cases.md +++ /dev/null @@ -1,164 +0,0 @@ ---- -title: ArangoDB Use Cases -menuTitle: Use Cases -weight: 15 -description: >- - ArangoDB is a database system with a large solution space because it combines - graphs, documents, key-value, search engine, and machine learning all in one -pageToc: - maxHeadlineLevel: 2 -aliases: - - ../introduction/use-cases ---- -## ArangoDB as a Graph Database - -ArangoDB as a graph database is a great fit for use cases like fraud detection, -knowledge graphs, recommendation engines, identity and access management, -network and IT operations, social media management, traffic management, and many -more. - -### Fraud Detection - -{{< image src="../../images/icon-fraud-detection.png" alt="Fraud Detection icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Uncover illegal activities by discovering difficult-to-detect patterns. -ArangoDB lets you look beyond individual data points in disparate data sources, -allowing you to integrate and harmonize data to analyze activities and -relationships all together, for a broader view of connection patterns, to detect -complex fraudulent behavior such as fraud rings. - -### Recommendation Engine - -{{< image src="../../images/icon-recommendation-engine.png" alt="Recommendation Engine icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Suggest products, services, and information to users based on data relationships. -For example, you can use ArangoDB together with PyTorch Geometric to build a -[movie recommendation system](https://www.arangodb.com/2022/04/integrate-arangodb-with-pytorch-geometric-to-build-recommendation-systems/), -by analyzing the movies users watched and then predicting links between the two -with a graph neural network (GNN). - -### Network Management - -{{< image src="../../images/icon-network-management.png" alt="Network Management icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Reduce downtime by connecting and visualizing network, infrastructure, and code. -Network devices and how they interconnect can naturally be modeled as a graph. -Traversal algorithms let you explore the routes between different nodes, with the -option to stop at subnet boundaries or to take things like the connection -bandwidth into account when path-finding. - -### Customer 360 - -{{< image src="../../images/icon-customer-360.png" alt="Customer 360 icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Gain a complete understanding of your customers by integrating multiple data -sources and code. ArangoDB can act as the platform to merge and consolidate -information in any shape, with the added ability to link related records and to -track data origins using graph features. 
- -### Identity and Access Management - -{{< image src="../../images/icon-identity-management.png" alt="Identity Management icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Increase security and compliance by managing data access based on role and -position. You can map out an organization chart as a graph and use ArangoDB to -determine who is authorized to see which information. Put ArangoDB's graph -capabilities to work to implement access control lists and permission -inheritance. - -### Supply Chain - -{{< image src="../../images/icon-supply-chain.png" alt="Supply Chain icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Speed shipments by monitoring and optimizing the flow of goods through a -supply chain. You can represent your inventory, supplier, and delivery -information as a graph to understand what the possible sources of delays and -disruptions are. - -## ArangoDB as a Document Database - -ArangoDB can be used as the backend for heterogeneous content management, -e-commerce systems, Internet of Things applications, and more generally as a -persistence layer for a broad range of services that benefit from an agile -and scalable data store. - -### Content Management - -{{< image src="../../images/icon-content-management.png" alt="Content management icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Store information of any kind without upfront schema declaration. ArangoDB is -schema-free, storing every data record as a self-contained document, allowing -you to manage heterogeneous content with ease. Build the next (headless) -content management system on top of ArangoDB. - -### E-Commerce Systems - -{{< image src="../../images/icon-e-commerce.png" alt="E-commerce icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -ArangoDB combines data modeling freedom with strong consistency and resilience -features to power online shops and ordering systems. Handle product catalog data -with ease using any combination of free text and structured data, and process -checkouts with the necessary transactional guarantees. - -### Internet of Things - -{{< image src="../../images/icon-internet-of-things.png" alt="Internet of things icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Collect sensor readings and other IoT data in ArangoDB for a single view of -everything. Store all data points in the same system that also lets you run -aggregation queries using sliding windows for efficient data analysis. - -## ArangoDB as a Key-Value Database - -{{< image src="../../images/icon-key-value.png" alt="Key value icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -Key-value stores are the simplest kind of database systems. Each record is -stored as a block of data under a key that uniquely identifies the record. -The data is opaque, which means the system doesn't know anything about the -contained information, it simply stores it and can retrieve it for you via -the identifiers. - -This paradigm is used at the heart of ArangoDB and allows it to scale well, -but without the limitations of a pure key-value store. Every document has a -`_key` attribute, which is either user-provided or automatically generated. -You can create additional indexes and work with subsets of attributes as -needed, requiring the system to be aware of the stored data structures - unlike -pure key-value stores. 
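-
-For example, a key-based lookup can be expressed in AQL as follows (a minimal
-sketch, assuming a collection named `Books` and a user-provided document key):
-
-```aql
-RETURN DOCUMENT("Books/arangodb")
-```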
- -While ArangoDB can store binary data, it is not designed for -binary large objects (BLOBs) and works best with small to medium-sized -JSON objects. - -For more information about how ArangoDB persists data, see -[Storage Engine](../components/arangodb-server/storage-engine.md). - -## ArangoDB as a Search Engine - -{{< image src="../../images/icon-search-engine.png" alt="Search engine icon" style="float: right; padding: 0 20px; margin-bottom: 20px;">}} - -ArangoDB has a natively integrated search engine for a broad range of -information retrieval needs. It is powered by inverted indexes and can index -full-text, GeoJSON, as well as arbitrary JSON data. It supports various -kinds of search patterns (tokens, phrases, wildcard, fuzzy, geo-spatial, etc.) -and it can rank results by relevance and similarity using popular -scoring algorithms. - -It also features natural language processing (NLP) capabilities. -{{% comment %}} Experimental feature -and can classify or find similar terms using word embedding models. -{{% /comment %}} - -For more information about the search engine, see [ArangoSearch](../index-and-search/arangosearch/_index.md). - -## ArangoDB for Machine Learning - -You can use ArangoDB as the foundation for machine learning based on graphs -at enterprise scale. You can use it as a metadata store for model training -parameters, run analytical algorithms in the database, or serve operative -queries using data that you computed. - -ArangoDB integrates well into existing data infrastructures and provides -connectors for popular machine learning frameworks and data processing -ecosystems. - -![Machine Learning Architecture of ArangoDB](../../images/machine-learning-architecture.png) diff --git a/site/content/3.10/aql/examples-and-query-patterns/remove-vertex.md b/site/content/3.10/aql/examples-and-query-patterns/remove-vertex.md deleted file mode 100644 index 60a845ad94..0000000000 --- a/site/content/3.10/aql/examples-and-query-patterns/remove-vertex.md +++ /dev/null @@ -1,81 +0,0 @@ ---- -title: Remove vertices with AQL -menuTitle: Remove vertex -weight: 45 -description: >- - Removing connected edges along with vertex documents directly in AQL is - possible in a limited way ---- -Deleting vertices with associated edges is currently not handled via AQL while -the [graph management interface](../../graphs/general-graphs/management.md#remove-a-vertex) -and the -[REST API for the graph module](../../develop/http-api/graphs/named-graphs.md#remove-a-vertex) -offer a vertex deletion functionality. -However, as shown in this example based on the -[Knows Graph](../../graphs/example-graphs.md#knows-graph), a query for this -use case can be created. - -![Example Graph](../../../images/knows_graph.png) - -When deleting vertex **eve** from the graph, we also want the edges -`eve -> alice` and `eve -> bob` to be removed. -The involved graph and its only edge collection has to be known. In this case it -is the graph **knows_graph** and the edge collection **knows**. 
-
-This query will delete **eve** with its adjacent edges:
-
-```aql
----
-name: GRAPHTRAV_removeVertex1
-description: ''
-dataset: knows_graph
----
-LET edgeKeys = (FOR v, e IN 1..1 ANY 'persons/eve' GRAPH 'knows_graph' RETURN e._key)
-LET r = (FOR key IN edgeKeys REMOVE key IN knows)
-REMOVE 'eve' IN persons
-```
-
-This query executes several actions:
-- use a graph traversal of depth 1 to get the `_key` of **eve's** adjacent edges
-- remove all of these edges from the `knows` collection
-- remove vertex **eve** from the `persons` collection
-
-The following query shows a different design to achieve the same result:
-
-```aql
----
-name: GRAPHTRAV_removeVertex2
-description: ''
-dataset: knows_graph
----
-LET edgeKeys = (FOR v, e IN 1..1 ANY 'persons/eve' GRAPH 'knows_graph'
-  REMOVE e._key IN knows)
-REMOVE 'eve' IN persons
-```
-
-**Note**: The query has to be adjusted to match a graph with multiple vertex/edge collections.
-
-For example, the [City Graph](../../graphs/example-graphs.md#city-graph)
-contains several vertex collections (`germanCity` and `frenchCity`) and several
-edge collections (`french / german / international Highway`).
-
-![Example Graph2](../../../images/cities_graph.png)
-
-To delete the city **Berlin**, all edge collections `french / german / international Highway`
-have to be considered. The **REMOVE** operation has to be applied to all edge
-collections with `OPTIONS { ignoreErrors: true }`. Without this option, the query stops
-whenever it tries to remove a non-existing key from a collection.
-
-```aql
----
-name: GRAPHTRAV_removeVertex3
-description: ''
-dataset: routeplanner
----
-LET edgeKeys = (FOR v, e IN 1..1 ANY 'germanCity/Berlin' GRAPH 'routeplanner' RETURN e._key)
-LET r = (FOR key IN edgeKeys REMOVE key IN internationalHighway
-  OPTIONS { ignoreErrors: true } REMOVE key IN germanHighway
-  OPTIONS { ignoreErrors: true } REMOVE key IN frenchHighway
-  OPTIONS { ignoreErrors: true })
-REMOVE 'Berlin' IN germanCity
-```
diff --git a/site/content/3.10/aql/examples-and-query-patterns/traversals.md b/site/content/3.10/aql/examples-and-query-patterns/traversals.md
deleted file mode 100644
index 3b9452edbc..0000000000
--- a/site/content/3.10/aql/examples-and-query-patterns/traversals.md
+++ /dev/null
@@ -1,118 +0,0 @@
----
-title: Combining AQL Graph Traversals
-menuTitle: Traversals
-weight: 40
-description: >-
-  You can combine graph queries with other AQL features like geo-spatial search
----
-## Finding the start vertex via a geo query
-
-Our first example will locate the start vertex for a graph traversal via [a geo index](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md).
-We use the [City Graph](../../graphs/example-graphs.md#city-graph) and its geo indexes:
-
-![Cities Example Graph](../../../images/cities_graph.png)
-
-```js
----
-name: COMBINING_GRAPH_01_create_graph
-description: ''
----
-var examples = require("@arangodb/graph-examples/example-graph");
-var g = examples.loadGraph("routeplanner");
-~examples.dropGraph("routeplanner");
-```
-
-We search for all German cities in a range of 400 km around the former capital **Bonn**: **Hamburg** and **Cologne**.
-We won't find **Paris** since it's in the `frenchCity` collection.
-
-```aql
----
-name: COMBINING_GRAPH_02_show_geo
-description: ''
-dataset: routeplanner
-bindVars:
-  {
-    "bonn": [7.0998, 50.7340],
-    "radius": 400000
-  }
----
-FOR startCity IN germanCity
-  FILTER GEO_DISTANCE(@bonn, startCity.geometry) < @radius
-  RETURN startCity._key
-```
-
-Let's verify that the geo indexes are actually used:
-
-```aql
----
-name: COMBINING_GRAPH_03_explain_geo
-description: ''
-dataset: routeplanner
-explain: true
-bindVars:
-  {
-    "bonn": [7.0998, 50.7340],
-    "radius": 400000
-  }
----
-FOR startCity IN germanCity
-  FILTER GEO_DISTANCE(@bonn, startCity.geometry) < @radius
-  RETURN startCity._key
-```
-
-And now combine this with a graph traversal:
-
-```aql
----
-name: COMBINING_GRAPH_04_combine
-description: ''
-dataset: routeplanner
-bindVars:
-  {
-    "bonn": [7.0998, 50.7340],
-    "radius": 400000
-  }
----
-FOR startCity IN germanCity
-  FILTER GEO_DISTANCE(@bonn, startCity.geometry) < @radius
-  FOR v, e, p IN 1..1 OUTBOUND startCity
-    GRAPH 'routeplanner'
-    RETURN {startcity: startCity._key, traversedCity: v._key}
-```
-
-The geo index query returns the `startCity` values (**Cologne** and **Hamburg**),
-which we then use as starting points for our graph traversal.
-For simplicity, we only return their direct neighbours. We format the return
-result so we can see from which `startCity` the traversal came.
-
-Alternatively, we could use a `LET` statement with a subquery to group the
-traversals by their `startCity` efficiently:
-
-```aql
----
-name: COMBINING_GRAPH_05_combine_let
-description: ''
-dataset: routeplanner
-bindVars:
-  {
-    "bonn": [7.0998, 50.7340],
-    "radius": 400000
-  }
----
-FOR startCity IN germanCity
-  FILTER GEO_DISTANCE(@bonn, startCity.geometry) < @radius
-  LET oneCity = (
-    FOR v, e, p IN 1..1 OUTBOUND startCity
-      GRAPH 'routeplanner' RETURN v._key
-  )
-  RETURN {startCity: startCity._key, connectedCities: oneCity}
-```
-
-Finally, we clean up again:
-
-```js
----
-name: COMBINING_GRAPH_06_cleanup
-description: ''
----
-~var examples = require("@arangodb/graph-examples/example-graph");
-~var g = examples.loadGraph("routeplanner");
-examples.dropGraph("routeplanner");
-```
diff --git a/site/content/3.10/aql/functions/arangosearch.md b/site/content/3.10/aql/functions/arangosearch.md
deleted file mode 100644
index 0b0107c595..0000000000
--- a/site/content/3.10/aql/functions/arangosearch.md
+++ /dev/null
@@ -1,1371 +0,0 @@
----
-title: ArangoSearch functions in AQL
-menuTitle: ArangoSearch
-weight: 5
-description: >-
-  ArangoSearch offers various AQL functions for search queries to control the search context, for filtering and scoring
-pageToc:
-  maxHeadlineLevel: 3
----
-You can form search expressions by composing ArangoSearch function calls,
-logical operators and comparison operators. This allows you to filter Views
-as well as to utilize inverted indexes to filter collections.
-
-The AQL [`SEARCH` operation](../high-level-operations/search.md) accepts search expressions,
-such as `PHRASE(doc.text, "foo bar", "text_en")`, for querying Views. You can
-combine ArangoSearch filter and context functions as well as operators like
-`AND` and `OR` to form complex search conditions. Similarly, the
-[`FILTER` operation](../high-level-operations/filter.md) accepts such search expressions
-when using [inverted indexes](../../index-and-search/indexing/working-with-indexes/inverted-indexes.md).
-
-Scoring functions allow you to rank matches and to sort results by relevance.
-They are limited to Views.
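-
-For instance, a relevance-ranked search can sort by the `BM25()` score
-(a minimal sketch, assuming a View named `viewName` that indexes a `text`
-attribute with the `text_en` Analyzer):
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(doc.text == "quick", "text_en")
-  SORT BM25(doc) DESC
-  RETURN doc.text
-```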
- -Search highlighting functions let you retrieve the string positions of matches. -They are limited to Views. - -You can use most functions also without an inverted index or a View and the -`SEARCH` keyword, but then they are not accelerated by an index. - -See [Information Retrieval with ArangoSearch](../../index-and-search/arangosearch/_index.md) for an -introduction. - -## Context Functions - -### ANALYZER() - -`ANALYZER(expr, analyzer) → retVal` - -Sets the Analyzer for the given search expression. - -{{< info >}} -The `ANALYZER()` function is only applicable for queries against `arangosearch` Views. - -In queries against `search-alias` Views and inverted indexes, you don't need to -specify Analyzers because every field can be indexed with a single Analyzer only -and they are inferred from the index definition. -{{< /info >}} - -The default Analyzer is `identity` for any search expression that is used for -filtering `arangosearch` Views. This utility function can be used -to wrap a complex expression to set a particular Analyzer. It also sets it for -all the nested functions which require such an argument to avoid repeating the -Analyzer parameter. If an Analyzer argument is passed to a nested function -regardless, then it takes precedence over the Analyzer set via `ANALYZER()`. - -The `TOKENS()` function is an exception. It requires the Analyzer name to be -passed in in all cases even if wrapped in an `ANALYZER()` call, because it is -not an ArangoSearch function but a regular string function which can be used -outside of `SEARCH` operations. - -- **expr** (expression): any valid search expression -- **analyzer** (string): name of an [Analyzer](../../index-and-search/analyzers.md). -- returns **retVal** (any): the expression result that it wraps - -#### Example: Using a custom Analyzer - -Assuming a View definition with an Analyzer whose name and type is `delimiter`: - -```json -{ - "links": { - "coll": { - "analyzers": [ "delimiter" ], - "includeAllFields": true, - } - }, - ... -} -``` - -… with the Analyzer properties `{ "delimiter": "|" }` and an example document -`{ "text": "foo|bar|baz" }` in the collection `coll`, the following query would -return the document: - -```aql -FOR doc IN viewName - SEARCH ANALYZER(doc.text == "bar", "delimiter") - RETURN doc -``` - -The expression `doc.text == "bar"` has to be wrapped by `ANALYZER()` in order -to set the Analyzer to `delimiter`. Otherwise the expression would be evaluated -with the default `identity` Analyzer. `"foo|bar|baz" == "bar"` would not match, -but the View does not even process the indexed fields with the `identity` -Analyzer. 
The following query would also return an empty result because of -the Analyzer mismatch: - -```aql -FOR doc IN viewName - SEARCH doc.text == "foo|bar|baz" - //SEARCH ANALYZER(doc.text == "foo|bar|baz", "identity") - RETURN doc -``` - -#### Example: Setting the Analyzer context with and without `ANALYZER()` - -In below query, the search expression is swapped by `ANALYZER()` to set the -`text_en` Analyzer for both `PHRASE()` functions: - -```aql -FOR doc IN viewName - SEARCH ANALYZER(PHRASE(doc.text, "foo") OR PHRASE(doc.text, "bar"), "text_en") - RETURN doc -``` - -Without the usage of `ANALYZER()`: - -```aql -FOR doc IN viewName - SEARCH PHRASE(doc.text, "foo", "text_en") OR PHRASE(doc.text, "bar", "text_en") - RETURN doc -``` - -#### Example: Analyzer precedence and specifics of the `TOKENS()` function - -In the following example `ANALYZER()` is used to set the Analyzer `text_en`, -but in the second call to `PHRASE()` a different Analyzer is set (`identity`) -which overrules `ANALYZER()`. Therefore, the `text_en` Analyzer is used to find -the phrase *foo* and the `identity` Analyzer to find *bar*: - -```aql -FOR doc IN viewName - SEARCH ANALYZER(PHRASE(doc.text, "foo") OR PHRASE(doc.text, "bar", "identity"), "text_en") - RETURN doc -``` - -Despite the wrapping `ANALYZER()` function, the Analyzer name cannot be -omitted in calls to the `TOKENS()` function. Both occurrences of `text_en` -are required, to set the Analyzer for the expression `doc.text IN ...` and -for the `TOKENS()` function itself. This is because the `TOKENS()` function -is a regular string function that does not take the Analyzer context into -account: - -```aql -FOR doc IN viewName - SEARCH ANALYZER(doc.text IN TOKENS("foo", "text_en"), "text_en") - RETURN doc -``` - -### BOOST() - -`BOOST(expr, boost) → retVal` - -Override boost in the context of a search expression with a specified value, -making it available for scorer functions. By default, the context has a boost -value equal to `1.0`. - -- **expr** (expression): any valid search expression -- **boost** (number): numeric boost value -- returns **retVal** (any): the expression result that it wraps - -#### Example: Boosting a search sub-expression - -```aql -FOR doc IN viewName - SEARCH ANALYZER(BOOST(doc.text == "foo", 2.5) OR doc.text == "bar", "text_en") - LET score = BM25(doc) - SORT score DESC - RETURN { text: doc.text, score } -``` - -Assuming a View with the following documents indexed and processed by the -`text_en` Analyzer: - -```js -{ "text": "foo bar" } -{ "text": "foo" } -{ "text": "bar" } -{ "text": "foo baz" } -{ "text": "baz" } -``` - -… the result of above query would be: - -```json -[ - { - "text": "foo bar", - "score": 2.787301540374756 - }, - { - "text": "foo baz", - "score": 1.6895781755447388 - }, - { - "text": "foo", - "score": 1.525835633277893 - }, - { - "text": "bar", - "score": 0.9913395643234253 - } -] -``` - -## Filter Functions - -### EXISTS() - -{{< info >}} -If you use `arangosearch` Views, the `EXISTS()` function only matches values if -you set the **storeValues** link property to `"id"` in the View definition -(the default is `"none"`). -{{< /info >}} - -#### Testing for attribute presence - -`EXISTS(path)` - -Match documents where the attribute at `path` is present. - -- **path** (attribute path expression): the attribute to test in the document -- returns nothing: the function evaluates to a boolean, but this value cannot be - returned. The function can only be called in a search expression. 
It throws - an error if used outside of a [`SEARCH` operation](../high-level-operations/search.md) or - a `FILTER` operation that uses an inverted index. - -```aql -FOR doc IN viewName - SEARCH EXISTS(doc.text) - RETURN doc -``` - -#### Testing for attribute type - -`EXISTS(path, type)` - -Match documents where the attribute at `path` is present _and_ is of the -specified data type. - -- **path** (attribute path expression): the attribute to test in the document -- **type** (string): data type to test for, can be one of: - - `"null"` - - `"bool"` / `"boolean"` - - `"numeric"` - - `"type"` (matches `null`, `boolean`, and `numeric` values) - - `"string"` - - `"analyzer"` (see below) -- returns nothing: the function evaluates to a boolean, but this value cannot be - returned. The function can only be called in a search expression. It throws - an error if used outside of a [`SEARCH` operation](../high-level-operations/search.md) or - a `FILTER` operation that uses an inverted index. - -```aql -FOR doc IN viewName - SEARCH EXISTS(doc.text, "string") - RETURN doc -``` - -#### Testing for Analyzer index status - -`EXISTS(path, "analyzer", analyzer)` - -Match documents where the attribute at `path` is present _and_ was indexed -by the specified `analyzer`. - -- **path** (attribute path expression): the attribute to test in the document -- **type** (string): string literal `"analyzer"` -- **analyzer** (string, _optional_): name of an [Analyzer](../../index-and-search/analyzers.md). - Uses the Analyzer of a wrapping `ANALYZER()` call if not specified or - defaults to `"identity"` -- returns nothing: the function evaluates to a boolean, but this value cannot be - returned. The function can only be called in a search expression. It throws - an error if used outside of a [`SEARCH` operation](../high-level-operations/search.md) or - a `FILTER` operation that uses an inverted index. - -```aql -FOR doc IN viewName - SEARCH EXISTS(doc.text, "analyzer", "text_en") - RETURN doc -``` - -#### Testing for nested fields - -`EXISTS(path, "nested")` - -Match documents where the attribute at `path` is present _and_ is indexed -as a nested field for [nested search with Views](../../index-and-search/arangosearch/nested-search.md) -or [inverted indexes](../../index-and-search/indexing/working-with-indexes/inverted-indexes.md#nested-search-enterprise-edition). - -- **path** (attribute path expression): the attribute to test in the document -- **type** (string): string literal `"nested"` -- returns nothing: the function evaluates to a boolean, but this value cannot be - returned. The function can only be called in a search expression. It throws - an error if used outside of a [`SEARCH` operation](../high-level-operations/search.md) or - a `FILTER` operation that uses an inverted index. - -**Examples** - -Only return documents from the View `viewName` whose `text` attribute is indexed -as a nested field: - -```aql -FOR doc IN viewName - SEARCH EXISTS(doc.text, "nested") - RETURN doc -``` - -Only return documents whose `attr` attribute and its nested `text` attribute are -indexed as nested fields: - -```aql -FOR doc IN viewName - SEARCH doc.attr[? 
FILTER EXISTS(CURRENT.text, "nested")] - RETURN doc -``` - -Only return documents from the collection `coll` whose `text` attribute is indexed -as a nested field by an inverted index: - -```aql -FOR doc IN coll OPTIONS { indexHint: "inv-idx", forceIndexHint: true } - FILTER EXISTS(doc.text, "nested") - RETURN doc -``` - -Only return documents whose `attr` attribute and its nested `text` attribute are -indexed as nested fields: - -```aql -FOR doc IN coll OPTIONS { indexHint: "inv-idx", forceIndexHint: true } - FILTER doc.attr[? FILTER EXISTS(CURRENT.text, "nested")] - RETURN doc -``` - -### IN_RANGE() - -`IN_RANGE(path, low, high, includeLow, includeHigh) → included` - -Match documents where the attribute at `path` is greater than (or equal to) -`low` and less than (or equal to) `high`. - -You can use `IN_RANGE()` for searching more efficiently compared to an equivalent -expression that combines two comparisons with a logical conjunction: - -- `IN_RANGE(path, low, high, true, true)` instead of `low <= value AND value <= high` -- `IN_RANGE(path, low, high, true, false)` instead of `low <= value AND value < high` -- `IN_RANGE(path, low, high, false, true)` instead of `low < value AND value <= high` -- `IN_RANGE(path, low, high, false, false)` instead of `low < value AND value < high` - -`low` and `high` can be numbers or strings (technically also `null`, `true` -and `false`), but the data type must be the same for both. - -{{< warning >}} -The alphabetical order of characters is not taken into account by ArangoSearch, -i.e. range queries in SEARCH operations against Views will not follow the -language rules as per the defined Analyzer locale (except for the -[`collation` Analyzer](../../index-and-search/analyzers.md#collation)) nor the server language -(startup option `--default-language`)! -Also see [Known Issues](../../release-notes/version-3.10/known-issues-in-3-10.md#arangosearch). -{{< /warning >}} - -There is a corresponding [`IN_RANGE()` Miscellaneous Function](miscellaneous.md#in_range) -that is used outside of `SEARCH` operations. - -- **path** (attribute path expression): - the path of the attribute to test in the document -- **low** (number\|string): minimum value of the desired range -- **high** (number\|string): maximum value of the desired range -- **includeLow** (bool): whether the minimum value shall be included in - the range (left-closed interval) or not (left-open interval) -- **includeHigh** (bool): whether the maximum value shall be included in - the range (right-closed interval) or not (right-open interval) -- returns **included** (bool): whether `value` is in the range - -If `low` and `high` are the same, but `includeLow` and/or `includeHigh` is set -to `false`, then nothing will match. If `low` is greater than `high` nothing will -match either. - -#### Example: Using numeric ranges - -To match documents with the attribute `value >= 3` and `value <= 5` using the -default `"identity"` Analyzer you would write the following query: - -```aql -FOR doc IN viewName - SEARCH IN_RANGE(doc.value, 3, 5, true, true) - RETURN doc.value -``` - -This will also match documents which have an array of numbers as `value` -attribute where at least one of the numbers is in the specified boundaries. 
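-
-For instance, both of the following documents would match the query above
-because at least one number lies within the bounds (illustrative documents,
-not part of an actual dataset):
-
-```js
-{ "value": 4 }
-{ "value": [ 1, 9, 4 ] }
-```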
- -#### Example: Using string ranges - -Using string boundaries and a text Analyzer allows to match documents which -have at least one token within the specified character range: - -```aql -FOR doc IN valView - SEARCH ANALYZER(IN_RANGE(doc.value, "a","f", true, false), "text_en") - RETURN doc -``` - -This will match `{ "value": "bar" }` and `{ "value": "foo bar" }` because the -_b_ of _bar_ is in the range (`"a" <= "b" < "f"`), but not `{ "value": "foo" }` -because the _f_ of _foo_ is excluded (`high` is "f" but `includeHigh` is false). - -### MIN_MATCH() - -`MIN_MATCH(expr1, ... exprN, minMatchCount) → fulfilled` - -Match documents where at least `minMatchCount` of the specified -search expressions are satisfied. - -There is a corresponding [`MIN_MATCH()` Miscellaneous function](miscellaneous.md#min_match) -that is used outside of `SEARCH` operations. - -- **expr** (expression, _repeatable_): any valid search expression -- **minMatchCount** (number): minimum number of search expressions that should - be satisfied -- returns **fulfilled** (bool): whether at least `minMatchCount` of the - specified expressions are `true` - -#### Example: Matching a subset of search sub-expressions - -Assuming a View with a text Analyzer, you may use it to match documents where -the attribute contains at least two out of three tokens: - -```aql -LET t = TOKENS("quick brown fox", "text_en") -FOR doc IN viewName - SEARCH ANALYZER(MIN_MATCH(doc.text == t[0], doc.text == t[1], doc.text == t[2], 2), "text_en") - RETURN doc.text -``` - -This will match `{ "text": "the quick brown fox" }` and `{ "text": "some brown fox" }`, -but not `{ "text": "snow fox" }` which only fulfills one of the conditions. - -Note that you can also use the `AT LEAST` [array comparison operator](../high-level-operations/search.md#array-comparison-operators) -in the specific case of matching a subset of tokens against a single attribute: - -```aql -FOR doc IN viewName - SEARCH ANALYZER(TOKENS("quick brown fox", "text_en") AT LEAST (2) == doc.text, "text_en") - RETURN doc.text -``` - -### MINHASH_MATCH() - -`MINHASH_MATCH(path, target, threshold, analyzer) → fulfilled` - -Match documents with an approximate Jaccard similarity of at least the -`threshold`, approximated with the specified `minhash` Analyzer. - -To only compute the MinHash signatures, see the -[`MINHASH()` Miscellaneous function](miscellaneous.md#minhash). - -- **path** (attribute path expression\|string): the path of the attribute in - a document or a string -- **target** (string): the string to hash with the specified Analyzer and to - compare against the stored attribute -- **threshold** (number, _optional_): a value between `0.0` and `1.0`. -- **analyzer** (string): the name of a [`minhash` Analyzer](../../index-and-search/analyzers.md#minhash). 
-- returns **fulfilled** (bool): `true` if the approximate Jaccard similarity - is greater than or equal to the specified threshold, `false` otherwise - -#### Example: Find documents with a text similar to a target text - -Assuming a View with a `minhash` Analyzer, you can use the stored -MinHash signature to find candidates for the more expensive Jaccard similarity -calculation: - -```aql -LET target = "the quick brown fox jumps over the lazy dog" -LET targetSignature = TOKENS(target, "myMinHash") - -FOR doc IN viewName - SEARCH MINHASH_MATCH(doc.text, target, 0.5, "myMinHash") // approximation - LET jaccard = JACCARD(targetSignature, TOKENS(doc.text, "myMinHash")) - FILTER jaccard > 0.75 - SORT jaccard DESC - RETURN doc.text -``` - -### NGRAM_MATCH() - -Introduced in: v3.7.0 - -`NGRAM_MATCH(path, target, threshold, analyzer) → fulfilled` - -Match documents whose attribute value has an -[_n_-gram similarity](https://webdocs.cs.ualberta.ca/~kondrak/papers/spire05.pdf) -higher than the specified threshold compared to the target value. - -The similarity is calculated by counting how long the longest sequence of -matching _n_-grams is, divided by the target's total _n_-gram count. -Only fully matching _n_-grams are counted. - -The _n_-grams for both attribute and target are produced by the specified -Analyzer. Increasing the _n_-gram length will increase accuracy, but reduce -error tolerance. In most cases a size of 2 or 3 will be a good choice. - -Also see the String Functions -[`NGRAM_POSITIONAL_SIMILARITY()`](string.md#ngram_positional_similarity) -and [`NGRAM_SIMILARITY()`](string.md#ngram_similarity) -for calculating _n_-gram similarity that cannot be accelerated by a View index. - -- **path** (attribute path expression\|string): the path of the attribute in - a document or a string -- **target** (string): the string to compare against the stored attribute -- **threshold** (number, _optional_): a value between `0.0` and `1.0`. Defaults - to `0.7` if none is specified. -- **analyzer** (string): the name of an [Analyzer](../../index-and-search/analyzers.md). -- returns **fulfilled** (bool): `true` if the evaluated _n_-gram similarity value - is greater than or equal to the specified threshold, `false` otherwise - -{{< info >}} -Use an Analyzer of type `ngram` with `preserveOriginal: false` and `min` equal -to `max`. Otherwise, the similarity score calculated internally will be lower -than expected. - -The Analyzer must have the `"position"` and `"frequency"` features enabled or -the `NGRAM_MATCH()` function will not find anything. 
-{{< /info >}} - -#### Example: Using a custom bigram Analyzer - -Given a View indexing an attribute `text`, a custom _n_-gram Analyzer `"bigram"` -(`min: 2, max: 2, preserveOriginal: false, streamType: "utf8"`) and a document -`{ "text": "quick red fox" }`, the following query would match it (with a -threshold of `1.0`): - -```aql -FOR doc IN viewName - SEARCH NGRAM_MATCH(doc.text, "quick fox", "bigram") - RETURN doc.text -``` - -The following will also match (note the low threshold value): - -```aql -FOR doc IN viewName - SEARCH NGRAM_MATCH(doc.text, "quick blue fox", 0.4, "bigram") - RETURN doc.text -``` - -The following will not match (note the high threshold value): - -```aql -FOR doc IN viewName - SEARCH NGRAM_MATCH(doc.text, "quick blue fox", 0.9, "bigram") - RETURN doc.text -``` - -#### Example: Using constant values - -`NGRAM_MATCH()` can be called with constant arguments, but for such calls the -`analyzer` argument is mandatory (even for calls inside of a `SEARCH` clause): - -```aql -FOR doc IN viewName - SEARCH NGRAM_MATCH("quick fox", "quick blue fox", 0.9, "bigram") - RETURN doc.text -``` - -```aql -RETURN NGRAM_MATCH("quick fox", "quick blue fox", "bigram") -``` - -### PHRASE() - -`PHRASE(path, phrasePart, analyzer)` - -`PHRASE(path, phrasePart1, skipTokens1, ... phrasePartN, skipTokensN, analyzer)` - -`PHRASE(path, [ phrasePart1, skipTokens1, ... phrasePartN, skipTokensN ], analyzer)` - -Search for a phrase in the referenced attribute. It only matches documents in -which the tokens appear in the specified order. To search for tokens in any -order use [`TOKENS()`](string.md#tokens) instead. - -The phrase can be expressed as an arbitrary number of `phraseParts` separated by -*skipTokens* number of tokens (wildcards), either as separate arguments or as -array as second argument. - -- **path** (attribute path expression): the attribute to test in the document -- **phrasePart** (string\|array\|object): text to search for in the tokens. - Can also be an [array](#example-using-phrase-with-an-array-of-tokens) - comprised of string, array and [object tokens](#object-tokens), or tokens - interleaved with numbers of `skipTokens`. The specified `analyzer` is applied - to string and array tokens, but not for object tokens. -- **skipTokens** (number, _optional_): amount of tokens to treat - as wildcards -- **analyzer** (string, _optional_): name of an [Analyzer](../../index-and-search/analyzers.md). - Uses the Analyzer of a wrapping `ANALYZER()` call if not specified or - defaults to `"identity"` -- returns nothing: the function evaluates to a boolean, but this value cannot be - returned. The function can only be called in a search expression. It throws - an error if used outside of a [`SEARCH` operation](../high-level-operations/search.md) or - a `FILTER` operation that uses an inverted index. - -{{< info >}} -The selected Analyzer must have the `"position"` and `"frequency"` features -enabled. The `PHRASE()` function will otherwise not find anything. -{{< /info >}} - -#### Object tokens - -Introduced in v3.7.0 - -- `{IN_RANGE: [low, high, includeLow, includeHigh]}`: - see [IN_RANGE()](#in_range). *low* and *high* can only be strings. 
-- `{LEVENSHTEIN_MATCH: [token, maxDistance, transpositions, maxTerms, prefix]}`: - - `token` (string): a string to search - - `maxDistance` (number): maximum Levenshtein / Damerau-Levenshtein distance - - `transpositions` (bool, _optional_): if set to `false`, a Levenshtein - distance is computed, otherwise a Damerau-Levenshtein distance (default) - - `maxTerms` (number, _optional_): consider only a specified number of the - most relevant terms. One can pass `0` to consider all matched terms, but it may - impact performance negatively. The default value is `64`. - - `prefix` (string, _optional_): if defined, then a search for the exact - prefix is carried out, using the matches as candidates. The Levenshtein / - Damerau-Levenshtein distance is then computed for each candidate using the - remainders of the strings. This option can improve performance in cases where - there is a known common prefix. The default value is an empty string - (introduced in v3.7.13, v3.8.1). -- `{STARTS_WITH: [prefix]}`: see [STARTS_WITH()](#starts_with). - Array brackets are optional -- `{TERM: [token]}`: equal to `token` but without Analyzer tokenization. - Array brackets are optional -- `{TERMS: [token1, ..., tokenN]}`: one of `token1, ..., tokenN` can be found - in specified position. Inside an array the object syntax can be replaced with - the object field value, e.g., `[..., [token1, ..., tokenN], ...]`. -- `{WILDCARD: [token]}`: see [LIKE()](#like). - Array brackets are optional - -An array token inside an array can be used in the `TERMS` case only. - -Also see [Example: Using object tokens](#example-using-object-tokens). - -#### Example: Using a text Analyzer for a phrase search - -Given a View indexing an attribute `text` with the `"text_en"` Analyzer and a -document `{ "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit" }`, -the following query would match it: - -```aql -FOR doc IN viewName - SEARCH PHRASE(doc.text, "lorem ipsum", "text_en") - RETURN doc.text -``` - -However, this search expression does not because the tokens `"ipsum"` and -`"lorem"` do not appear in this order: - -```aql -PHRASE(doc.text, "ipsum lorem", "text_en") -``` - -#### Example: Skip tokens for a proximity search - -To match `"ipsum"` and `"amet"` with any two tokens in between, you can use the -following search expression: - -```aql -PHRASE(doc.text, "ipsum", 2, "amet", "text_en") -``` - -The `skipTokens` value of `2` defines how many wildcard tokens have to appear -between *ipsum* and *amet*. A `skipTokens` value of `0` means that the tokens -must be adjacent. Negative values are allowed, but not very useful. These three -search expressions are equivalent: - -```aql -PHRASE(doc.text, "lorem ipsum", "text_en") -PHRASE(doc.text, "lorem", 0, "ipsum", "text_en") -PHRASE(doc.text, "ipsum", -1, "lorem", "text_en") -``` - -#### Example: Using `PHRASE()` with an array of tokens - -The `PHRASE()` function also accepts an array as second argument with -`phrasePart` and `skipTokens` parameters as elements. 
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title, ["quick brown fox"], "text_en") RETURN doc
-FOR doc IN myView SEARCH PHRASE(doc.title, ["quick", "brown", "fox"], "text_en") RETURN doc
-```
-
-This syntax variation enables the use of computed expressions:
-
-```aql
-LET proximityCondition = [ "foo", ROUND(RAND()*10), "bar" ]
-FOR doc IN viewName
-  SEARCH PHRASE(doc.text, proximityCondition, "text_en")
-  RETURN doc
-```
-
-```aql
-LET tokens = TOKENS("quick brown fox", "text_en") // ["quick", "brown", "fox"]
-FOR doc IN myView SEARCH PHRASE(doc.title, tokens, "text_en") RETURN doc
-```
-
-The above example is equivalent to the more cumbersome and static form:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title, "quick", 0, "brown", 0, "fox", "text_en") RETURN doc
-```
-
-You can optionally specify the number of `skipTokens` in the array form before
-every string element:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title, ["quick", 1, "fox", "jumps"], "text_en") RETURN doc
-```
-
-It is the same as the following:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title, "quick", 1, "fox", 0, "jumps", "text_en") RETURN doc
-```
-
-#### Example: Handling of arrays with no members
-
-Empty arrays are skipped:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title, "quick", 1, [], 1, "jumps", "text_en") RETURN doc
-```
-
-The query is equivalent to:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title, "quick", 2, "jumps", "text_en") RETURN doc
-```
-
-Providing only empty arrays is valid, but will yield no results.
-
-#### Example: Using object tokens
-
-Using object tokens `STARTS_WITH`, `WILDCARD`, `LEVENSHTEIN_MATCH`, `TERMS` and
-`IN_RANGE`:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title,
-  {STARTS_WITH: ["qui"]}, 0,
-  {WILDCARD: ["b%o_n"]}, 0,
-  {LEVENSHTEIN_MATCH: ["foks", 2]}, 0,
-  {TERMS: ["jump", "run"]}, 0, // Analyzer not applied!
-  {IN_RANGE: ["over", "through", true, false]},
-  "text_en") RETURN doc
-```
-
-Note that the `text_en` Analyzer has stemming enabled, but for object tokens
-the Analyzer isn't applied. `{TERMS: ["jumps", "runs"]}` would not match the
-indexed (and stemmed!) attribute value. Therefore, the trailing `s` which would
-be stemmed away is removed from both words manually in the example.
-
-The above example is equivalent to:
-
-```aql
-FOR doc IN myView SEARCH PHRASE(doc.title,
-[
-  {STARTS_WITH: "qui"}, 0,
-  {WILDCARD: "b%o_n"}, 0,
-  {LEVENSHTEIN_MATCH: ["foks", 2]}, 0,
-  ["jumps", "runs"], 0, // Analyzer is applied using this syntax
-  {IN_RANGE: ["over", "through", true, false]}
-], "text_en") RETURN doc
-```
-
-### STARTS_WITH()
-
-`STARTS_WITH(path, prefix) → startsWith`
-
-Match the value of the attribute that starts with `prefix`. If the attribute
-is processed by a tokenizing Analyzer (type `"text"` or `"delimiter"`) or if it
-is an array, then a single token/element starting with the prefix is sufficient
-to match the document.
-
-{{< warning >}}
-The alphabetical order of characters is not taken into account by ArangoSearch,
-i.e. range queries in SEARCH operations against Views will not follow the
-language rules as per the defined Analyzer locale (except for the
-[`collation` Analyzer](../../index-and-search/analyzers.md#collation)) nor the server language
-(startup option `--default-language`)!
-Also see [Known Issues](../../release-notes/version-3.10/known-issues-in-3-10.md#arangosearch).
-{{< /warning >}}
-
-There is a corresponding [`STARTS_WITH()` String function](string.md#starts_with)
-that is used outside of `SEARCH` operations.
-
-- **path** (attribute path expression): the path of the attribute to compare
-  against in the document
-- **prefix** (string): a string to search at the start of the text
-- returns **startsWith** (bool): whether the specified attribute starts with
-  the given prefix
-
----
-
-`STARTS_WITH(path, prefixes, minMatchCount) → startsWith`
-
-Introduced in: v3.7.1
-
-Match the value of the attribute that starts with one of the `prefixes`, or
-optionally with at least `minMatchCount` of the prefixes.
-
-- **path** (attribute path expression): the path of the attribute to compare
-  against in the document
-- **prefixes** (array): an array of strings to search at the start of the text
-- **minMatchCount** (number, _optional_): minimum number of search prefixes
-  that should be satisfied (see
-  [example](#example-searching-for-one-or-multiple-prefixes)). The default is `1`
-- returns **startsWith** (bool): whether the specified attribute starts with at
-  least `minMatchCount` of the given prefixes
-
-#### Example: Searching for an exact value prefix
-
-To match a document `{ "text": "lorem ipsum..." }` using a prefix and the
-`"identity"` Analyzer, you can use it like this:
-
-```aql
-FOR doc IN viewName
-  SEARCH STARTS_WITH(doc.text, "lorem ip")
-  RETURN doc
-```
-
-#### Example: Searching for a prefix in text
-
-This query will match `{ "text": "lorem ipsum" }` as well as
-`{ "text": [ "lorem", "ipsum" ] }` given a View which indexes the `text`
-attribute and processes it with the `"text_en"` Analyzer:
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(STARTS_WITH(doc.text, "ips"), "text_en")
-  RETURN doc.text
-```
-
-Note that it will not match `{ "text": "IPS (in-plane switching)" }` without
-modification to the query. The prefixes were passed to `STARTS_WITH()` as-is,
-but the built-in `text_en` Analyzer used for indexing has stemming enabled.
-So the indexed values are the following:
-
-```aql
-RETURN TOKENS("IPS (in-plane switching)", "text_en")
-```
-
-```json
-[
-  [
-    "ip",
-    "in",
-    "plane",
-    "switch"
-  ]
-]
-```
-
-The *s* is removed from *ips*, which leads to the prefix *ips* not matching
-the indexed token *ip*. You may either create a custom text Analyzer with
-stemming disabled to avoid this issue, or apply stemming to the prefixes:
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(STARTS_WITH(doc.text, TOKENS("ips", "text_en")), "text_en")
-  RETURN doc.text
-```
-
-#### Example: Searching for one or multiple prefixes
-
-The `STARTS_WITH()` function accepts an array of prefix alternatives of which
-only one has to match:
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(STARTS_WITH(doc.text, ["something", "ips"]), "text_en")
-  RETURN doc.text
-```
-
-It will match a document `{ "text": "lorem ipsum" }` but also
-`{ "text": "that is something" }`, as at least one of the words starts with a
-given prefix.
-
-The same query again, but with an explicit `minMatchCount`:
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(STARTS_WITH(doc.text, ["wrong", "ips"], 1), "text_en")
-  RETURN doc.text
-```
-
-The number can be increased to require that at least this many prefixes must
-be present:
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(STARTS_WITH(doc.text, ["lo", "ips", "something"], 2), "text_en")
-  RETURN doc.text
-```
-
-This will still match `{ "text": "lorem ipsum" }` because at least two prefixes
-(`lo` and `ips`) are found, but not `{ "text": "that is something" }` which only
-contains one of the prefixes (`something`).
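-
-If the prefix alternatives come from user input that is processed by a
-stemming Analyzer, you can combine this with the `TOKENS()` approach shown
-earlier. A minimal sketch, assuming the same View setup as in the previous
-examples and a hypothetical input string; `TOKENS()` stems each word, and
-passing `LENGTH(prefixes)` as `minMatchCount` requires all of them to match:
-
-```aql
-LET prefixes = TOKENS("lorem something", "text_en") // e.g. ["lorem", "someth"]
-FOR doc IN viewName
-  SEARCH ANALYZER(STARTS_WITH(doc.text, prefixes, LENGTH(prefixes)), "text_en")
-  RETURN doc.text
-```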
-
-### LEVENSHTEIN_MATCH()
-
-Introduced in: v3.7.0
-
-`LEVENSHTEIN_MATCH(path, target, distance, transpositions, maxTerms, prefix) → fulfilled`
-
-Match documents with a [Damerau-Levenshtein distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance)
-lower than or equal to `distance` between the stored attribute value and
-`target`. It can optionally match documents using a pure Levenshtein distance.
-
-See [LEVENSHTEIN_DISTANCE()](string.md#levenshtein_distance)
-if you want to calculate the edit distance of two strings.
-
-- **path** (attribute path expression\|string): the path of the attribute to
-  compare against in the document or a string
-- **target** (string): the string to compare against the stored attribute
-- **distance** (number): the maximum edit distance, which can be between
-  `0` and `4` if `transpositions` is `false`, and between `0` and `3` if
-  it is `true`
-- **transpositions** (bool, _optional_): if set to `false`, a Levenshtein
-  distance is computed, otherwise a Damerau-Levenshtein distance (default)
-- **maxTerms** (number, _optional_): consider only a specified number of the
-  most relevant terms. One can pass `0` to consider all matched terms, but it may
-  impact performance negatively. The default value is `64`.
-- **prefix** (string, _optional_): if defined, then a search for the exact
-  prefix is carried out, using the matches as candidates. The Levenshtein /
-  Damerau-Levenshtein distance is then computed for each candidate using
-  the `target` value and the remainders of the strings, which means that the
-  **prefix needs to be removed from `target`** (see
-  [example](#example-matching-with-prefix-search)). This option can improve
-  performance in cases where there is a known common prefix. The default value
-  is an empty string (introduced in v3.7.13, v3.8.1).
-- returns **fulfilled** (bool): `true` if the calculated distance is less than
-  or equal to *distance*, `false` otherwise
-
-#### Example: Matching with and without transpositions
-
-The Levenshtein distance between _quick_ and _quikc_ is `2` because it requires
-two operations to go from one to the other (remove _k_, insert _k_ at a
-different position).
-
-```aql
-FOR doc IN viewName
-  SEARCH LEVENSHTEIN_MATCH(doc.text, "quikc", 2, false) // matches "quick"
-  RETURN doc.text
-```
-
-The Damerau-Levenshtein distance is `1` (move _k_ to the end).
-
-```aql
-FOR doc IN viewName
-  SEARCH LEVENSHTEIN_MATCH(doc.text, "quikc", 1) // matches "quick"
-  RETURN doc.text
-```
-
-#### Example: Matching with prefix search
-
-Match documents with a Levenshtein distance of 1 with the prefix `qui`. The edit
-distance is calculated using the search term `kc` (`quikc` with the prefix `qui`
-removed) and the stored value without the prefix (e.g. `ck`). The prefix `qui`
-is constant.
-
-```aql
-FOR doc IN viewName
-  SEARCH LEVENSHTEIN_MATCH(doc.text, "kc", 1, false, 64, "qui") // matches "quick"
-  RETURN doc.text
-```
-
-You may compute the prefix and suffix from the input string as follows:
-
-```aql
-LET input = "quikc"
-LET prefixSize = 3
-LET prefix = LEFT(input, prefixSize)
-LET suffix = SUBSTRING(input, prefixSize)
-FOR doc IN viewName
-  SEARCH LEVENSHTEIN_MATCH(doc.text, suffix, 1, false, 64, prefix) // matches "quick"
-  RETURN doc.text
-```
-
-#### Example: Basing the edit distance on string length
-
-You may want to pick the maximum edit distance based on string length.
-If the stored attribute is the string _quick_ and the target string is
-_quicksands_, then the Levenshtein distance is 5, with 50% of the
-characters mismatching. If the inputs are _q_ and _qu_, then the distance
-is only 1, although it is also a 50% mismatch.
-
-```aql
-LET target = "input"
-LET targetLength = LENGTH(target)
-LET maxDistance = (targetLength > 5 ? 2 : (targetLength >= 3 ? 1 : 0))
-FOR doc IN viewName
-  SEARCH LEVENSHTEIN_MATCH(doc.text, target, maxDistance, true)
-  RETURN doc.text
-```
-
-### LIKE()
-
-Introduced in: v3.7.2
-
-`LIKE(path, search) → bool`
-
-Check whether the pattern `search` is contained in the attribute denoted by `path`,
-using wildcard matching.
-
-- `_`: A single arbitrary character
-- `%`: Zero, one or many arbitrary characters
-- `\\_`: A literal underscore
-- `\\%`: A literal percent sign
-
-{{< info >}}
-Literal backslashes require different amounts of escaping depending on the
-context:
-- `\` in bind variables (_Table_ view mode) in the web interface (automatically
-  escaped to `\\` unless the value is wrapped in double quotes and already
-  escaped properly)
-- `\\` in bind variables (_JSON_ view mode) and queries in the web interface
-- `\\` in bind variables in arangosh
-- `\\\\` in queries in arangosh
-- Double the amount compared to arangosh in shells that use backslashes for
-  escaping (`\\\\` in bind variables and `\\\\\\\\` in queries)
-{{< /info >}}
-
-Searching with the `LIKE()` function in the context of a `SEARCH` operation
-is backed by View indexes. The [String `LIKE()` function](string.md#like),
-on the other hand, is used in other contexts such as `FILTER` operations and
-cannot be accelerated by any sort of index. Another difference is that
-the ArangoSearch variant does not accept a third argument to enable
-case-insensitive matching. This can be controlled with Analyzers instead.
-
-- **path** (attribute path expression): the path of the attribute to compare
-  against in the document
-- **search** (string): a search pattern that can contain the wildcard characters
-  `%` (meaning any sequence of characters, including none) and `_` (any single
-  character). Literal `%` and `_` must be escaped with backslashes.
-- returns **bool** (bool): `true` if the pattern is contained in the attribute
-  denoted by `path`, and `false` otherwise
-
-#### Example: Searching with wildcards
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(LIKE(doc.text, "foo%b_r"), "text_en")
-  RETURN doc.text
-```
-
-`LIKE` can also be used in operator form:
-
-```aql
-FOR doc IN viewName
-  SEARCH ANALYZER(doc.text LIKE "foo%b_r", "text_en")
-  RETURN doc.text
-```
-
-## Geo functions
-
-The following functions can be accelerated by View indexes. There are
-corresponding [Geo Functions](geo.md) for the regular geo index
-type, but also general purpose functions such as GeoJSON constructors that can
-be used in conjunction with ArangoSearch.
-
-### GEO_CONTAINS()
-
-Introduced in: v3.8.0
-
-`GEO_CONTAINS(geoJsonA, geoJsonB) → bool`
-
-Checks whether the [GeoJSON object](geo.md#geojson) `geoJsonA`
-fully contains `geoJsonB` (every point in B is also in A).
- -- **geoJsonA** (object\|array): first GeoJSON object or coordinate array - (in longitude, latitude order) -- **geoJsonB** (object\|array): second GeoJSON object or coordinate array - (in longitude, latitude order) -- returns **bool** (bool): `true` when every point in B is also contained in A, - `false` otherwise - -### GEO_DISTANCE() - -Introduced in: v3.8.0 - -`GEO_DISTANCE(geoJsonA, geoJsonB) → distance` - -Return the distance between two [GeoJSON objects](geo.md#geojson), -measured from the `centroid` of each shape. - -- **geoJsonA** (object\|array): first GeoJSON object or coordinate array - (in longitude, latitude order) -- **geoJsonB** (object\|array): second GeoJSON object or coordinate array - (in longitude, latitude order) -- returns **distance** (number): the distance between the centroid points of - the two objects on the reference ellipsoid - -### GEO_IN_RANGE() - -Introduced in: v3.8.0 - -`GEO_IN_RANGE(geoJsonA, geoJsonB, low, high, includeLow, includeHigh) → bool` - -Checks whether the distance between two [GeoJSON objects](geo.md#geojson) -lies within a given interval. The distance is measured from the `centroid` of -each shape. - -- **geoJsonA** (object\|array): first GeoJSON object or coordinate array - (in longitude, latitude order) -- **geoJsonB** (object\|array): second GeoJSON object or coordinate array - (in longitude, latitude order) -- **low** (number): minimum value of the desired range -- **high** (number): maximum value of the desired range -- **includeLow** (bool, optional): whether the minimum value shall be included - in the range (left-closed interval) or not (left-open interval). The default - value is `true` -- **includeHigh** (bool): whether the maximum value shall be included in the - range (right-closed interval) or not (right-open interval). The default value - is `true` -- returns **bool** (bool): whether the evaluated distance lies within the range - -### GEO_INTERSECTS() - -Introduced in: v3.8.0 - -`GEO_INTERSECTS(geoJsonA, geoJsonB) → bool` - -Checks whether the [GeoJSON object](geo.md#geojson) `geoJsonA` -intersects with `geoJsonB` (i.e. at least one point of B is in A or vice versa). - -- **geoJsonA** (object\|array): first GeoJSON object or coordinate array - (in longitude, latitude order) -- **geoJsonB** (object\|array): second GeoJSON object or coordinate array - (in longitude, latitude order) -- returns **bool** (bool): `true` if A and B intersect, `false` otherwise - -## Scoring Functions - -Scoring functions return a ranking value for the documents found by a -[SEARCH operation](../high-level-operations/search.md). The better the documents match -the search expression the higher the returned number. - -The first argument to any scoring function is always the document emitted by -a `FOR` operation over an `arangosearch` View. - -To sort the result set by relevance, with the more relevant documents coming -first, sort in **descending order** by the score (e.g. `SORT BM25(...) DESC`). - -You may calculate custom scores based on a scoring function using document -attributes and numeric functions (e.g. `TFIDF(doc) * LOG(doc.value)`): - -```aql -FOR movie IN imdbView - SEARCH PHRASE(movie.title, "Star Wars", "text_en") - SORT BM25(movie) * LOG(movie.runtime + 1) DESC - RETURN movie -``` - -Sorting by more than one score is allowed. You may also sort by a mix of -scores and attributes from multiple Views as well as collections: - -```aql -FOR a IN viewA - FOR c IN coll - FOR b IN viewB - SORT TFIDF(b), c.name, BM25(a) - ... 
-``` - -### BM25() - -`BM25(doc, k, b) → score` - -Sorts documents using the -[**Best Matching 25** algorithm](https://en.wikipedia.org/wiki/Okapi_BM25) -(Okapi BM25). - -- **doc** (document): must be emitted by `FOR ... IN viewName` -- **k** (number, _optional_): calibrates the text term frequency scaling. - The value needs to be non-negative (`0.0` or higher), or the returned - score is an undefined value that may cause unpredictable results. - The default is `1.2`. A `k` value of `0` corresponds to a binary model - (no term frequency), and a large value corresponds to using raw term frequency -- **b** (number, _optional_): determines the scaling by the total text length. - The value needs to be between `0.0` and `1.0` (inclusive), or the returned - score is an undefined value that may cause unpredictable results. - The default is `0.75`. At the extreme values of the coefficient `b`, BM25 - turns into the ranking functions known as: - - BM11 for `b` = `1` (corresponds to fully scaling the term weight by the - total text length) - - BM15 for `b` = `0` (corresponds to no length normalization) -- returns **score** (number): computed ranking value - -{{< info >}} -The Analyzers used for indexing document attributes must have the `"frequency"` -feature enabled. The `BM25()` function will otherwise return a score of 0. -The Analyzers should have the `"norm"` feature enabled, too, or normalization -will be disabled, which is not meaningful for BM25 and BM11. BM15 does not need -the `"norm"` feature as it has no length normalization. -{{< /info >}} - -#### Example: Sorting by default `BM25()` score - -Sorting by relevance with BM25 at default settings: - -```aql -FOR doc IN viewName - SEARCH ... - SORT BM25(doc) DESC - RETURN doc -``` - -#### Example: Sorting with tuned `BM25()` ranking - -Sorting by relevance, with double-weighted term frequency and with full text -length normalization: - -```aql -FOR doc IN viewName - SEARCH ... - SORT BM25(doc, 2.4, 1) DESC - RETURN doc -``` - -### TFIDF() - -`TFIDF(doc, normalize) → score` - -Sorts documents using the -[**term frequency–inverse document frequency** algorithm](https://en.wikipedia.org/wiki/TF-IDF) -(TF-IDF). - -- **doc** (document): must be emitted by `FOR ... IN viewName` -- **normalize** (bool, _optional_): specifies whether scores should be - normalized. The default is `false`. -- returns **score** (number): computed ranking value - -{{< info >}} -The Analyzers used for indexing document attributes must have the `"frequency"` -feature enabled. The `TFIDF()` function will otherwise return a score of 0. -The Analyzers need to have the `"norm"` feature enabled, too, if you want to use -`TFIDF()` with the `normalize` parameter set to `true`. -{{< /info >}} - -#### Example: Sorting by default `TFIDF()` score - -Sort by relevance using the TF-IDF score: - -```aql -FOR doc IN viewName - SEARCH ... - SORT TFIDF(doc) DESC - RETURN doc -``` - -#### Example: Sorting by `TFIDF()` score with normalization - -Sort by relevance using a normalized TF-IDF score: - -```aql -FOR doc IN viewName - SEARCH ... - SORT TFIDF(doc, true) DESC - RETURN doc -``` - -#### Example: Sort by value and `TFIDF()` - -Sort by the value of the `text` attribute in ascending order, then by the TFIDF -score in descending order where the attribute values are equivalent: - -```aql -FOR doc IN viewName - SEARCH ... 
-  SORT doc.text, TFIDF(doc) DESC
-  RETURN doc
-```
-
-## Search Highlighting Functions
-
-{{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}}
-
-### OFFSET_INFO()
-
-`OFFSET_INFO(doc, paths) → offsetInfo`
-
-Returns the attribute paths and substring offsets of matched terms, phrases, or
-_n_-grams for search highlighting purposes.
-
-- **doc** (document): must be emitted by `FOR ... IN viewName`
-- **paths** (string\|array): a string or an array of strings, each describing an
-  attribute and array element path you want to get the offsets for. Use `.` to
-  access nested objects, and `[n]` with `n` being an array index to specify array
-  elements. The attributes need to be indexed by Analyzers with the `offset`
-  feature enabled.
-- returns **offsetInfo** (array): an array of objects, limited to a default of
-  10 offsets per path. Each object has the following attributes:
-  - **name** (array): the attribute and array element path as an array of
-    strings and numbers. You can pass this name to the
-    [`VALUE()` function](document-object.md) to dynamically look up the value.
-  - **offsets** (array): an array of arrays with the matched positions. Each
-    inner array has two elements with the start offset and the length of a match.
-
-    {{< warning >}}
-    The offsets describe the positions in bytes, not characters. You may need
-    to account for characters encoded using multiple bytes.
-    {{< /warning >}}
-
----
-
-`OFFSET_INFO(doc, rules) → offsetInfo`
-
-- **doc** (document): must be emitted by `FOR ... IN viewName`
-- **rules** (array): an array of objects with the following attributes:
-  - **name** (string): an attribute and array element path
-    you want to get the offsets for. Use `.` to access nested objects,
-    and `[n]` with `n` being an array index to specify array elements. The
-    attributes need to be indexed by Analyzers with the `offset` feature enabled.
-  - **options** (object): an object with the following attributes:
-    - **maxOffsets** (number, _optional_): the total number of offsets to
-      collect per path. Default: `10`.
-    - **limits** (object, _optional_): an object with the following attributes:
-      - **term** (number, _optional_): the total number of term offsets to
-        collect per path. Default: `2^32`.
-      - **phrase** (number, _optional_): the total number of phrase offsets to
-        collect per path. Default: `2^32`.
-      - **ngram** (number, _optional_): the total number of _n_-gram offsets to
-        collect per path. Default: `2^32`.
-- returns **offsetInfo** (array): an array of objects, each with the following
-  attributes:
-  - **name** (array): the attribute and array element path as an array of
-    strings and numbers. You can pass this name to the
-    [`VALUE()` function](document-object.md) to dynamically look up the value.
-  - **offsets** (array): an array of arrays with the matched positions, capped
-    to the specified limits. Each inner array has two elements with the start
-    offset and the length of a match.
-
-    {{< warning >}}
-    The start offsets and lengths describe the positions in bytes, not characters.
-    You may need to account for characters encoded using multiple bytes.
-    {{< /warning >}}
-
-**Examples**
-
-Search a View and get the offset information for the matches:
-
-```js
----
-name: aqlOffsetInfo
-description: ''
----
-~db._create("food");
-~db.food.save({ name: "avocado", description: { en: "The avocado is a medium-sized, evergreen tree, native to the Americas." } });
-~db.food.save({ name: "tomato", description: { en: "The tomato is the edible berry of the tomato plant."
} }); -~var analyzers = require("@arangodb/analyzers"); -~var analyzer = analyzers.save("text_en_offset", "text", { locale: "en", stopwords: [] }, ["frequency", "norm", "position", "offset"]); -~db._createView("food_view", "arangosearch", { links: { food: { fields: { description: { fields: { en: { analyzers: ["text_en_offset"] } } } } } } }); -~assert(db._query(`FOR d IN food_view COLLECT WITH COUNT INTO c RETURN c`).toArray()[0] === 2); -db._query(` - FOR doc IN food_view - SEARCH ANALYZER(TOKENS("avocado tomato", "text_en_offset") ANY == doc.description.en, "text_en_offset") - RETURN OFFSET_INFO(doc, ["description.en"])`); -~db._dropView("food_view"); -~db._drop("food"); -~analyzers.remove(analyzer.name); -``` - -For full examples, see [Search Highlighting](../../index-and-search/arangosearch/search-highlighting.md). diff --git a/site/content/3.10/aql/functions/geo.md b/site/content/3.10/aql/functions/geo.md deleted file mode 100644 index b35d8c375b..0000000000 --- a/site/content/3.10/aql/functions/geo.md +++ /dev/null @@ -1,966 +0,0 @@ ---- -title: Geo-spatial functions in AQL -menuTitle: Geo -weight: 35 -description: >- - AQL supports functions for geo-spatial queries and a subset of calls can be - accelerated by geo-spatial indexes ---- -## Geo-spatial data representations - -You can model geo-spatial information in different ways using the data types -available in ArangoDB. The recommended way is to use objects with **GeoJSON** -geometry but you can also use **longitude and latitude coordinate pairs** -for points. Both models are supported by -[Geo-Spatial Indexes](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md). - -### Coordinate pairs - -Longitude and latitude coordinates are numeric values and can be stored in the -following ways: - -- Coordinates using an array with two numbers in `[longitude, latitude]` order, - for example, in a user-chosen attribute called `location`: - - ```json - { - "location": [ -73.983, 40.764 ] - } - ``` - -- Coordinates using an array with two numbers in `[latitude, longitude]` order, - for example, in a user-chosen attribute called `location`: - - ```json - { - "location": [ 40.764, -73.983 ] - } - ``` - -- Coordinates using two separate numeric attributes, for example, in two - user-chosen attributes called `lat` and `lng` as sub-attributes of a `location` - attribute: - - ```json - { - "location": { - "lat": 40.764, - "lng": -73.983 - } - } - ``` - -### GeoJSON - -GeoJSON is a geospatial data format based on JSON. It defines several different -types of JSON objects and the way in which they can be combined to represent -data about geographic shapes on the Earth surface. - -Example of a document with a GeoJSON Point stored in a user-chosen attribute -called `location` (with coordinates in `[longitude, latitude]` order): - -```json -{ - "location": { - "type": "Point", - "coordinates": [ -73.983, 40.764 ] - } -} -``` - -GeoJSON uses a geographic coordinate reference system, -World Geodetic System 1984 (WGS 84), and units of decimal degrees. - -Internally ArangoDB maps all coordinate pairs onto a unit sphere. Distances are -projected onto a sphere with the Earth's *Volumetric mean radius* of *6371 -km*. ArangoDB implements a useful subset of the GeoJSON format -[(RFC 7946)](https://tools.ietf.org/html/rfc7946). -Feature Objects and the GeometryCollection type are not supported. 
-Supported geometry object types are: - -- Point -- MultiPoint -- LineString -- MultiLineString -- Polygon -- MultiPolygon - -#### Point - -A [GeoJSON Point](https://tools.ietf.org/html/rfc7946#section-3.1.2) is a -[position](https://tools.ietf.org/html/rfc7946#section-3.1.1) comprised of -a longitude and a latitude: - -```json -{ - "type": "Point", - "coordinates": [100.0, 0.0] -} -``` - -#### MultiPoint - -A [GeoJSON MultiPoint](https://tools.ietf.org/html/rfc7946#section-3.1.7) is -an array of positions: - -```json -{ - "type": "MultiPoint", - "coordinates": [ - [100.0, 0.0], - [101.0, 1.0] - ] -} -``` - -#### LineString - -A [GeoJSON LineString](https://tools.ietf.org/html/rfc7946#section-3.1.4) is -an array of two or more positions: - -```json -{ - "type": "LineString", - "coordinates": [ - [100.0, 0.0], - [101.0, 1.0] - ] -} -``` - -#### MultiLineString - -A [GeoJSON MultiLineString](https://tools.ietf.org/html/rfc7946#section-3.1.5) is -an array of LineString coordinate arrays: - -```json -{ - "type": "MultiLineString", - "coordinates": [ - [ - [100.0, 0.0], - [101.0, 1.0] - ], - [ - [102.0, 2.0], - [103.0, 3.0] - ] - ] -} -``` - -#### Polygon - -A [GeoJSON Polygon](https://tools.ietf.org/html/rfc7946#section-3.1.6) consists -of a series of closed `LineString` objects (ring-like). These *Linear Ring* -objects consist of four or more coordinate pairs with the first and last -coordinate pair being equal. Coordinate pairs of a Polygon are an array of -linear ring coordinate arrays. The first element in the array represents -the exterior ring. Any subsequent elements represent interior rings -(holes within the surface). - -The orientation of the first linear ring is crucial: the right-hand-rule -is applied, so that the area to the left of the path of the linear ring -(when walking on the surface of the Earth) is considered to be the -"interior" of the polygon. All other linear rings must be contained -within this interior. According to the GeoJSON standard, the subsequent -linear rings must be oriented following the right-hand-rule, too, -that is, they must run **clockwise** around the hole (viewed from -above). However, ArangoDB is tolerant here (as suggested by the -[GeoJSON standard](https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.6)), -all but the first linear ring are inverted if the orientation is wrong. - -In the end, a point is considered to be in the interior of the polygon, -if and only if one has to cross an odd number of linear rings to reach the -exterior of the polygon prescribed by the first linear ring. - -A number of additional rules apply (and are enforced by the GeoJSON -parser): - -- A polygon must contain at least one linear ring, i.e., it must not be - empty. -- A linear ring may not be empty, it needs at least three _distinct_ - coordinate pairs, that is, at least 4 coordinate pairs (since the first and - last must be the same). -- No two edges of linear rings in the polygon must intersect, in - particular, no linear ring may be self-intersecting. -- Within the same linear ring, consecutive coordinate pairs may be the same, - otherwise all coordinate pairs need to be distinct (except the first and last one). -- Linear rings of a polygon must not share edges, but they may share coordinate pairs. -- A linear ring defines two regions on the sphere. ArangoDB always - interprets the region that lies to the left of the boundary ring (in - the direction of its travel on the surface of the Earth) as the - interior of the ring. 
This is in contrast to earlier versions of - ArangoDB before 3.10, which always took the **smaller** of the two - regions as the interior. Therefore, from 3.10 on one can now have - polygons whose outer ring encloses more than half the Earth's surface. -- The interior rings must be contained in the (interior) of the outer ring. -- Interior rings should follow the above rule for orientation - (counterclockwise external rings, clockwise internal rings, interior - always to the left of the line). - -Here is an example with no holes: - -```json -{ - "type": "Polygon", - "coordinates": [ - [ - [100.0, 0.0], - [101.0, 0.0], - [101.0, 1.0], - [100.0, 1.0], - [100.0, 0.0] - ] - ] -} -``` - -Here is an example with a hole: - -```json -{ - "type": "Polygon", - "coordinates": [ - [ - [100.0, 0.0], - [101.0, 0.0], - [101.0, 1.0], - [100.0, 1.0], - [100.0, 0.0] - ], - [ - [100.8, 0.8], - [100.8, 0.2], - [100.2, 0.2], - [100.2, 0.8], - [100.8, 0.8] - ] - ] -} -``` - -#### MultiPolygon - -A [GeoJSON MultiPolygon](https://tools.ietf.org/html/rfc7946#section-3.1.6) consists -of multiple polygons. The "coordinates" member is an array of -_Polygon_ coordinate arrays. See [above](#polygon) for the rules and -the meaning of polygons. - -If the polygons in a MultiPolygon are disjoint, then a point is in the -interior of the MultiPolygon if and only if it is -contained in one of the polygons. If some polygon P2 in a MultiPolygon -is contained in another polygon P1, then P2 is treated like a hole -in P1 and containment of points is defined with the even-odd-crossings rule -(see [Polygon](#polygon)). - -Additionally, the following rules apply and are enforced for -MultiPolygons: - -- No two edges in the linear rings of the polygons of a MultiPolygon - may intersect. -- Polygons in the same MultiPolygon may not share edges, but they may share - coordinate pairs. - -Example with two polygons, the second one with a hole: - -```json -{ - "type": "MultiPolygon", - "coordinates": [ - [ - [ - [102.0, 2.0], - [103.0, 2.0], - [103.0, 3.0], - [102.0, 3.0], - [102.0, 2.0] - ] - ], - [ - [ - [100.0, 0.0], - [101.0, 0.0], - [101.0, 1.0], - [100.0, 1.0], - [100.0, 0.0] - ], - [ - [100.2, 0.2], - [100.2, 0.8], - [100.8, 0.8], - [100.8, 0.2], - [100.2, 0.2] - ] - ] - ] -} -``` - -## GeoJSON interpretation - -Note the following technical detail about GeoJSON: The -[GeoJSON standard, Section 3.1.1 Position](https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.1) -prescribes that lines are **cartesian lines in cylindrical coordinates** -(longitude/latitude). However, this definition is inconvenient in practice, -since such lines are not geodesic on the surface of the Earth. -Furthermore, the best available algorithms for geospatial computations on Earth -typically use geodesic lines as the boundaries of polygons on Earth. - -Therefore, ArangoDB uses the **syntax of the GeoJSON** standard, -but then interprets lines (and boundaries of polygons) as -**geodesic lines (pieces of great circles) on Earth**. This is a -violation of the GeoJSON standard, but serving a practical purpose. - -Note in particular that this can sometimes lead to unexpected results. 
-Consider the following polygon (remember that GeoJSON has
-**longitude before latitude** in coordinate pairs):
-
-```json
-{ "type": "Polygon", "coordinates": [[
-  [4, 54], [4, 47], [16, 47], [16, 54], [4, 54]
-]] }
-```
-
-![GeoJSON Polygon Geodesic](../../../images/geojson-polygon-geodesic.webp)
-
-It does not contain the point `[10, 47]` since the shortest path (geodesic)
-from `[4, 47]` to `[16, 47]` lies North relative to the parallel of latitude at
-47 degrees. In contrast, the polygon does contain the point `[10, 54]` as it
-lies South of the parallel of latitude at 54 degrees.
-
-{{< info >}}
-ArangoDB versions before 3.10 performed an inconsistent special detection of
-"rectangle" polygons that versions from 3.10 onward no longer do, see
-[Legacy Polygons](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md#legacy-polygons).
-{{< /info >}}
-
-Furthermore, there is an issue with the interpretation of linear rings
-(boundaries of polygons) according to
-[GeoJSON standard, Section 3.1.6 Polygon](https://datatracker.ietf.org/doc/html/rfc7946#section-3.1.6).
-This section states explicitly:
-
-> A linear ring MUST follow the right-hand rule with respect to the
-> area it bounds, i.e., exterior rings are counter-clockwise, and
-> holes are clockwise.
-
-This rather misleading phrase means that when a linear ring is used as
-the boundary of a polygon, the "interior" of the polygon lies **to the left**
-of the boundary when one travels on the surface of the Earth and
-along the linear ring. For
-example, the polygon below travels **counter-clockwise** around the point
-`[10, 50]`, and thus the interior of the polygon contains this point and
-its surroundings, but not, for example, the North Pole and the South
-Pole.
-
-```json
-{ "type": "Polygon", "coordinates": [[
-  [4, 54], [4, 47], [16, 47], [16, 54], [4, 54]
-]] }
-```
-
-![GeoJSON Polygon Counter-clockwise](../../../images/geojson-polygon-ccw.webp)
-
-On the other hand, the following polygon travels **clockwise** around the point
-`[10, 50]`, and thus its "interior" does not contain `[10, 50]`, but does
-contain the North Pole and the South Pole:
-
-```json
-{ "type": "Polygon", "coordinates": [[
-  [4, 54], [16, 54], [16, 47], [4, 47], [4, 54]
-]] }
-```
-
-![GeoJSON Polygon Clockwise](../../../images/geojson-polygon-cw.webp)
-
-Remember that the "interior" is to the left of the given
-linear ring, so this second polygon is basically the complement on Earth
-of the previous polygon!
-
-ArangoDB versions before 3.10 did not follow this rule and always took the
-"smaller" connected component of the surface as the "interior" of the polygon.
-This made it impossible to specify polygons which covered more than half of the
-sphere. From version 3.10 onward, ArangoDB recognizes this correctly.
-See [Legacy Polygons](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md#legacy-polygons)
-for how to deal with this issue.
-
-## Geo utility functions
-
-The following helper functions **can** use geo indexes, but do not have to in
-all cases. You can use all of these functions in combination with each other,
-and if you have configured a geo index it may be utilized,
-see [Geo Indexing](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md).
-
-### DISTANCE()
-
-`DISTANCE(latitude1, longitude1, latitude2, longitude2) → distance`
-
-Calculate the distance between two arbitrary points in meters (as birds
-would fly).
The value is computed using the haversine formula, which is based -on a spherical Earth model. It's fast to compute and is accurate to around 0.3%, -which is sufficient for most use cases such as location-aware services. - -- **latitude1** (number): the latitude of the first point -- **longitude1** (number): the longitude of the first point -- **latitude2** (number): the latitude of the second point -- **longitude2** (number): the longitude of the second point -- returns **distance** (number): the distance between both points in **meters** - -```aql -// Distance from Brandenburg Gate (Berlin) to ArangoDB headquarters (Cologne) -DISTANCE(52.5163, 13.3777, 50.9322, 6.94) // 476918.89688380965 (~477km) - -// Sort a small number of documents based on distance to Central Park (New York) -FOR doc IN coll // e.g. documents returned by a traversal - SORT DISTANCE(doc.latitude, doc.longitude, 40.78, -73.97) - RETURN doc -``` - -### GEO_CONTAINS() - -`GEO_CONTAINS(geoJsonA, geoJsonB) → bool` - -Checks whether the [GeoJSON object](#geojson) `geoJsonA` -fully contains `geoJsonB` (every point in B is also in A). The object `geoJsonA` -has to be of type _Polygon_ or _MultiPolygon_. For other types containment is -not well-defined because of numerical stability problems. - -- **geoJsonA** (object): first GeoJSON object -- **geoJsonB** (object): second GeoJSON object, or a coordinate array in - `[longitude, latitude]` order -- returns **bool** (bool): `true` if every point in B is also contained in A, - otherwise `false` - -{{< info >}} -ArangoDB follows and exposes the same behavior as the underlying -S2 geometry library. As stated in the S2 documentation: - -> Point containment is defined such that if the sphere is subdivided -> into faces (loops), every point is contained by exactly one face. -> This implies that linear rings do not necessarily contain their vertices. - -As a consequence, a linear ring or polygon does not necessarily contain its -boundary edges! -{{< /info >}} - -You can optimize queries that contain a `FILTER` expression of the following -form with an S2-based [geospatial index](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md): - -```aql -FOR doc IN coll - FILTER GEO_CONTAINS(geoJson, doc.geo) - ... -``` - -In this example, you would create the index for the collection `coll`, on the -attribute `geo`. You need to set the `geoJson` index option to `true`. -The `geoJson` variable needs to evaluate to a valid GeoJSON object. Also note -the argument order: the stored document attribute `doc.geo` is passed as the -second argument. Passing it as the first argument, like -`FILTER GEO_CONTAINS(doc.geo, geoJson)` to test whether `doc.geo` contains -`geoJson`, cannot utilize the index. - -### GEO_DISTANCE() - -`GEO_DISTANCE(geoJsonA, geoJsonB, ellipsoid) → distance` - -Return the distance between two GeoJSON objects in meters, measured from the -**centroid** of each shape. For a list of supported types see the -[geo index page](#geojson). - -- **geoJsonA** (object): first GeoJSON object, or a coordinate array in - `[longitude, latitude]` order -- **geoJsonB** (object): second GeoJSON object, or a coordinate array in - `[longitude, latitude]` order -- **ellipsoid** (string, *optional*): reference ellipsoid to use. - Supported are `"sphere"` (default) and `"wgs84"`. 
-- returns **distance** (number): the distance between the centroid points of - the two objects on the reference ellipsoid in **meters** - -```aql -LET polygon = { - type: "Polygon", - coordinates: [[[-11.5, 23.5], [-10.5, 26.1], [-11.2, 27.1], [-11.5, 23.5]]] -} -FOR doc IN collectionName - LET distance = GEO_DISTANCE(doc.geometry, polygon) // calculates the distance - RETURN distance -``` - -You can optimize queries that contain a `FILTER` expression of the following -form with an S2-based [geospatial index](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md): - -```aql -FOR doc IN coll - FILTER GEO_DISTANCE(geoJson, doc.geo) <= limit - ... -``` - -In this example, you would create the index for the collection `coll`, on the -attribute `geo`. You need to set the `geoJson` index option to `true`. -`geoJson` needs to evaluate to a valid GeoJSON object. `limit` must be a -distance in meters; it cannot be an expression. An upper bound with `<`, -a lower bound with `>` or `>=`, or both, are equally supported. - -You can also optimize queries that use a `SORT` condition of the following form -with a geospatial index: - -```aql - SORT GEO_DISTANCE(geoJson, doc.geo) -``` - -The index covers returning matches from closest to furthest away, or vice versa. -You may combine such a `SORT` with a `FILTER` expression that utilizes the -geospatial index, too, via the [`GEO_DISTANCE()`](#geo_distance), -[`GEO_CONTAINS()`](#geo_contains), and [`GEO_INTERSECTS()`](#geo_intersects) -functions. - -### GEO_AREA() - -Introduced in: v3.5.1 - -`GEO_AREA(geoJson, ellipsoid) → area` - -Return the area for a [Polygon](#polygon) or [MultiPolygon](#multipolygon) -on a sphere with the average Earth radius, or an ellipsoid. - -- **geoJson** (object): a GeoJSON object -- **ellipsoid** (string, *optional*): reference ellipsoid to use. - Supported are `"sphere"` (default) and `"wgs84"`. -- returns **area** (number): the area of the polygon in **square meters** - -```aql -LET polygon = { - type: "Polygon", - coordinates: [[[-11.5, 23.5], [-10.5, 26.1], [-11.2, 27.1], [-11.5, 23.5]]] -} -RETURN GEO_AREA(polygon, "wgs84") -``` - -### GEO_EQUALS() - -`GEO_EQUALS(geoJsonA, geoJsonB) → bool` - -Checks whether two [GeoJSON objects](#geojson) are equal or not. - -- **geoJsonA** (object): first GeoJSON object. -- **geoJsonB** (object): second GeoJSON object. -- returns **bool** (bool): `true` if they are equal, otherwise `false`. - -```aql -LET polygonA = GEO_POLYGON([ - [-11.5, 23.5], [-10.5, 26.1], [-11.2, 27.1], [-11.5, 23.5] -]) -LET polygonB = GEO_POLYGON([ - [-11.5, 23.5], [-10.5, 26.1], [-11.2, 27.1], [-11.5, 23.5] -]) -RETURN GEO_EQUALS(polygonA, polygonB) // true -``` - -```aql -LET polygonA = GEO_POLYGON([ - [-11.1, 24.0], [-10.5, 26.1], [-11.2, 27.1], [-11.1, 24.0] -]) -LET polygonB = GEO_POLYGON([ - [-11.5, 23.5], [-10.5, 26.1], [-11.2, 27.1], [-11.5, 23.5] -]) -RETURN GEO_EQUALS(polygonA, polygonB) // false -``` - -### GEO_INTERSECTS() - -`GEO_INTERSECTS(geoJsonA, geoJsonB) → bool` - -Checks whether the [GeoJSON object](#geojson) `geoJsonA` -intersects with `geoJsonB` (i.e. at least one point in B is also in A or vice-versa). 
- -- **geoJsonA** (object): first GeoJSON object -- **geoJsonB** (object): second GeoJSON object, or a coordinate array in - `[longitude, latitude]` order -- returns **bool** (bool): true if B intersects A, false otherwise - -You can optimize queries that contain a `FILTER` expression of the following -form with an S2-based [geospatial index](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md): - -```aql -FOR doc IN coll - FILTER GEO_INTERSECTS(geoJson, doc.geo) - ... -``` - -In this example, you would create the index for the collection `coll`, on the -attribute `geo`. You need to set the `geoJson` index option to `true`. -`geoJson` needs to evaluate to a valid GeoJSON object. Also note -the argument order: the stored document attribute `doc.geo` is passed as the -second argument. Passing it as the first argument, like -`FILTER GEO_INTERSECTS(doc.geo, geoJson)` to test whether `doc.geo` intersects -`geoJson`, cannot utilize the index. - -### GEO_IN_RANGE() - -Introduced in: v3.8.0 - -`GEO_IN_RANGE(geoJsonA, geoJsonB, low, high, includeLow, includeHigh) → bool` - -Checks whether the distance between two [GeoJSON objects](#geojson) -lies within a given interval. The distance is measured from the **centroid** of -each shape. - -- **geoJsonA** (object\|array): first GeoJSON object, or a coordinate array - in `[longitude, latitude]` order -- **geoJsonB** (object\|array): second GeoJSON object, or a coordinate array - in `[longitude, latitude]` order -- **low** (number): minimum value of the desired range -- **high** (number): maximum value of the desired range -- **includeLow** (bool, optional): whether the minimum value shall be included - in the range (left-closed interval) or not (left-open interval). The default - value is `true` -- **includeHigh** (bool): whether the maximum value shall be included in the - range (right-closed interval) or not (right-open interval). The default value - is `true` -- returns **bool** (bool): whether the evaluated distance lies within the range - -### IS_IN_POLYGON() - -Determine whether a point is inside a polygon. - -{{< warning >}} -The `IS_IN_POLYGON()` AQL function is **deprecated** as of ArangoDB 3.4.0 in -favor of the new [`GEO_CONTAINS()` AQL function](#geo_contains), which works with -[GeoJSON](https://tools.ietf.org/html/rfc7946) Polygons and MultiPolygons. -{{< /warning >}} - -`IS_IN_POLYGON(polygon, latitude, longitude) → bool` - -- **polygon** (array): an array of arrays with 2 elements each, representing the - points of the polygon in the format `[latitude, longitude]` -- **latitude** (number): the latitude of the point to search -- **longitude** (number): the longitude of the point to search -- returns **bool** (bool): `true` if the point (`[latitude, longitude]`) is - inside the `polygon` or `false` if it's not. The result is undefined (can be - `true` or `false`) if the specified point is exactly on a boundary of the - polygon. - -```aql -// checks if the point (latitude 4, longitude 7) is contained inside the polygon -IS_IN_POLYGON( [ [ 0, 0 ], [ 0, 10 ], [ 10, 10 ], [ 10, 0 ] ], 4, 7 ) -``` - ---- - -`IS_IN_POLYGON(polygon, coord, useLonLat) → bool` - -The 2nd parameter can alternatively be specified as an array with two values. - -By default, each array element in `polygon` is expected to be in the format -`[latitude, longitude]`. This can be changed by setting the 3rd parameter to `true` to -interpret the points as `[longitude, latitude]`. `coord` is then also interpreted in -the same way. 
-
-- **polygon** (array): an array of arrays with 2 elements each, representing the
-  points of the polygon
-- **coord** (array): the point to search as a numeric array with two elements
-- **useLonLat** (bool, *optional*): if set to `true`, the coordinates in
-  `polygon` and the coordinate pair `coord` are interpreted as
-  `[longitude, latitude]` (like in GeoJSON). The default is `false` and the
-  format `[latitude, longitude]` is expected.
-- returns **bool** (bool): `true` if the point `coord` is inside the `polygon`
-  or `false` if it's not. The result is undefined (can be `true` or `false`) if
-  the specified point is exactly on a boundary of the polygon.
-
-```aql
-// checks if the point (lat 4, lon 7) is contained inside the polygon
-IS_IN_POLYGON( [ [ 0, 0 ], [ 0, 10 ], [ 10, 10 ], [ 10, 0 ] ], [ 4, 7 ] )
-
-// checks if the point (lat 4, lon 7) is contained inside the polygon
-IS_IN_POLYGON( [ [ 0, 0 ], [ 10, 0 ], [ 10, 10 ], [ 0, 10 ] ], [ 7, 4 ], true )
-```
-
-## GeoJSON Constructors
-
-The following helper functions are available to easily create valid GeoJSON
-output. In all cases you can write equivalent JSON yourself, but these functions
-will help you to make all your AQL queries shorter and easier to read.
-
-### GEO_LINESTRING()
-
-`GEO_LINESTRING(points) → geoJson`
-
-Construct a GeoJSON LineString.
-Needs at least two longitude/latitude pairs.
-
-- **points** (array): an array of `[longitude, latitude]` pairs
-- returns **geoJson** (object): a valid GeoJSON LineString
-
-```aql
----
-name: aqlGeoLineString_1
-description: ''
----
-RETURN GEO_LINESTRING([
-  [35, 10], [45, 45]
-])
-```
-
-### GEO_MULTILINESTRING()
-
-`GEO_MULTILINESTRING(points) → geoJson`
-
-Construct a GeoJSON MultiLineString.
-Needs at least two elements, each being a valid LineString coordinate array.
-
-- **points** (array): array of LineStrings
-- returns **geoJson** (object): a valid GeoJSON MultiLineString
-
-```aql
----
-name: aqlGeoMultiLineString_1
-description: ''
----
-RETURN GEO_MULTILINESTRING([
-  [[100.0, 0.0], [101.0, 1.0]],
-  [[102.0, 2.0], [101.0, 2.3]]
-])
-```
-
-### GEO_MULTIPOINT()
-
-`GEO_MULTIPOINT(points) → geoJson`
-
-Construct a GeoJSON MultiPoint. Needs at least two longitude/latitude pairs.
-
-- **points** (array): an array of `[longitude, latitude]` pairs
-- returns **geoJson** (object): a valid GeoJSON MultiPoint
-
-```aql
----
-name: aqlGeoMultiPoint_1
-description: ''
----
-RETURN GEO_MULTIPOINT([
-  [35, 10], [45, 45]
-])
-```
-
-### GEO_POINT()
-
-`GEO_POINT(longitude, latitude) → geoJson`
-
-Construct a valid GeoJSON Point.
-
-- **longitude** (number): the longitude portion of the point
-- **latitude** (number): the latitude portion of the point
-- returns **geoJson** (object): a GeoJSON Point
-
-```aql
----
-name: aqlGeoPoint_1
-description: ''
----
-RETURN GEO_POINT(1.0, 2.0)
-```
-
-### GEO_POLYGON()
-
-`GEO_POLYGON(points) → geoJson`
-
-Construct a GeoJSON Polygon. Needs at least one array representing
-a linear ring. Each linear ring consists of an array with at least four
-longitude/latitude pairs. The first linear ring must be the outermost, while
-any subsequent linear ring will be interpreted as a hole.
-
-For details about the rules, see [GeoJSON Polygon](#polygon).
-
-- **points** (array): an array of (arrays of) `[longitude, latitude]` pairs
-- returns **geoJson** (object\|null): a valid GeoJSON Polygon
-
-A validation step is performed using the S2 geometry library. If the
-validation is not successful, an AQL warning is issued and `null` is
-returned.
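-
-For instance, the following call is a minimal sketch of the failure mode
-described above: the ring is not closed (the first and last coordinate pairs
-differ, and a linear ring needs at least four pairs), so the validation fails,
-an AQL warning is issued, and the function evaluates to `null`:
-
-```aql
-RETURN GEO_POLYGON([
-  [0.0, 0.0], [7.5, 2.5], [0.0, 5.0]
-])
-```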
-
-Simple Polygon:
-
-```aql
----
-name: aqlGeoPolygon_1
-description: ''
----
-RETURN GEO_POLYGON([
-  [0.0, 0.0], [7.5, 2.5], [0.0, 5.0], [0.0, 0.0]
-])
-```
-
-Advanced Polygon with a hole inside:
-
-```aql
----
-name: aqlGeoPolygon_2
-description: ''
----
-RETURN GEO_POLYGON([
-  [[35, 10], [45, 45], [15, 40], [10, 20], [35, 10]],
-  [[20, 30], [30, 20], [35, 35], [20, 30]]
-])
-```
-
-### GEO_MULTIPOLYGON()
-
-`GEO_MULTIPOLYGON(polygons) → geoJson`
-
-Construct a GeoJSON MultiPolygon. Needs at least two Polygons inside.
-See [GEO_POLYGON()](#geo_polygon) and [GeoJSON MultiPolygon](#multipolygon)
-for the rules of Polygon and MultiPolygon construction.
-
-- **polygons** (array): an array of arrays of arrays of `[longitude, latitude]` pairs
-- returns **geoJson** (object\|null): a valid GeoJSON MultiPolygon
-
-A validation step is performed using the S2 geometry library. If the
-validation is not successful, an AQL warning is issued and `null` is
-returned.
-
-MultiPolygon comprised of a simple Polygon and a Polygon with a hole:
-
-```aql
----
-name: aqlGeoMultiPolygon_1
-description: ''
----
-RETURN GEO_MULTIPOLYGON([
-  [
-    [[40, 40], [20, 45], [45, 30], [40, 40]]
-  ],
-  [
-    [[20, 35], [10, 30], [10, 10], [30, 5], [45, 20], [20, 35]],
-    [[30, 20], [20, 15], [20, 25], [30, 20]]
-  ]
-])
-```
-
-## Geo Index Functions
-
-{{< warning >}}
-The AQL functions `NEAR()`, `WITHIN()` and `WITHIN_RECTANGLE()` are
-deprecated starting from version 3.4.0.
-Please use the [Geo utility functions](#geo-utility-functions) instead.
-{{< /warning >}}
-
-AQL offers the following functions to filter data based on
-[geo indexes](../../index-and-search/indexing/working-with-indexes/geo-spatial-indexes.md). These functions require the collection
-to have at least one geo index. If no geo index can be found, calling this
-function will fail with an error at runtime. However, there is no error when
-explaining the query.
-
-### NEAR()
-
-{{< warning >}}
-`NEAR()` is a deprecated AQL function from version 3.4.0 on.
-Use [`DISTANCE()`](#distance) in a query like this instead:
-
-```aql
-FOR doc IN coll
-  SORT DISTANCE(doc.latitude, doc.longitude, paramLatitude, paramLongitude) ASC
-  RETURN doc
-```
-
-Assuming there exists a geo-type index on `latitude` and `longitude`, the
-optimizer will recognize it and accelerate the query.
-{{< /warning >}}
-
-`NEAR(coll, latitude, longitude, limit, distanceName) → docArray`
-
-Return at most *limit* documents from collection *coll* that are near
-*latitude* and *longitude*. The result contains at most *limit* documents,
-returned sorted by distance, with closest distances being returned first.
-Optionally, the distances in meters between the specified coordinate pair
-(*latitude* and *longitude*) and the stored coordinate pairs can be returned as
-well. To make use of that, the desired attribute name for the distance result
-has to be specified in the *distanceName* argument. The result documents will
-contain the distance value in an attribute of that name.
-
-- **coll** (collection): a collection
-- **latitude** (number): the latitude of the point to search
-- **longitude** (number): the longitude of the point to search
-- **limit** (number, *optional*): cap the result to at most this number of
-  documents. The default is 100. If more documents than *limit* are found,
-  it is undefined which ones will be returned.
-- **distanceName** (string, *optional*): include the distance (in meters) - between the reference point and the stored point in the result, using the - attribute name *distanceName* -- returns **docArray** (array): an array of documents, sorted by distance - (shortest distance first) - -### WITHIN() - -{{< warning >}} -`WITHIN()` is a deprecated AQL function from version 3.4.0 on. -Use [`DISTANCE()`](#distance) in a query like this instead: - -```aql -FOR doc IN coll - LET d = DISTANCE(doc.latitude, doc.longitude, paramLatitude, paramLongitude) - FILTER d <= radius - SORT d ASC - RETURN doc -``` - -Assuming there exists a geo-type index on `latitude` and `longitude`, the -optimizer will recognize it and accelerate the query. -{{< /warning >}} - -`WITHIN(coll, latitude, longitude, radius, distanceName) → docArray` - -Return all documents from collection *coll* that are within a radius of *radius* -around the specified coordinate pair (*latitude* and *longitude*). The documents -returned are sorted by distance to the reference point, with the closest -distances being returned first. Optionally, the distance (in meters) between the -reference point and the stored point can be returned as well. To make -use of that, an attribute name for the distance result has to be specified in -the *distanceName* argument. The result documents will contain the distance -value in an attribute of that name. - -- **coll** (collection): a collection -- **latitude** (number): the latitude of the point to search -- **longitude** (number): the longitude of the point to search -- **radius** (number): radius in meters -- **distanceName** (string, *optional*): include the distance (in meters) - between the reference point and stored point in the result, using the - attribute name *distanceName* -- returns **docArray** (array): an array of documents, sorted by distance - (shortest distance first) - -### WITHIN_RECTANGLE() - -{{< warning >}} -`WITHIN_RECTANGLE()` is a deprecated AQL function from version 3.4.0 on. Use -[`GEO_CONTAINS()`](#geo_contains) and a GeoJSON polygon instead - but note that -this uses geodesic lines from version 3.10.0 onward -(see [GeoJSON interpretation](#geojson-interpretation)): - -```aql -LET rect = GEO_POLYGON([ [ - [longitude1, latitude1], // bottom-left - [longitude2, latitude1], // bottom-right - [longitude2, latitude2], // top-right - [longitude1, latitude2], // top-left - [longitude1, latitude1], // bottom-left -] ]) -FOR doc IN coll - FILTER GEO_CONTAINS(rect, [doc.longitude, doc.latitude]) - RETURN doc -``` - -Assuming there exists a geo-type index on `latitude` and `longitude`, the -optimizer will recognize it and accelerate the query. -{{< /warning >}} - -`WITHIN_RECTANGLE(coll, latitude1, longitude1, latitude2, longitude2) → docArray` - -Return all documents from collection *coll* that are positioned inside the -bounding rectangle with the points (*latitude1*, *longitude1*) and (*latitude2*, -*longitude2*). There is no guaranteed order in which the documents are returned. 

- **coll** (collection): a collection
- **latitude1** (number): the latitude of the bottom-left point to search
- **longitude1** (number): the longitude of the bottom-left point to search
- **latitude2** (number): the latitude of the top-right point to search
- **longitude2** (number): the longitude of the top-right point to search
- returns **docArray** (array): an array of documents, in random order

diff --git a/site/content/3.10/aql/graphs/all-shortest-paths.md b/site/content/3.10/aql/graphs/all-shortest-paths.md
deleted file mode 100644
index 1dc67cc001..0000000000
--- a/site/content/3.10/aql/graphs/all-shortest-paths.md
+++ /dev/null
@@ -1,197 +0,0 @@
---
title: All Shortest Paths in AQL
menuTitle: All Shortest Paths
weight: 20
description: >-
  Find all paths of shortest length between a start and target vertex
---
## General query idea

This type of query finds all paths of shortest length between two given
documents (*startVertex* and *targetVertex*) in your graph.

Every returned path is a JSON object with two attributes:

- An array containing the `vertices` on the path.
- An array containing the `edges` on the path.

**Example**

A visual representation of the example graph:

![Train Connection Map](../../../images/train_map.png)

Each ellipse stands for a train station with the name of the city written inside
of it. They are the vertices of the graph. Arrows represent train connections
between cities and are the edges of the graph.

Assuming that you want to go from **Carlisle** to **London** by train, the
expected two shortest paths are:

1. Carlisle – Birmingham – London
2. Carlisle – York – London

Another path that connects Carlisle and London is
Carlisle – Glasgow – Edinburgh – York – London, but it has two more stops and
is therefore not a path of the shortest length.

## Syntax

The syntax for All Shortest Paths queries is similar to the one for
[Shortest Path](shortest-path.md) and there are also two options to
either use a named graph or a set of edge collections. However, it only emits
a path variable, whereas `SHORTEST_PATH` emits a vertex and an edge variable.

### Working with named graphs

```aql
FOR path
  IN OUTBOUND|INBOUND|ANY ALL_SHORTEST_PATHS
  startVertex TO targetVertex
  GRAPH graphName
```

- `FOR`: emits the variable **path** which contains one shortest path as an
  object, with the `vertices` and `edges` of the path.
- `IN` `OUTBOUND|INBOUND|ANY`: defines in which direction
  edges are followed (outgoing, incoming, or both)
- `ALL_SHORTEST_PATHS`: the keyword to compute All Shortest Paths
- **startVertex** `TO` **targetVertex** (both string\|object): the two vertices between
  which the paths will be computed. This can be specified in the form of
  an ID string or in the form of a document with the attribute `_id`. All other
  values result in a warning and an empty result. If one of the specified
  documents does not exist, the result is empty as well and there is no warning.
- `GRAPH` **graphName** (string): the name identifying the named graph. Its vertex and
  edge collections will be looked up.

{{< info >}}
All Shortest Paths traversals do not support edge weights.
{{< /info >}}

### Working with collection sets

```aql
FOR path
  IN OUTBOUND|INBOUND|ANY ALL_SHORTEST_PATHS
  startVertex TO targetVertex
  edgeCollection1, ..., edgeCollectionN
```

Instead of `GRAPH graphName` you can specify a list of edge collections.
The involved vertex collections are determined by the edges of the given
edge collections.

### Traversing in mixed directions

For All Shortest Paths with a list of edge collections, you can optionally specify the
direction for some of the edge collections. Say, for example, you have three edge
collections *edges1*, *edges2* and *edges3*, where in *edges2* the direction
has no relevance, but in *edges1* and *edges3* the direction should be taken into
account. In this case you can use `OUTBOUND` as a general search direction and `ANY`
specifically for *edges2* as follows:

```aql
FOR path IN OUTBOUND ALL_SHORTEST_PATHS
  startVertex TO targetVertex
  edges1, ANY edges2, edges3
```

All collections in the list that do not specify their own direction will use the
direction defined after `IN` (here: `OUTBOUND`). This allows using a different
direction for each collection in your path search.

## Examples

Load an example graph to get a named graph that reflects some possible
train connections in Europe and North America:

![Train Connection Map](../../../images/train_map.png)

```js
---
name: GRAPHASP_01_create_graph
description: ''
---
~addIgnoreCollection("places");
~addIgnoreCollection("connections");
var examples = require("@arangodb/graph-examples/example-graph");
var graph = examples.loadGraph("kShortestPathsGraph");
db.places.toArray();
db.connections.toArray();
```

Suppose you want to query a route from **Carlisle** to **London**, and
compare the outputs of `SHORTEST_PATH`, `K_SHORTEST_PATHS` and `ALL_SHORTEST_PATHS`.
Note that `SHORTEST_PATH` returns any of the shortest paths, whereas
`ALL_SHORTEST_PATHS` returns all of them. `K_SHORTEST_PATHS` returns the
shortest paths first but continues with longer paths, until it has found all
routes or reaches the defined limit (the number of paths).

Using `SHORTEST_PATH` to get one shortest path:

```aql
---
name: GRAPHASP_01_Carlisle_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR v, e IN OUTBOUND SHORTEST_PATH 'places/Carlisle' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  RETURN { place: v.label }
```

Using `ALL_SHORTEST_PATHS` to get both shortest paths:

```aql
---
name: GRAPHASP_02_Carlisle_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND ALL_SHORTEST_PATHS 'places/Carlisle' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  RETURN { places: p.vertices[*].label }
```

Using `K_SHORTEST_PATHS` without a limit to get all paths in order of
increasing length:

```aql
---
name: GRAPHASP_03_Carlisle_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND K_SHORTEST_PATHS 'places/Carlisle' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  RETURN { places: p.vertices[*].label }
```

If you ask for routes that don't exist, you get an empty result
(from **Carlisle** to **Toronto**):

```aql
---
name: GRAPHASP_04_Carlisle_to_Toronto
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND ALL_SHORTEST_PATHS 'places/Carlisle' TO 'places/Toronto'
GRAPH 'kShortestPathsGraph'
  RETURN {
    places: p.vertices[*].label
  }
```

And finally clean up by removing the named graph:

```js
---
name: GRAPHASP_99_drop_graph
description: ''
---
var examples = require("@arangodb/graph-examples/example-graph");
examples.dropGraph("kShortestPathsGraph");
~removeIgnoreCollection("places");
~removeIgnoreCollection("connections");
```

diff --git a/site/content/3.10/aql/graphs/k-paths.md b/site/content/3.10/aql/graphs/k-paths.md
deleted file mode 100644
index d7e6aabe2a..0000000000
--- a/site/content/3.10/aql/graphs/k-paths.md
+++ /dev/null
@@ -1,237 +0,0 @@
---
title: k Paths in AQL
menuTitle: k Paths
weight: 30
description: >-
  Determine all paths between a start and end vertex limited by specified
  path lengths
---
## General query idea

This type of query finds all paths between two given documents,
*startVertex* and *targetVertex* in your graph. The paths are restricted
by a minimum and maximum length.

Every such path will be returned as a JSON object with two components:

- an array containing the `vertices` on the path
- an array containing the `edges` on the path

**Example**

Let us take a look at a simple example to explain how it works.
This is the graph that we are going to find some paths on:

![Train Connection Map](../../../images/train_map.png)

Each ellipse stands for a train station with the name of the city written inside
of it. They are the vertices of the graph. Arrows represent train connections
between cities and are the edges of the graph. The numbers near the arrows
describe how long it takes to get from one station to another. They are used
as edge weights.

Let us assume that we want to go from **Aberdeen** to **London** by train.

Here we have a couple of alternatives:

a) Straight way

   1. Aberdeen
   2. Leuchars
   3. Edinburgh
   4. York
   5. London

b) Detour at York

   1. Aberdeen
   2. Leuchars
   3. Edinburgh
   4. York
   5. **Carlisle**
   6. **Birmingham**
   7. London

c) Detour at Edinburgh

   1. Aberdeen
   2. Leuchars
   3. Edinburgh
   4. **Glasgow**
   5. **Carlisle**
   6. **Birmingham**
   7. London

d) Detour at Edinburgh to York

   1. Aberdeen
   2. Leuchars
   3. Edinburgh
   4. **Glasgow**
   5. **Carlisle**
   6. York
   7. London

Note that only paths that do not contain the same vertex twice are considered
valid. The following alternative, for example, visits Aberdeen twice and is
not returned by k Paths:

1. Aberdeen
2. **Inverness**
3. **Aberdeen**
4. Leuchars
5. Edinburgh
6. York
7. London

## Example Use Cases

The use cases for k Paths are about the same as for unweighted k Shortest Paths.
The main difference is that k Shortest Paths enumerates all paths with
**increasing length** and stops as soon as a given limit is reached, whereas
k Paths only enumerates **all paths** within a given range of path lengths
and is thereby upper-bounded.

The k Paths traversal can be used as a foundation for several other algorithms:

- **Transportation** of any kind (e.g. road traffic, network package routing)
- **Flow problems**: We need to transfer items from A to B. Which alternatives
  do we have? What is their capacity?

## Syntax

The syntax for k Paths queries is similar to the one for
[k Shortest Paths](k-shortest-paths.md), with the addition of defining the
minimum and maximum length of the path.

{{< warning >}}
It is highly recommended that you use a reasonable maximum path length or a
**LIMIT** statement, as k Paths is a potentially expensive operation. On large
connected graphs it can return a large number of paths.
{{< /warning >}}

### Working with named graphs

```aql
FOR path
  IN MIN..MAX OUTBOUND|INBOUND|ANY K_PATHS
  startVertex TO targetVertex
  GRAPH graphName
  [OPTIONS options]
```

- `FOR`: emits the variable **path** which contains one path as an object
  containing `vertices` and `edges` of the path.
- `IN` `MIN..MAX`: the minimal and maximal depth for the traversal:
  - **min** (number, *optional*): paths returned by this query have a length
    of at least *min* edges.
    If not specified, it defaults to 1. The minimal possible value is 0.
  - **max** (number, *optional*): paths returned by this query have a length
    of at most *max* edges.
    If omitted, *max* defaults to *min*. Thus, only the vertices and edges in
    the range of *min* are returned. *max* cannot be specified without *min*.
- `OUTBOUND|INBOUND|ANY`: defines in which direction
  edges are followed (outgoing, incoming, or both)
- `K_PATHS`: the keyword to compute all paths
- **startVertex** `TO` **targetVertex** (both string\|object): the two vertices
  between which the paths will be computed. This can be specified in the form of
  a document identifier string or in the form of an object with the attribute
  `_id`. All other values will lead to a warning and an empty result. This is
  also the case if one of the specified documents does not exist.
- `GRAPH` **graphName** (string): the name identifying the named graph.
  Its vertex and edge collections will be looked up.
- `OPTIONS` **options** (object, *optional*): used to modify the execution of
  the search. Right now there are no options that trigger an effect.
  However, this may change in the future.

### Working with collection sets

```aql
FOR path
  IN MIN..MAX OUTBOUND|INBOUND|ANY K_PATHS
  startVertex TO targetVertex
  edgeCollection1, ..., edgeCollectionN
  [OPTIONS options]
```

Instead of `GRAPH graphName` you can specify a list of edge collections.
The involved vertex collections are determined by the edges of the given
edge collections.
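
For instance, a minimal sketch of the collection-set variant, assuming the
`places` and `connections` collections from the example dataset used below:

```aql
FOR path IN 1..10 OUTBOUND K_PATHS
  'places/Aberdeen' TO 'places/London'
  connections
  RETURN path.vertices[*].label
```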

### Traversing in mixed directions

For k Paths with a list of edge collections, you can optionally specify the
direction for some of the edge collections. Say, for example, you have three edge
collections *edges1*, *edges2* and *edges3*, where in *edges2* the direction
has no relevance, but in *edges1* and *edges3* the direction should be taken
into account. In this case you can use `OUTBOUND` as the general search direction
and `ANY` specifically for *edges2* as follows:

```aql
FOR path IN MIN..MAX OUTBOUND K_PATHS
  startVertex TO targetVertex
  edges1, ANY edges2, edges3
```

All collections in the list that do not specify their own direction will use the
direction defined after `IN` (here: `OUTBOUND`). This allows using a different
direction for each collection in your path search.

## Examples

We load an example graph to get a named graph that reflects some possible
train connections in Europe and North America.

![Train Connection Map](../../../images/train_map.png)

```js
---
name: GRAPHKP_01_create_graph
description: ''
---
~addIgnoreCollection("places");
~addIgnoreCollection("connections");
var examples = require("@arangodb/graph-examples/example-graph");
var graph = examples.loadGraph("kShortestPathsGraph");
db.places.toArray();
db.connections.toArray();
```

Suppose we want to query all routes from **Aberdeen** to **London**.

```aql
---
name: GRAPHKP_01_Aberdeen_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR p IN 1..10 OUTBOUND K_PATHS 'places/Aberdeen' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  RETURN { places: p.vertices[*].label, travelTimes: p.edges[*].travelTime }
```

If we ask for routes that don't exist, we get an empty result
(from **Aberdeen** to **Toronto**):

```aql
---
name: GRAPHKP_02_Aberdeen_to_Toronto
description: ''
dataset: kShortestPathsGraph
---
FOR p IN 1..10 OUTBOUND K_PATHS 'places/Aberdeen' TO 'places/Toronto'
GRAPH 'kShortestPathsGraph'
  RETURN { places: p.vertices[*].label, travelTimes: p.edges[*].travelTime }
```

And finally clean up by removing the named graph:

```js
---
name: GRAPHKP_99_drop_graph
description: ''
---
var examples = require("@arangodb/graph-examples/example-graph");
examples.dropGraph("kShortestPathsGraph");
~removeIgnoreCollection("places");
~removeIgnoreCollection("connections");
```

diff --git a/site/content/3.10/aql/graphs/k-shortest-paths.md b/site/content/3.10/aql/graphs/k-shortest-paths.md
deleted file mode 100644
index bb2ba93017..0000000000
--- a/site/content/3.10/aql/graphs/k-shortest-paths.md
+++ /dev/null
@@ -1,295 +0,0 @@
---
title: k Shortest Paths in AQL
menuTitle: k Shortest Paths
weight: 25
description: >-
  Determine a specified number of shortest paths in increasing path length or
  weight order
---
## General query idea

This type of query finds the first *k* paths in order of length
(or weight) between two given documents, *startVertex* and *targetVertex* in
your graph.

Every such path will be returned as a JSON object with three components:

- an array containing the `vertices` on the path
- an array containing the `edges` on the path
- the `weight` of the path, that is, the sum of all edge weights

If no `weightAttribute` is given, the weight of the path is just its length.

{{< youtube id="XdITulJFdVo" >}}

**Example**

Let us take a look at a simple example to explain how it works.
This is the graph that we are going to find some shortest paths on:

![Train Connection Map](../../../images/train_map.png)

Each ellipse stands for a train station with the name of the city written inside
of it. They are the vertices of the graph. Arrows represent train connections
between cities and are the edges of the graph. The numbers near the arrows
describe how long it takes to get from one station to another. They are used
as edge weights.

Let us assume that we want to go from **Aberdeen** to **London** by train.

We expect to see the following vertices on *the* shortest path, in this order:

1. Aberdeen
2. Leuchars
3. Edinburgh
4. York
5. London

By the way, the weight of the path is: 1.5 + 1.5 + 3.5 + 1.8 = **8.3**.

Let us look at alternative paths next, for example, because we know that the
direct connection between York and London does not operate currently.
An alternative path, which is slightly longer, goes like this:

1. Aberdeen
2. Leuchars
3. Edinburgh
4. York
5. **Carlisle**
6. **Birmingham**
7. London

Its weight is: 1.5 + 1.5 + 3.5 + 2.0 + 1.5 = **10.0**.

Another route goes via Glasgow. There are seven stations on the path as well;
however, it is quicker if we compare the edge weights:

1. Aberdeen
2. Leuchars
3. Edinburgh
4. **Glasgow**
5. Carlisle
6. Birmingham
7. London

The path weight is lower: 1.5 + 1.5 + 1.0 + 1.0 + 2.0 + 1.5 = **8.5**.

## Syntax

The syntax for k Shortest Paths queries is similar to the one for
[Shortest Path](shortest-path.md) and there are also two options to
either use a named graph or a set of edge collections. However, it only emits
a path variable, whereas `SHORTEST_PATH` emits a vertex and an edge variable.

{{< warning >}}
It is highly recommended that you use a **LIMIT** statement, as
k Shortest Paths is a potentially expensive operation. On large connected
graphs it can return a large number of paths, or perform an expensive
(but unsuccessful) search for more short paths.
{{< /warning >}}

### Working with named graphs

```aql
FOR path
  IN OUTBOUND|INBOUND|ANY K_SHORTEST_PATHS
  startVertex TO targetVertex
  GRAPH graphName
  [OPTIONS options]
  [LIMIT offset, count]
```

- `FOR`: emits the variable **path** which contains one path as an object containing
  `vertices`, `edges`, and the `weight` of the path.
- `IN` `OUTBOUND|INBOUND|ANY`: defines in which direction
  edges are followed (outgoing, incoming, or both)
- `K_SHORTEST_PATHS`: the keyword to compute k Shortest Paths
- **startVertex** `TO` **targetVertex** (both string\|object): the two vertices between
  which the paths will be computed. This can be specified in the form of
  an ID string or in the form of a document with the attribute `_id`. All other
  values will lead to a warning and an empty result. If one of the specified
  documents does not exist, the result is empty as well and there is no warning.
- `GRAPH` **graphName** (string): the name identifying the named graph. Its vertex and
  edge collections will be looked up.
- `OPTIONS` **options** (object, *optional*): used to modify the execution of the
  traversal. Only the following attributes have an effect, all others are ignored:
  - **weightAttribute** (string): a top-level edge attribute that should be used
    to read the edge weight. If the attribute does not exist or is not numeric, the
    *defaultWeight* will be used instead. The attribute value must not be negative.
  - **defaultWeight** (number): this value will be used as fallback if there is
    no *weightAttribute* in the edge document, or if it's not a number. The value
    must not be negative. The default is `1`.
- `LIMIT` (see [LIMIT operation](../high-level-operations/limit.md), *optional*):
  the maximal number of paths to return. It is highly recommended to use
  a `LIMIT` for `K_SHORTEST_PATHS`.

{{< info >}}
k Shortest Paths traversals do not support negative weights. If a document
attribute (as specified by `weightAttribute`) with a negative value is
encountered during traversal, or if `defaultWeight` is set to a negative
number, then the query is aborted with an error.
{{< /info >}}

### Working with collection sets

```aql
FOR path
  IN OUTBOUND|INBOUND|ANY K_SHORTEST_PATHS
  startVertex TO targetVertex
  edgeCollection1, ..., edgeCollectionN
  [OPTIONS options]
  [LIMIT offset, count]
```

Instead of `GRAPH graphName` you can specify a list of edge collections.
The involved vertex collections are determined by the edges of the given
edge collections.

### Traversing in mixed directions

For k Shortest Paths with a list of edge collections, you can optionally specify the
direction for some of the edge collections. Say, for example, you have three edge
collections *edges1*, *edges2* and *edges3*, where in *edges2* the direction
has no relevance, but in *edges1* and *edges3* the direction should be taken into
account. In this case you can use `OUTBOUND` as the general search direction and `ANY`
specifically for *edges2* as follows:

```aql
FOR path IN OUTBOUND K_SHORTEST_PATHS
  startVertex TO targetVertex
  edges1, ANY edges2, edges3
```

All collections in the list that do not specify their own direction will use the
direction defined after `IN` (here: `OUTBOUND`). This allows using a different
direction for each collection in your path search.

## Examples

We load an example graph to get a named graph that reflects some possible
train connections in Europe and North America.

![Train Connection Map](../../../images/train_map.png)

```js
---
name: GRAPHKSP_01_create_graph
description: ''
---
~addIgnoreCollection("places");
~addIgnoreCollection("connections");
var examples = require("@arangodb/graph-examples/example-graph");
var graph = examples.loadGraph("kShortestPathsGraph");
db.places.toArray();
db.connections.toArray();
```

Suppose we want to query a route from **Aberdeen** to **London**, and
compare the outputs of `SHORTEST_PATH` and `K_SHORTEST_PATHS` with
`LIMIT 1`. Note that while `SHORTEST_PATH` and `K_SHORTEST_PATHS` with
`LIMIT 1` should return a path of the same length (or weight), they do
not need to return the same path.

Using `SHORTEST_PATH`:

```aql
---
name: GRAPHKSP_01_Aberdeen_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR v, e IN OUTBOUND SHORTEST_PATH 'places/Aberdeen' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  RETURN { place: v.label, travelTime: e.travelTime }
```

Using `K_SHORTEST_PATHS`:

```aql
---
name: GRAPHKSP_02_Aberdeen_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND K_SHORTEST_PATHS 'places/Aberdeen' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  LIMIT 1
  RETURN { places: p.vertices[*].label, travelTimes: p.edges[*].travelTime }
```

With `K_SHORTEST_PATHS`, we can ask for more than one option for a route:

```aql
---
name: GRAPHKSP_03_Aberdeen_to_London
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND K_SHORTEST_PATHS 'places/Aberdeen' TO 'places/London'
GRAPH 'kShortestPathsGraph'
  LIMIT 3
  RETURN {
    places: p.vertices[*].label,
    travelTimes: p.edges[*].travelTime,
    travelTimeTotal: SUM(p.edges[*].travelTime)
  }
```

If we ask for routes that don't exist, we get an empty result
(from **Aberdeen** to **Toronto**):

```aql
---
name: GRAPHKSP_04_Aberdeen_to_Toronto
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND K_SHORTEST_PATHS 'places/Aberdeen' TO 'places/Toronto'
GRAPH 'kShortestPathsGraph'
  LIMIT 3
  RETURN {
    places: p.vertices[*].label,
    travelTimes: p.edges[*].travelTime,
    travelTimeTotal: SUM(p.edges[*].travelTime)
  }
```

We can use the *travelTime* attribute of the connections as edge weights to
take into account which connections are quicker. A high default weight is set
to be used if an edge has no *travelTime* attribute (not the case in the
example graph). This returns the top three routes with the fewest changes,
favoring the least travel time, for the connection **Saint Andrews**
to **Cologne**:

```aql
---
name: GRAPHKSP_05_StAndrews_to_Cologne
description: ''
dataset: kShortestPathsGraph
---
FOR p IN OUTBOUND K_SHORTEST_PATHS 'places/StAndrews' TO 'places/Cologne'
GRAPH 'kShortestPathsGraph'
OPTIONS {
  weightAttribute: 'travelTime',
  defaultWeight: 15
}
  LIMIT 3
  RETURN {
    places: p.vertices[*].label,
    travelTimes: p.edges[*].travelTime,
    travelTimeTotal: SUM(p.edges[*].travelTime)
  }
```

And finally clean up by removing the named graph:

```js
---
name: GRAPHKSP_99_drop_graph
description: ''
---
var examples = require("@arangodb/graph-examples/example-graph");
examples.dropGraph("kShortestPathsGraph");
~removeIgnoreCollection("places");
~removeIgnoreCollection("connections");
```

diff --git a/site/content/3.10/aql/graphs/shortest-path.md b/site/content/3.10/aql/graphs/shortest-path.md
deleted file mode 100644
index 29d689422b..0000000000
--- a/site/content/3.10/aql/graphs/shortest-path.md
+++ /dev/null
@@ -1,209 +0,0 @@
---
title: Shortest Path in AQL
menuTitle: Shortest Path
weight: 15
description: >-
  With the shortest path algorithm, you can find one shortest path between
  two vertices using AQL
---
## General query idea

This type of query finds the shortest path between two given documents
(*startVertex* and *targetVertex*) in your graph. For all vertices on this
shortest path, you get a result in the form of a set with two items:

1. The vertex on this path.
2. The edge pointing to it.

### Example execution

Let's take a look at a simple example to explain how it works.
This is the graph that you are going to find a shortest path on:

![traversal graph](../../../images/traversal_graph.png)

You can use the following parameters for the query:

1. You start at the vertex **A**.
2. You finish with the vertex **D**.

So, obviously, you have the vertices **A**, **B**, **C** and **D** on the
shortest path in exactly this order. Then, the shortest path statement
returns the following pairs:

| Vertex | Edge  |
|--------|-------|
| A      | null  |
| B      | A → B |
| C      | B → C |
| D      | C → D |

Note that the first edge is always `null` because there is no edge pointing
to the *startVertex*.

## Syntax

The next step is to see how you can write a shortest path query.
You have two options here: you can either use a named graph or a set of edge
collections (anonymous graph).

### Working with named graphs

```aql
FOR vertex[, edge]
  IN OUTBOUND|INBOUND|ANY SHORTEST_PATH
  startVertex TO targetVertex
  GRAPH graphName
  [OPTIONS options]
```

- `FOR`: emits up to two variables:
  - **vertex** (object): the current vertex on the shortest path
  - **edge** (object, *optional*): the edge pointing to the vertex
- `IN` `OUTBOUND|INBOUND|ANY`: defines in which direction edges are followed
  (outgoing, incoming, or both)
- **startVertex** `TO` **targetVertex** (both string\|object): the two vertices between
  which the shortest path will be computed. This can be specified in the form of
  an ID string or in the form of a document with the attribute `_id`. All other
  values will lead to a warning and an empty result. If one of the specified
  documents does not exist, the result is empty as well and there is no warning.
- `GRAPH` **graphName** (string): the name identifying the named graph. Its vertex and
  edge collections will be looked up.
- `OPTIONS` **options** (object, *optional*): used to modify the execution of the
  traversal. Only the following attributes have an effect, all others are ignored:
  - **weightAttribute** (string): a top-level edge attribute that should be used
    to read the edge weight. If the attribute does not exist or is not numeric, the
    *defaultWeight* will be used instead. The attribute value must not be negative.
  - **defaultWeight** (number): this value will be used as fallback if there is
    no *weightAttribute* in the edge document, or if it is not a number.
    The value must not be negative. The default is `1`.

{{< info >}}
Shortest Path traversals do not support negative weights. If a document
attribute (as specified by `weightAttribute`) with a negative value is
encountered during traversal, or if `defaultWeight` is set to a negative
number, then the query is aborted with an error.
{{< /info >}}
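
For example, a weighted shortest path query could look like the following
sketch. It assumes edges with a numeric `travelTime` attribute, as in the
train connection graph used on the related k Shortest Paths page; the graph
name, attribute name, and default weight are illustrative:

```aql
FOR v, e IN OUTBOUND SHORTEST_PATH
  'places/Aberdeen' TO 'places/London'
  GRAPH 'kShortestPathsGraph'
  OPTIONS { weightAttribute: "travelTime", defaultWeight: 10 }
  RETURN { place: v.label, travelTime: e.travelTime }
```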

### Working with collection sets

```aql
FOR vertex[, edge]
  IN OUTBOUND|INBOUND|ANY SHORTEST_PATH
  startVertex TO targetVertex
  edgeCollection1, ..., edgeCollectionN
  [OPTIONS options]
```

Instead of `GRAPH graphName` you may specify a list of edge collections (anonymous
graph). The involved vertex collections are determined by the edges of the given
edge collections. The rest of the behavior is similar to the named version.

### Traversing in mixed directions

For shortest path with a list of edge collections, you can optionally specify the
direction for some of the edge collections. Say, for example, you have three edge
collections *edges1*, *edges2* and *edges3*, where in *edges2* the direction
has no relevance, but in *edges1* and *edges3* the direction should be taken into
account. In this case you can use `OUTBOUND` as the general search direction and `ANY`
specifically for *edges2* as follows:

```aql
FOR vertex IN OUTBOUND SHORTEST_PATH
  startVertex TO targetVertex
  edges1, ANY edges2, edges3
```

All collections in the list that do not specify their own direction will use the
direction defined after `IN` (here: `OUTBOUND`). This allows using a different
direction for each collection in your path search.

## Conditional shortest path

The `SHORTEST_PATH` computation only finds an unconditioned shortest path.
With this construct it is not possible to define a condition like: "Find the
shortest path where all edges are of type *X*". If you want to do this, use a
normal [Traversal](traversals.md) instead with the option
`{order: "bfs"}` in combination with `LIMIT 1`.

Please also consider using [`WITH`](../high-level-operations/with.md) to specify the
collections you expect to be involved.

## Examples

Creating a simple symmetric traversal demonstration graph:

![traversal graph](../../../images/traversal_graph.png)

```js
---
name: GRAPHSP_01_create_graph
description: ''
---
~addIgnoreCollection("circles");
~addIgnoreCollection("edges");
var examples = require("@arangodb/graph-examples/example-graph");
var graph = examples.loadGraph("traversalGraph");
db.circles.toArray();
db.edges.toArray();
```

Start with the shortest path from **A** to **D** as above:

```js
---
name: GRAPHSP_02_A_to_D
description: ''
---
db._query(`
  FOR v, e IN OUTBOUND SHORTEST_PATH 'circles/A' TO 'circles/D' GRAPH 'traversalGraph'
    RETURN [v._key, e._key]
`);

db._query(`
  FOR v, e IN OUTBOUND SHORTEST_PATH 'circles/A' TO 'circles/D' edges
    RETURN [v._key, e._key]
`);
```

You can see that the expectations are fulfilled. You find the vertices in the
correct order, and the first edge is *null* because no edge points
to the start vertex on this path.

You can also compute shortest paths based on documents found in collections:

```js
---
name: GRAPHSP_03_A_to_D
description: ''
---
db._query(`
  FOR a IN circles
    FILTER a._key == 'A'
    FOR d IN circles
      FILTER d._key == 'D'
      FOR v, e IN OUTBOUND SHORTEST_PATH a TO d GRAPH 'traversalGraph'
        RETURN [v._key, e._key]
`);

db._query(`
  FOR a IN circles
    FILTER a._key == 'A'
    FOR d IN circles
      FILTER d._key == 'D'
      FOR v, e IN OUTBOUND SHORTEST_PATH a TO d edges
        RETURN [v._key, e._key]
`);
```

And finally clean it up again:

```js
---
name: GRAPHSP_99_drop_graph
description: ''
---
var examples = require("@arangodb/graph-examples/example-graph");
examples.dropGraph("traversalGraph");
~removeIgnoreCollection("circles");
~removeIgnoreCollection("edges");
```

diff --git a/site/content/3.10/aql/graphs/traversals-explained.md b/site/content/3.10/aql/graphs/traversals-explained.md
deleted file mode 100644
index a211ae6087..0000000000
--- a/site/content/3.10/aql/graphs/traversals-explained.md
+++ /dev/null
@@ -1,85 +0,0 @@
---
title: AQL graph traversals explained
menuTitle: Traversals explained
weight: 5
description: >-
  Traversing a graph means following edges connected to a start vertex and
  neighboring vertices up to a specified depth
---
## General query idea

A traversal starts at one specific document (*startVertex*) and follows all
edges connected to this document. For all documents (*vertices*) that are
targeted by these edges, it again follows all edges connected to them, and
so on.
It is possible to define how many of these iterations should be
executed at least (*min* depth) and at most (*max* depth).

For all vertices that were visited during this process in the range between
*min* depth and *max* depth iterations, you get a result in the form of a
set with three items:

1. The visited vertex.
2. The edge pointing to it.
3. The complete path from startVertex to the visited vertex as an object with
   an attribute *edges* and an attribute *vertices*, each a list of the
   corresponding elements. These lists are sorted, which means the first
   element in *vertices* is the *startVertex* and the last is the visited
   vertex, and the n-th element in *edges* connects the n-th element with the
   (n+1)-th element in *vertices*.

## Example execution

Let's take a look at a simple example to explain how it works.
This is the graph that we are going to traverse:

![traversal graph](../../../images/traversal_graph.png)

We use the following parameters for our query:

1. We start at the vertex **A**.
2. We use a *min* depth of 1.
3. We use a *max* depth of 2.
4. We follow edges in the `OUTBOUND` direction only.

![traversal graph step 1](../../../images/traversal_graph1.png)

Now it walks to one of the direct neighbors of **A**, say **B** (note: the
ordering is not guaranteed!):

![traversal graph step 2](../../../images/traversal_graph2.png)

The query remembers the state (red circle) and emits the first result
**A** → **B** (black box). This also prevents the traverser from being trapped
in cycles. Now it again visits one of the direct neighbors of **B**, say **E**:

![traversal graph step 3](../../../images/traversal_graph3.png)

We have limited the query with a *max* depth of *2*, so it will not pick any
neighbor of **E**, as the path from **A** to **E** already requires *2* steps.
Instead, we will go back one level to **B** and continue with any other direct
neighbor there:

![traversal graph step 4](../../../images/traversal_graph4.png)

After producing this result, we again step back to **B**.
But there is no neighbor of **B** left that we have not yet visited.
Hence, we go another step back to **A** and continue with any other neighbor
there.

![traversal graph step 5](../../../images/traversal_graph5.png)

Identical to the iterations before, we visit **H**:

![traversal graph step 6](../../../images/traversal_graph6.png)

And **J**:

![traversal graph step 7](../../../images/traversal_graph7.png)

After these steps, there are no further results left. So altogether this query
has returned the following paths:

1. **A** → **B**
2. **A** → **B** → **E**
3. **A** → **B** → **C**
4. **A** → **G**
5. **A** → **G** → **H**
6. **A** → **G** → **J**
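
The walkthrough above corresponds to a traversal like the following sketch,
assuming the example graph from [Graph traversals in AQL](traversals.md) is
loaded:

```aql
FOR vertex, edge, path IN 1..2 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
  RETURN CONCAT_SEPARATOR(" -> ", path.vertices[*]._key)
```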

diff --git a/site/content/3.10/aql/graphs/traversals.md b/site/content/3.10/aql/graphs/traversals.md
deleted file mode 100644
index 283703f0b7..0000000000
--- a/site/content/3.10/aql/graphs/traversals.md
+++ /dev/null
@@ -1,847 +0,0 @@
---
title: Graph traversals in AQL
menuTitle: Traversals
weight: 10
description: >-
  You can traverse named graphs and anonymous graphs with a native AQL
  language construct
---
## Syntax

There are two slightly different syntaxes for traversals in AQL: one for
[named graphs](../../graphs/_index.md#named-graphs) and another for a
[set of edge collections](#working-with-collection-sets)
([anonymous graph](../../graphs/_index.md#anonymous-graphs)).

### Working with named graphs

The syntax for AQL graph traversals using named graphs is as follows
(square brackets denote optional parts and `|` denotes alternatives):

```aql
FOR vertex[, edge[, path]]
  IN [min[..max]]
  OUTBOUND|INBOUND|ANY startVertex
  GRAPH graphName
  [PRUNE [pruneVariable = ]pruneCondition]
  [OPTIONS options]
```

- `FOR`: emits up to three variables:
  - **vertex** (object): the current vertex in a traversal
  - **edge** (object, *optional*): the current edge in a traversal
  - **path** (object, *optional*): representation of the current path with
    two members:
    - `vertices`: an array of all vertices on this path
    - `edges`: an array of all edges on this path
- `IN` `min..max`: the minimal and maximal depth for the traversal:
  - **min** (number, *optional*): edges and vertices returned by this query
    start at the traversal depth of *min* (thus edges and vertices below it are
    not returned). If not specified, it defaults to 1. The minimal
    possible value is 0.
  - **max** (number, *optional*): up to *max* length paths are traversed.
    If omitted, *max* defaults to *min*. Thus only the vertices and edges in
    the range of *min* are returned. *max* cannot be specified without *min*.
- `OUTBOUND|INBOUND|ANY`: follow outgoing, incoming, or edges pointing in either
  direction in the traversal. Note that this can't be replaced by a bind parameter.
- **startVertex** (string\|object): a vertex where the traversal originates from.
  This can be specified in the form of an ID string or in the form of a document
  with the `_id` attribute. All other values lead to a warning and an empty
  result. If the specified document does not exist, the result is empty as well
  and there is no warning.
- `GRAPH` **graphName** (string): the name identifying the named graph.
  Its vertex and edge collections are looked up. Note that the graph name
  is like a regular string, hence it must be enclosed by quote marks, like
  `GRAPH "graphName"`.
- `PRUNE` **expression** (AQL expression, *optional*):
  An expression, like in a `FILTER` statement, which is evaluated in every step of
  the traversal, as early as possible. The semantics of this expression are as follows:
  - If the expression evaluates to `false`, the traversal continues on the current path.
  - If the expression evaluates to `true`, the traversal does not continue on the
    current path. However, the paths up to this point are considered as a result
    (they might still be post-filtered or ignored due to depth constraints).
    For example, a traversal over the graph `(A) -> (B) -> (C)` starting at `A`
    and pruning on `B` results in `(A)` and `(A) -> (B)` being valid paths,
    whereas `(A) -> (B) -> (C)` is not returned because it gets pruned on `B`.

  You can only use a single `PRUNE` clause per `FOR` traversal operation, but
  the prune expression can contain an arbitrary number of conditions using `AND`
  and `OR` statements for complex expressions. You can use the variables emitted
  by the `FOR` operation in the prune expression, as well as all variables
  defined before the traversal.

  You can optionally assign the prune expression to a variable like
  `PRUNE var = pruneCondition` to use the evaluated result elsewhere in the
  query, typically in a `FILTER` expression.

  See [Pruning](#pruning) for details.
- `OPTIONS` **options** (object, *optional*): used to modify the execution of the
  traversal.
Only the following attributes have an effect, all others are ignored: - - **order** (string): optionally specify which traversal algorithm to use - - `"bfs"` – the traversal is executed breadth-first. The results - first contain all vertices at depth 1, then all vertices at depth 2 and so on. - - `"dfs"` (default) – the traversal is executed depth-first. It - first returns all paths from *min* depth to *max* depth for one vertex at - depth 1, then for the next vertex at depth 1 and so on. - - `"weighted"` - the traversal is a weighted traversal - (introduced in v3.8.0). Paths are enumerated with increasing cost. - Also see `weightAttribute` and `defaultWeight`. A returned path has an - additional attribute `weight` containing the cost of the path after every - step. The order of paths having the same cost is non-deterministic. - Negative weights are not supported and abort the query with an error. - - **bfs** (bool): deprecated, use `order: "bfs"` instead. - - **uniqueVertices** (string): optionally ensure vertex uniqueness - - `"path"` – it is guaranteed that there is no path returned with a duplicate vertex - - `"global"` – it is guaranteed that each vertex is visited at most once during - the traversal, no matter how many paths lead from the start vertex to this one. - If you start with a `min depth > 1` a vertex that was found before *min* depth - might not be returned at all (it still might be part of a path). - It is required to set `order: "bfs"` or `order: "weighted"` because with - depth-first search the results would be unpredictable. **Note:** - Using this configuration the result is not deterministic any more. If there - are multiple paths from *startVertex* to *vertex*, one of those is picked. - In case of a `weighted` traversal, the path with the lowest weight is - picked, but in case of equal weights it is undefined which one is chosen. - - `"none"` (default) – no uniqueness check is applied on vertices - - **uniqueEdges** (string): optionally ensure edge uniqueness - - `"path"` (default) – it is guaranteed that there is no path returned with a - duplicate edge - - `"none"` – no uniqueness check is applied on edges. **Note:** - Using this configuration, the traversal follows edges in cycles. - - **edgeCollections** (string\|array): Optionally restrict edge - collections the traversal may visit (introduced in v3.7.0). If omitted, - or an empty array is specified, then there are no restrictions. - - A string parameter is treated as the equivalent of an array with a single - element. - - Each element of the array should be a string containing the name of an - edge collection. - - **vertexCollections** (string\|array): Optionally restrict vertex - collections the traversal may visit (introduced in v3.7.0). If omitted, - or an empty array is specified, then there are no restrictions. - - A string parameter is treated as the equivalent of an array with a single - element. - - Each element of the array should be a string containing the name of a - vertex collection. - - The starting vertex is always allowed, even if it does not belong to one - of the collections specified by a restriction. - - **parallelism** (number, *optional*): - - {{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}} - - Optionally parallelize traversal execution. If omitted or set to a value of `1`, - traversal execution is not parallelized. If set to a value greater than `1`, - then up to that many worker threads can be used for concurrently executing - the traversal. 
The value is capped by the number of available cores on the - target machine. - - Parallelizing a traversal is normally useful when there are many inputs (start - vertices) that the nested traversal can work on concurrently. This is often the - case when a nested traversal is fed with several tens of thousands of start - vertices, which can then be distributed randomly to worker threads for parallel - execution. - - **maxProjections** (number, *optional*): - - {{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}} - - Specifies the number of document attributes per `FOR` loop to be used as - projections. The default value is `5`. - - **weightAttribute** (string, *optional*): Specifies the name of an attribute - that is used to look up the weight of an edge. If no attribute is specified - or if it is not present in the edge document then the `defaultWeight` is used. - The attribute value must not be negative. - - **defaultWeight** (number, *optional*): Specifies the default weight of an edge. - The value must not be negative. The default value is `1`. - -{{< info >}} -Weighted traversals do not support negative weights. If a document -attribute (as specified by `weightAttribute`) with a negative value is -encountered during traversal, or if `defaultWeight` is set to a negative -number, then the query is aborted with an error. -{{< /info >}} - -### Working with collection sets - -The syntax for AQL graph traversals using collection sets is as follows -(square brackets denote optional parts and `|` denotes alternatives): - -```aql -[WITH vertexCollection1[, vertexCollection2[, vertexCollectionN]]] -FOR vertex[, edge[, path]] - IN [min[..max]] - OUTBOUND|INBOUND|ANY startVertex - edgeCollection1[, edgeCollection2[, edgeCollectionN]] - [PRUNE [pruneVariable = ]pruneCondition] - [OPTIONS options] -``` - -- `WITH`: Declaration of collections. Optional for single server instances, but - required for [graph traversals in a cluster](#graph-traversals-in-a-cluster). - Needs to be placed at the very beginning of the query. - - **collections** (collection, *repeatable*): list of vertex collections that - are involved in the traversal -- **edgeCollections** (collection, *repeatable*): One or more edge collections - to use for the traversal (instead of using a named graph with `GRAPH graphName`). - Vertex collections are determined by the edges in the edge collections. - - You can override the default traversal direction by setting `OUTBOUND`, - `INBOUND`, or `ANY` before any of the edge collections. - - If the same edge collection is specified multiple times, it behaves as if it - were specified only once. Specifying the same edge collection is only allowed - when the collections do not have conflicting traversal directions. - - Views cannot be used as edge collections. -- See the [named graph variant](#working-with-named-graphs) for the remaining - traversal parameters. The `edgeCollections` restriction option is redundant in - this case. - -### Traversing in mixed directions - -For traversals with a list of edge collections you can optionally specify the -direction for some of the edge collections. Say for example you have three edge -collections *edges1*, *edges2* and *edges3*, where in *edges2* the direction has -no relevance but in *edges1* and *edges3* the direction should be taken into account. 

In this case you can use `OUTBOUND` as the general traversal direction and `ANY`
specifically for *edges2* as follows:

```aql
FOR vertex IN OUTBOUND
  startVertex
  edges1, ANY edges2, edges3
```

All collections in the list that do not specify their own direction use the
direction defined after `IN`. This allows using a different direction for each
collection in your traversal.

### Graph traversals in a cluster

Due to the nature of graphs, edges may reference vertices from arbitrary
collections. Following the paths can thus involve documents from various
collections and it is not possible to predict which are visited in a
traversal. Which collections need to be loaded by the graph engine can only be
determined at run time.

Use the [`WITH` statement](../high-level-operations/with.md) to specify the collections you
expect to be involved. This is required for traversals using collection sets
in cluster deployments.

## Pruning

You can define stop conditions for graph traversals to return specific data and
to improve the query performance. This is called _pruning_ and works by checking
conditions during the traversal as opposed to filtering the results afterwards
(post-filtering). This reduces the amount of data to be checked by stopping the
traversal down specific paths early.

{{< youtube id="4LVeeC0ciCQ" >}}

You can specify one `PRUNE` expression per graph traversal, but it can contain
an arbitrary number of conditions. You can use the vertex, edge, and path
variables emitted by the traversal in a prune expression, as well as all other
variables defined before the `FOR` operation. Note that `PRUNE` is an optional
clause of the `FOR` operation and that the `OPTIONS` clause needs to be placed
after `PRUNE`.

```aql
---
name: GRAPHTRAV_graphPruneExample1
description: ''
dataset: kShortestPathsGraph
---
FOR v, e, p IN 0..10 OUTBOUND "places/Toronto" GRAPH "kShortestPathsGraph"
  PRUNE v.label == "Edmonton"
  OPTIONS { uniqueVertices: "path" }
  RETURN CONCAT_SEPARATOR(" -- ", p.vertices[*].label)
```

The above example shows a graph traversal using a
[train station and connections dataset](../../graphs/example-graphs.md#k-shortest-paths-graph):

![Train Connection Map](../../../images/train_map.png)

The traversal starts at **Toronto** (bottom left), the traversal depth is
limited to 10, and every station is only visited once. The traversal could
continue up to **Vancouver** (bottom right) at depth 5, but it is stopped early
on this path (the only path in this example) at **Edmonton** because of the
prune expression.

The traversal along paths is stopped as soon as the prune expression evaluates
to `true` for a given path. The current depth is still included in the result,
however. This can be seen in the query result of the example which includes the
Edmonton vertex at which it stopped.

The following example starts a traversal at **London** (middle right), with a
depth between 2 and 3, and every station is only visited once.
The station names -as well as the travel times are returned: - -```aql ---- -name: GRAPHTRAV_graphPruneExample2 -description: '' -dataset: kShortestPathsGraph ---- -FOR v, e, p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph" - OPTIONS { uniqueVertices: "path" } - RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime)) -``` - -The same example with an added prune expression, with vertex and edge conditions: - -```aql ---- -name: GRAPHTRAV_graphPruneExample3 -description: '' -dataset: kShortestPathsGraph ---- -FOR v, e, p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph" - PRUNE v.label == "Carlisle" OR e.travelTime > 3 - OPTIONS { uniqueVertices: "path" } - RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime)) -``` - -If either the **Carlisle** vertex or an edge with a travel time of over three -hours is encountered, the subsequent paths are pruned. In the example, this -removes the train connections to **Birmingham**, **Glasgow**, and **York**, -which come after **Carlisle**, as well as the connections to and via -**Edinburgh** because of the four hour duration for the section from **York** -to **Edinburgh**. - -If your graph is comprised of multiple vertex or edge collections, you can -also prune as soon as you reach a certain collection, using a condition like -`PRUNE IS_SAME_COLLECTION("stopCollection", v)`. - -If you want to only return the results of the depth at which the traversal -stopped due to the prune expression, you can use a `FILTER` in addition. You can -assign the evaluated result of a prune expression to a variable -(`PRUNE var = `) and use it for filtering: - -```aql ---- -name: GRAPHTRAV_graphPruneExample4 -description: '' -dataset: kShortestPathsGraph ---- -FOR v, e, p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph" - PRUNE cond = v.label == "Carlisle" OR e.travelTime > 3 - OPTIONS { uniqueVertices: "path" } - FILTER cond - RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime)) -``` - -Only paths that end at **Carlisle** or with the last edge having a travel time -of over three hours are returned. This excludes the connection to **Cologne** -from the results compared to the previous query. - -If you want to exclude the depth at which the prune expression stopped the -traversal, you can assign the expression to a variable and use its negated value -in a `FILTER`: - -```aql ---- -name: GRAPHTRAV_graphPruneExample5 -description: '' -dataset: kShortestPathsGraph ---- -FOR v, e, p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph" - PRUNE cond = v.label == "Carlisle" OR e.travelTime > 3 - OPTIONS { uniqueVertices: "path" } - FILTER NOT cond - RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime)) -``` - -This only returns the connection to **Cologne**, which is the opposite of the -previous example. - -You may combine the prune variable with arbitrary other conditions in a `FILTER` -operation. 
For example, you can remove results where the last edge has a lower
travel time than the second to last edge of the path:

```aql
---
name: GRAPHTRAV_graphPruneExample6
description: ''
dataset: kShortestPathsGraph
---
FOR v, e, p IN 2..5 OUTBOUND "places/London" GRAPH "kShortestPathsGraph"
  PRUNE cond = v.label == "Carlisle" OR e.travelTime > 3
  OPTIONS { uniqueVertices: "path" }
  FILTER cond AND p.edges[-1].travelTime >= p.edges[-2].travelTime
  RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime))
```

{{< info >}}
The prune expression is **evaluated at every step of the traversal**. This
includes any traversal depths below the specified minimum depth, despite not
becoming part of the result. It also includes depth 0, which is the start vertex
and a `null` edge.

If you add prune conditions using the edge variable, make sure to account for
the edge at depth 0 being `null`, as it may accidentally stop the traversal
immediately. This may not be apparent due to the depth constraints.
{{< /info >}}

The following example shows a graph traversal starting at **London**, with a
traversal depth between 2 and 3, where every station is only visited once:

```aql
---
name: GRAPHTRAV_graphPruneExample7
description: ''
dataset: kShortestPathsGraph
---
FOR v, e, p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph"
  OPTIONS { uniqueVertices: "path" }
  RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime))
```

If you add prune conditions to stop the traversal if the station is **Glasgow**
or the travel time is less than some number, no results are returned. This is
even the case for a value of `2.5`, for which two paths exist that fulfill the
criterion – to **Cologne** and **Carlisle**:

```aql
---
name: GRAPHTRAV_graphPruneExample8
description: ''
dataset: kShortestPathsGraph
---
FOR v,e,p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph"
  PRUNE v.label == "Glasgow" OR e.travelTime < 2.5
  OPTIONS { uniqueVertices: "path" }
  RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime))
```

The problem is that `null`, `false`, and `true` are all less than any number (`< 2.5`)
because of AQL's [Type and value order](../fundamentals/type-and-value-order.md), and
because the edge at depth 0 is always `null`. The prune condition is accidentally
fulfilled at the start vertex, stopping the traversal too early. This similarly
happens if you check an edge attribute for inequality (`!=`) and compare it to a
string, for instance, which evaluates to `true` for the `null` value.

The depth at which a traversal is stopped by pruning is considered as a result,
but in the above example, the minimum depth of `2` filters the start vertex out.
If you lower the minimum depth to `0`, you get **London** as the sole result.
This confirms that the traversal stopped at the start vertex.

To avoid this problem, exclude the `null` value.
For example, you can use
`e.travelTime > 0 AND e.travelTime < 2.5`, but more generic solutions are to
exclude depth 0 from the check (`LENGTH(p.edges) > 0`) or to simply ignore the
`null` edge (`e != null`):

```aql
---
name: GRAPHTRAV_graphPruneExample9
description: ''
dataset: kShortestPathsGraph
---
FOR v,e,p IN 2..3 OUTBOUND "places/London" GRAPH "kShortestPathsGraph"
  PRUNE v.label == "Glasgow" OR (e != null AND e.travelTime < 2.5)
  OPTIONS { uniqueVertices: "path" }
  RETURN CONCAT_SEPARATOR(" -- ", INTERLEAVE(p.vertices[*].label, p.edges[*].travelTime))
```

{{< warning >}}
You can use AQL functions in prune expressions but only those that can be
executed on DB-Servers, regardless of your deployment mode. The following
functions cannot be used in the expression:
- `CALL()`
- `APPLY()`
- `DOCUMENT()`
- `V8()`
- `SCHEMA_GET()`
- `SCHEMA_VALIDATE()`
- `VERSION()`
- `COLLECTIONS()`
- `CURRENT_USER()`
- `CURRENT_DATABASE()`
- `COLLECTION_COUNT()`
- `NEAR()`
- `WITHIN()`
- `WITHIN_RECTANGLE()`
- `FULLTEXT()`
- [User-defined functions (UDFs)](../user-defined-functions.md)
{{< /warning >}}

## Using filters

All three variables emitted by the traversals can also be used in filter
statements. For some of these filter statements, the optimizer can detect that
it is possible to prune paths of traversals earlier; hence, filtered results
are not emitted to the variables in the first place. This may significantly
improve the performance of your query. Whenever a filter is not fulfilled,
the complete set of `vertex`, `edge` and `path` is skipped. All paths
with a length greater than the `max` depth are never computed.

Filter conditions that are `AND`-combined can be optimized, but `OR`-combined
conditions cannot.

### Filtering on paths

Filtering on paths allows for the second most powerful filtering and may have the
second highest impact on performance. Using the path variable, you can filter on
specific iteration depths. You can filter for absolute positions in the path
by specifying a positive number (which then qualifies for the optimizations),
or relative positions to the end of the path by specifying a negative number.
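
For instance, a relative position can be addressed with a negative index. The
following sketch keeps only paths whose last edge has `theTruth` equal to
`true` (using the same example graph as the queries below):

```aql
FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
  FILTER p.edges[-1].theTruth == true
  RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label }
```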
The resulting paths are up to 5 items long: - -```aql ---- -name: GRAPHTRAV_graphFilterEdges -description: '' -dataset: traversalGraph ---- -FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.edges[0].theTruth == true - RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label } -``` - -#### Filtering vertices on the path - -Similar to filtering the edges on the path, you can also filter the vertices: - -```aql ---- -name: GRAPHTRAV_graphFilterVertices -description: '' -dataset: traversalGraph ---- -FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.vertices[1]._key == "G" - RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label } -``` - -#### Combining several filters - -You can combine filters in any way you like: - -```aql ---- -name: GRAPHTRAV_graphFilterCombine -description: '' -dataset: traversalGraph ---- -FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.edges[0].theTruth == true - AND p.edges[1].theFalse == false - FILTER p.vertices[1]._key == "G" - RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label } -``` - -The query filters all paths where the first edge has the attribute -`theTruth` equal to `true`, the first vertex is `"G"` and the second edge has -the attribute `theFalse` equal to `false`. The resulting paths are up to -5 items long. - -**Note**: Despite the `min` depth of 1, this only returns results of -depth 2. This is because for all results in depth 1, the second edge does not -exist and hence cannot fulfill the condition here. - -#### Filter on the entire path - -With the help of array comparison operators filters can also be defined -on the entire path, like `ALL` edges should have `theTruth == true`: - -```aql ---- -name: GRAPHTRAV_graphFilterEntirePath -description: '' -dataset: traversalGraph ---- -FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.edges[*].theTruth ALL == true - RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label } -``` - -Or `NONE` of the edges should have `theTruth == true`: - -```aql ---- -name: GRAPHTRAV_graphFilterPathEdges -description: '' -dataset: traversalGraph ---- -FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.edges[*].theTruth NONE == true - RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label } -``` - -Both examples above are recognized by the optimizer and can potentially use other indexes -than the edge index. - -It is also possible to define that at least one edge on the path has to fulfill the condition: - -```aql ---- -name: GRAPHTRAV_graphFilterPathAnyEdge -description: '' -dataset: traversalGraph ---- -FOR v, e, p IN 1..5 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.edges[*].theTruth ANY == true - RETURN { vertices: p.vertices[*]._key, edges: p.edges[*].label } -``` - -It is guaranteed that at least one, but potentially more edges fulfill the condition. -All of the above filters can be defined on vertices in the exact same way. - -### Filtering on the path vs. filtering on vertices or edges - -Filtering on the path influences the Iteration on your graph. If certain conditions -aren't met, the traversal may stop continuing along this path. - -In contrast filters on vertex or edge only express whether you're interested in the actual value of these -documents. Thus, it influences the list of returned documents (if you return v or e) similar -as specifying a non-null `min` value. 
If you specify a `min` value of 2, the traversal over the first
-two nodes of these paths has to be executed - you just won't see them in your
-result array.
-
-Filters on vertices or edges are similar - the traverser has to walk along
-these nodes, since you may be interested in documents further down the path.
-
-### Examples
-
-Create a simple symmetric traversal demonstration graph:
-
-![traversal graph](../../../images/traversal_graph.png)
-
-```js
----
-name: GRAPHTRAV_01_create_graph
-description: ''
----
-~addIgnoreCollection("circles");
-~addIgnoreCollection("edges");
-var examples = require("@arangodb/graph-examples/example-graph");
-var graph = examples.loadGraph("traversalGraph");
-db.circles.toArray();
-db.edges.toArray();
-print("once you don't need them anymore, clean them up:");
-examples.dropGraph("traversalGraph");
-```
-
-To get started, we select the full graph. For a better overview, we only return
-the vertex IDs:
-
-```aql
----
-name: GRAPHTRAV_02_traverse_all_a
-description: ''
-dataset: traversalGraph
----
-FOR v IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-  RETURN v._key
-```
-
-```aql
----
-name: GRAPHTRAV_02_traverse_all_b
-description: ''
-dataset: traversalGraph
----
-FOR v IN 1..3 OUTBOUND 'circles/A' edges RETURN v._key
-```
-
-We can nicely see that it is heading for the first outer vertex, then goes back
-to the branch to descend into the next tree. After that, it returns to our start
-node to descend again. As we can see, both queries return the same result: the
-first one uses the named graph, the second uses the edge collections directly.
-
-Now we only want the elements of a specific depth (min = max = 2), the ones that
-are right behind the fork:
-
-```aql
----
-name: GRAPHTRAV_03_traverse_3a
-description: ''
-dataset: traversalGraph
----
-FOR v IN 2..2 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-  RETURN v._key
-```
-
-```aql
----
-name: GRAPHTRAV_03_traverse_3b
-description: ''
-dataset: traversalGraph
----
-FOR v IN 2 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-  RETURN v._key
-```
-
-As you can see, we can express this in two ways: with or without the `max` depth
-parameter.
-
-### Filter examples
-
-Now let's start to add some filters. We want to cut off the branch on the right
-side of the graph. We may filter in two ways:
-
-- we know the vertex at depth 1 has `_key` == `G`
-- we know the `label` attribute of the edge connecting **A** to **G** is `right_foo`
-
-```aql
----
-name: GRAPHTRAV_04_traverse_4a
-description: ''
-dataset: traversalGraph
----
-FOR v, e, p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-  FILTER p.vertices[1]._key != 'G'
-  RETURN v._key
-```
-
-```aql
----
-name: GRAPHTRAV_04_traverse_4b
-description: ''
-dataset: traversalGraph
----
-FOR v, e, p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-  FILTER p.edges[0].label != 'right_foo'
-  RETURN v._key
-```
-
-As we can see, all vertices behind **G** are skipped in both queries.
-The first filters on the vertex `_key`, the second on an edge label.
-Note again: as soon as a filter is not fulfilled for any of the three elements
-`v`, `e`, or `p`, the complete set of these is excluded from the result.
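-
-The same effect can also be achieved with the array comparison operators
-described above. The following query is merely an illustrative variant (it is
-not part of the rendered example set), and it is only equivalent here because
-**G** can solely occur at depth 1 when starting the traversal at **A**:
-
-```aql
-FOR v, e, p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-  FILTER p.vertices[*]._key NONE == 'G'
-  RETURN v._key
-```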
- -We also may combine several filters, for instance to filter out the right branch -(**G**), and the **E** branch: - -```aql ---- -name: GRAPHTRAV_05_traverse_5a -description: '' -dataset: traversalGraph ---- -FOR v,e,p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.vertices[1]._key != 'G' - FILTER p.edges[1].label != 'left_blub' - RETURN v._key -``` - -```aql ---- -name: GRAPHTRAV_05_traverse_5b -description: '' -dataset: traversalGraph ---- -FOR v,e,p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.vertices[1]._key != 'G' AND p.edges[1].label != 'left_blub' - RETURN v._key -``` - -As you can see, combining two `FILTER` statements with an `AND` has the same result. - -## Comparing OUTBOUND / INBOUND / ANY - -All our previous examples traversed the graph in `OUTBOUND` edge direction. -You may however want to also traverse in reverse direction (`INBOUND`) or -both (`ANY`). Since `circles/A` only has outbound edges, we start our queries -from `circles/E`: - -```aql ---- -name: GRAPHTRAV_06_traverse_6a -description: '' -dataset: traversalGraph ---- -FOR v IN 1..3 OUTBOUND 'circles/E' GRAPH 'traversalGraph' - RETURN v._key -``` - -```aql ---- -name: GRAPHTRAV_06_traverse_6b -description: '' -dataset: traversalGraph ---- -FOR v IN 1..3 INBOUND 'circles/E' GRAPH 'traversalGraph' - RETURN v._key -``` - -```aql ---- -name: GRAPHTRAV_06_traverse_6c -description: '' -dataset: traversalGraph ---- -FOR v IN 1..3 ANY 'circles/E' GRAPH 'traversalGraph' - RETURN v._key -``` - -The first traversal only walks in the forward (`OUTBOUND`) direction. -Therefore from **E** we only can see **F**. Walking in reverse direction -(`INBOUND`), we see the path to **A**: **B** → **A**. - -Walking in forward and reverse direction (`ANY`) we can see a more diverse result. -First of all, we see the simple paths to **F** and **A**. However, these vertices -have edges in other directions and they are traversed. - -**Note**: The traverser may use identical edges multiple times. For instance, -if it walks from **E** to **F**, it continues to walk from **F** to **E** -using the same edge once again. Due to this, we see duplicate nodes in the result. - -Please note that the direction can't be passed in by a bind parameter. - -## Use the AQL explainer for optimizations - -Now let's have a look what the optimizer does behind the curtain and inspect -traversal queries using [the explainer](../execution-and-performance/query-optimization.md): - -```aql ---- -name: GRAPHTRAV_07_traverse_7 -description: '' -dataset: traversalGraph -explain: true ---- -FOR v,e,p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - LET localScopeVar = RAND() > 0.5 - FILTER p.edges[0].theTruth != localScopeVar - RETURN v._key -``` - -```aql ---- -name: GRAPHTRAV_07_traverse_8 -description: '' -dataset: traversalGraph -explain: true ---- -FOR v,e,p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph' - FILTER p.edges[0].label == 'right_foo' - RETURN v._key -``` - -We now see two queries: In one we add a `localScopeVar` variable, which is outside -the scope of the traversal itself - it is not known inside of the traverser. -Therefore, this filter can only be executed after the traversal, which may be -undesired in large graphs. The second query on the other hand only operates on the -path, and therefore this condition can be used during the execution of the traversal. -Paths that are filtered out by this condition won't be processed at all. 
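-
-You can also inspect a plan directly in arangosh with `db._explain()`. The
-following is a minimal sketch (assuming the `traversalGraph` example dataset is
-still loaded): if the filter condition appears inside the `TraversalNode`
-rather than as a separate `FilterNode`, it is applied while the traversal
-executes.
-
-```js
-// Inspect where the optimizer places the path filter condition.
-db._explain(`
-  FOR v, e, p IN 1..3 OUTBOUND 'circles/A' GRAPH 'traversalGraph'
-    FILTER p.edges[0].label == 'right_foo'
-    RETURN v._key
-`);
-```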
-
-And finally clean it up again:
-
-```js
----
-name: GRAPHTRAV_99_drop_graph
-description: ''
----
-~examples.loadGraph("traversalGraph");
-var examples = require("@arangodb/graph-examples/example-graph");
-examples.dropGraph("traversalGraph");
-```
-
-If this traversal is not powerful enough for your needs, for example, because
-you cannot describe your conditions as AQL filter statements, then you might
-want to have a look at the [edge collection methods](../../develop/javascript-api/@arangodb/collection-object.md#edge-documents)
-in the JavaScript API.
-
-Also see how to [combine graph traversals](../examples-and-query-patterns/traversals.md).
diff --git a/site/content/3.10/aql/how-to-invoke-aql/with-arangosh.md b/site/content/3.10/aql/how-to-invoke-aql/with-arangosh.md
deleted file mode 100644
index a2a7a53b53..0000000000
--- a/site/content/3.10/aql/how-to-invoke-aql/with-arangosh.md
+++ /dev/null
@@ -1,726 +0,0 @@
----
-title: Executing AQL queries from _arangosh_
-menuTitle: with arangosh
-weight: 5
-description: >-
-  How to run queries, set bind parameters, and obtain the resulting and
-  additional information using the JavaScript API
----
-In the ArangoDB shell, you can use the `db._query()` and `db._createStatement()`
-methods to execute AQL queries. This chapter also describes
-how to use bind parameters, counting, statistics, and cursors.
-
-## With `db._query()`
-
-`db._query(<queryString>) → cursor`
-
-You can execute queries with the `_query()` method of the `db` object.
-This runs the specified query in the context of the currently
-selected database and returns the query results in a cursor.
-You can print the results of the cursor using its `toArray()` method:
-
-```js
----
-name: 01_workWithAQL_all
-description: ''
----
-~addIgnoreCollection("mycollection")
-var coll = db._create("mycollection")
-var doc = db.mycollection.save({ _key: "testKey", Hello : "World" })
-db._query('FOR my IN mycollection RETURN my._key').toArray()
-```
-
-### `db._query()` bind parameters
-
-`db._query(<queryString>, <bindVars>) → cursor`
-
-To pass bind parameters into a query, you can specify a second argument when
-calling the `_query()` method:
-
-```js
----
-name: 02_workWithAQL_bindValues
-description: ''
----
-db._query('FOR c IN @@collection FILTER c._key == @key RETURN c._key', {
-  '@collection': 'mycollection',
-  'key': 'testKey'
-}).toArray();
-```
-
-### ES6 template strings
-
-`` aql`` ``
-
-It is also possible to use ES6 template strings for generating AQL queries. There is
-a template string generator function named `aql`.
-
-The following example demonstrates what the template string function generates:
-
-```js
----
-name: 02_workWithAQL_aqlTemplateString
-description: ''
----
-var key = 'testKey';
-aql`FOR c IN mycollection FILTER c._key == ${key} RETURN c._key`
-```
-
-The next example directly uses the generated result to execute a query:
-
-```js
----
-name: 02_workWithAQL_aqlQuery
-description: ''
----
-var key = 'testKey';
-db._query(
-  aql`FOR c IN mycollection FILTER c._key == ${key} RETURN c._key`
-).toArray();
-```
-
-Arbitrary JavaScript expressions can be used in queries that are generated with the
-`aql` template string generator. Collection objects are handled automatically:
-
-```js
----
-name: 02_workWithAQL_aqlCollectionQuery
-description: ''
----
-var key = 'testKey';
-db._query(aql`FOR doc IN ${ db.mycollection } RETURN doc`).toArray();
-```
-
-Note: data-modification AQL queries normally do not return a result unless the
-AQL query contains a `RETURN` operation at the top-level. Without a `RETURN`
-operation, the `toArray()` method returns an empty array.
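-
-For example, to get the inserted documents back from a data-modification
-query, you can end it with `RETURN NEW`. This is an illustrative query (not
-part of the rendered example set) against the `mycollection` collection
-created above:
-
-```js
-db._query(`
-  INSERT { _key: "anotherKey", Hello: "Again" } INTO mycollection
-  RETURN NEW
-`).toArray();
-```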
-
-### Statistics and extra information
-
-`cursor.getExtra() → queryInfo`
-
-It is always possible to retrieve statistics for a query with the `getExtra()` method:
-
-```js
----
-name: 03_workWithAQL_getExtra
-description: ''
----
-db._query(`
-  FOR i IN 1..100
-    INSERT { _key: CONCAT('test', TO_STRING(i)) } INTO mycollection
-`).getExtra();
-```
-
-The meaning of the statistics values is described in
-[Query statistics](../execution-and-performance/query-statistics.md).
-
-Query warnings are also reported here. If you design queries on the shell,
-be sure to check for warnings.
-
-### Main query options
-
-`db._query(<queryString>, <bindVars>, <mainOptions>, <subOptions>) → cursor`
-
-You can pass the main options as the third argument to `db._query()` if you
-also pass a fourth argument with the sub options (can be an empty object `{}`).
-
-#### `count`
-
-Whether the number of documents in the result set should be calculated on the
-server side and returned in the `count` attribute of the result. Calculating the
-`count` attribute might have a performance impact for some queries, so this
-option is turned off by default, and the count is only returned when requested.
-
-If enabled, you can get the count by calling the `count()` method of the cursor.
-You can also count the number of results on the client side, for example, using
-`cursor.toArray().length`.
-
-```js
----
-name: 02_workWithAQL_count
-description: ''
----
-var cursor = db._query(
-  'FOR i IN 1..42 RETURN i',
-  {},
-  { count: true },
-  {}
-);
-cursor.count();
-cursor.toArray().length;
-```
-
-#### `batchSize`
-
-The maximum number of result documents to be transferred from the server to the
-client in one roundtrip. If this attribute is not set, a server-controlled
-default value is used. A `batchSize` value of `0` is disallowed.
-
-```js
----
-name: 02_workWithAQL_batchSize
-description: ''
----
-db._query(
-  'FOR i IN 1..3 RETURN i',
-  {},
-  { batchSize: 2 },
-  {}
-).toArray(); // full result retrieved in two batches
-```
-
-#### `ttl`
-
-The time-to-live for the cursor (in seconds). If the result set is small enough
-(less than or equal to `batchSize`), then results are returned right away.
-Otherwise, they are stored in memory and are accessible via the cursor with
-respect to the `ttl`. The cursor is removed on the server automatically after
-the specified amount of time. This is useful to ensure garbage collection of
-cursors that are not fully fetched by clients. If not set, a server-defined
-value is used (default: 30 seconds).
-
-```js
----
-name: 02_workWithAQL_ttl
-description: ''
----
-db._query(
-  'FOR i IN 1..20 RETURN i',
-  {},
-  { ttl: 5, batchSize: 10 },
-  {}
-).toArray(); // Each batch needs to be fetched within 5 seconds
-```
-
-#### `cache`
-
-Whether the AQL query results cache shall be used. If set to `false`, then any
-query cache lookup is skipped for the query. If set to `true`, it leads to the
-query cache being checked for the query **if** the query cache mode is either
-set to `on` or `demand`.
-
-```js
----
-name: 02_workWithAQL_cache
-description: ''
----
-db._query(
-  'FOR i IN 1..20 RETURN i',
-  {},
-  { cache: true },
-  {}
-); // result may get taken from cache
-```
-
-#### `memoryLimit`
-
-To set a memory limit for the query, pass `options` to the `_query()` method.
-The memory limit specifies the maximum number of bytes that the query is
-allowed to use. 
When a single AQL query reaches the specified limit value,
-the query will be aborted with a *resource limit exceeded* exception. In a
-cluster, the memory accounting is done per shard, so the limit value is
-effectively a memory limit per query per shard.
-
-```js
----
-name: 02_workWithAQL_memoryLimit
-description: ''
----
-db._query(
-  'FOR i IN 1..100000 SORT i RETURN i',
-  {},
-  { memoryLimit: 100000 }
-).toArray(); // xpError(ERROR_RESOURCE_LIMIT)
-```
-
-If no memory limit is specified, then the server default value (controlled by
-the `--query.memory-limit` startup option) is used for restricting the maximum amount
-of memory the query can use. A memory limit value of `0` means that the maximum
-amount of memory for the query is not restricted.
-
-### Query sub options
-
-`db._query(<queryString>, <bindVars>, <subOptions>) → cursor`
-
-`db._query(<queryString>, <bindVars>, <mainOptions>, <subOptions>) → cursor`
-
-You can pass the sub options as the third argument to `db._query()` if you don't
-provide main options, or as the fourth argument if you do.
-
-#### `fullCount`
-
-If you set `fullCount` to `true` and if the query contains a `LIMIT` operation, then the
-result has an extra attribute with the sub-attributes `stats` and `fullCount`, like
-`{ ... , "extra": { "stats": { "fullCount": 123 } } }`. The `fullCount` attribute
-contains the number of documents in the result before the last top-level `LIMIT` in the
-query was applied. It can be used to count the number of documents that match certain
-filter criteria, but only return a subset of them, in one go. It is thus similar to
-MySQL's `SQL_CALC_FOUND_ROWS` hint. Note that setting the option disables a few
-`LIMIT` optimizations and may lead to more documents being processed, and thus make
-queries run longer. Also note that the `fullCount` attribute may only be present in the
-result if the query has a top-level `LIMIT` operation and the `LIMIT` operation
-is actually used in the query.
-
-#### `failOnWarning`
-
-If you set `failOnWarning` to `true`, this makes the query throw an exception and
-abort in case a warning occurs. You should use this option in development to catch
-errors early. If set to `false`, warnings don't propagate to exceptions and are
-returned with the query results. There is also a `--query.fail-on-warning`
-startup option for setting the default value for `failOnWarning`, so that you
-don't need to set it on a per-query level.
-
-#### `cache`
-
-If you set `cache` to `true`, this puts the query result into the query result cache
-if the query result is eligible for caching and the query cache is running in demand
-mode. If set to `false`, the query result is not inserted into the query result
-cache. Note that query results are never inserted into the query result cache if
-the query result cache is disabled, and that they are automatically inserted into
-the query result cache if it is active in non-demand mode.
-
-#### `fillBlockCache`
-
-If you set `fillBlockCache` to `true` or do not specify it, this makes the query store
-the data it reads via the RocksDB storage engine in the RocksDB block cache. This is
-usually the desired behavior. You can set the option to `false` for queries that are
-known to either read a lot of data that would thrash the block cache, or for queries
-that read data known to be outside of the hot set. By setting the option
-to `false`, data read by the query does not make it into the RocksDB block cache if
-it is not already in there, thus leaving more room for the actual hot set.
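-
-As a quick sketch of how the sub options described in this section are passed
-in practice (an illustrative query with empty bind parameters and main options,
-not part of the rendered example set):
-
-```js
-var cursor = db._query(
-  'FOR i IN 1..1000 LIMIT 10 RETURN i',
-  {},  // bind parameters
-  {},  // main options
-  { fullCount: true, failOnWarning: true }
-);
-cursor.getExtra().stats.fullCount;  // 1000
-```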
- -#### `profile` - -If you set `profile` to `true` or `1`, extra timing information is returned for the query. -The timing information is accessible via the `getExtra()` method of the query -result. If set to `2`, the query includes execution statistics per query plan -execution node in `stats.nodes` sub-attribute of the `extra` return attribute. -Additionally, the query plan is returned in the `extra.plan` sub-attribute. - -#### `maxWarningCount` - -The `maxWarningCount` option limits the number of warnings that are returned by the query if -`failOnWarning` is not set to `true`. The default value is `10`. - -#### `maxNumberOfPlans` - -The `maxNumberOfPlans` option limits the number of query execution plans the optimizer -creates at most. Reducing the number of query execution plans may speed up query plan -creation and optimization for complex queries, but normally there is no need to adjust -this value. - -#### `optimizer` - -Options related to the query optimizer. - -- `rules`: A list of to-be-included or to-be-excluded optimizer rules can be put into - this attribute, telling the optimizer to include or exclude specific rules. To disable - a rule, prefix its name with a `-`, to enable a rule, prefix it with a `+`. There is also - a pseudo-rule `all`, which matches all optimizer rules. `-all` disables all rules. - -#### `stream` - -Set `stream` to `true` to execute the query in a **streaming** fashion. -The query result is not stored on the server, but calculated on the fly. - -{{< warning >}} -Long-running queries need to hold the collection locks for as long as the query -cursor exists. It is advisable to **only** use this option on short-running -queries **or** without exclusive locks. -{{< /warning >}} - -If set to `false`, the query is executed right away in its entirety. -In that case, the query results are either returned right away (if the result -set is small enough), or stored on the arangod instance and can be accessed -via the cursor API. - -The default value is `false`. - -{{< info >}} -The query options `cache`, `count` and `fullCount` don't work on streaming -queries. Additionally, query statistics, profiling data, and warnings are only -available after the query has finished and are delivered as part of the last batch. -{{< /info >}} - -#### `maxRuntime` - -The query has to be executed within the given runtime or it is killed. -The value is specified in seconds. The default value is `0.0` (no timeout). - -#### `maxNodesPerCallstack` - -The number of execution nodes in the query plan after -that stack splitting is performed to avoid a potential stack overflow. -Defaults to the configured value of the startup option -`--query.max-nodes-per-callstack`. - -This option is only useful for testing and debugging and normally does not need -any adjustment. - -#### `maxTransactionSize` - -The transaction size limit in bytes. - -#### `intermediateCommitSize` - -The maximum total size of operations after which an intermediate -commit is performed automatically. - -#### `intermediateCommitCount` - -The maximum number of operations after which an intermediate -commit is performed automatically. - -#### `spillOverThresholdMemoryUsage` - -Introduced in: v3.10.0 - -This option allows queries to store intermediate and final results temporarily -on disk if the amount of memory used (in bytes) exceeds the specified value. -This is used for decreasing the memory usage during the query execution. 
-
-This option only has an effect on queries that use the `SORT` operation but no
-`LIMIT`, and if you enable the spillover feature by setting a path for the
-directory to store the temporary data in with the
-[`--temp.intermediate-results-path` startup option](../../components/arangodb-server/options.md#--tempintermediate-results-path).
-
-Default value: 128MB.
-
-{{< info >}}
-Spilling data from RAM onto disk is an experimental feature and is turned off
-by default. The query results are still built up entirely in RAM on Coordinators
-and single servers for non-streaming queries. To avoid the buildup of
-the entire query result in RAM, use a streaming query (see the
-[`stream`](#stream) option).
-{{< /info >}}
-
-#### `spillOverThresholdNumRows`
-
-Introduced in: v3.10.0
-
-This option allows queries to store intermediate and final results temporarily
-on disk if the number of rows produced by the query exceeds the specified value.
-This is used for decreasing the memory usage during the query execution. In a
-query that iterates over a collection that contains documents, each row is a
-document, and in a query that iterates over temporary values
-(e.g. `FOR i IN 1..100`), each row is one such temporary value.
-
-This option only has an effect on queries that use the `SORT` operation but no
-`LIMIT`, and if you enable the spillover feature by setting a path for the
-directory to store the temporary data in with the
-[`--temp.intermediate-results-path` startup option](../../components/arangodb-server/options.md#--tempintermediate-results-path).
-
-Default value: `5000000` rows.
-
-{{< info >}}
-Spilling data from RAM onto disk is an experimental feature and is turned off
-by default. The query results are still built up entirely in RAM on Coordinators
-and single servers for non-streaming queries. To avoid the buildup of
-the entire query result in RAM, use a streaming query (see the
-[`stream`](#stream) option).
-{{< /info >}}
-
-#### `allowDirtyReads`
-
-{{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}}
-
-Introduced in: v3.10.0
-
-If you set this option to `true` and execute the query against a cluster
-deployment, then the Coordinator is allowed to read from any shard replica and
-not only from the leader. See [Read from followers](../../develop/http-api/documents.md#read-from-followers)
-for details.
-
-#### `skipInaccessibleCollections`
-
-{{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}}
-
-Let AQL queries (especially graph traversals) treat collections to which a
-user has **no access** rights as if these collections were empty.
-Instead of returning a *forbidden access* error, your queries execute normally.
-This is intended to help with certain use cases: A graph contains several collections
-and different users execute AQL queries on that graph. You can naturally limit the
-accessible results by changing the access rights of users on collections.
-
-#### `satelliteSyncWait`
-
-{{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}}
-
-Configure how much time a DB-Server has to bring the SatelliteCollections
-involved in the query into sync. The default value is `60.0` seconds.
-When the maximal time is reached, the query is stopped.
-
-## With `db._createStatement()` (ArangoStatement)
-
-The `_query()` method is a shorthand for creating an `ArangoStatement` object,
-executing it and iterating over the resulting cursor. 
If more control over the
-result set iteration is needed, it is recommended to first create an
-`ArangoStatement` object as follows:
-
-```js
----
-name: 04_workWithAQL_statements1
-description: ''
----
-stmt = db._createStatement( { "query": "FOR i IN [ 1, 2 ] RETURN i * 2" } );
-```
-
-To execute the query, use the `execute()` method of the _statement_ object:
-
-```js
----
-name: 05_workWithAQL_statements2
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ 1, 2 ] RETURN i * 2" } );
-cursor = stmt.execute();
-```
-
-You can pass a number to the `execute()` method to specify a batch size value.
-The server returns at most this many results in one roundtrip.
-The batch size cannot be adjusted after the query is first executed.
-
-**Note**: There is no need to explicitly call the `execute()` method if another
-means of fetching the query results is chosen. The following two approaches
-lead to the same result:
-
-```js
----
-name: executeQueryNoBatchSize
-description: ''
----
-~db._create("users");
-~db.users.save({ name: "Gerhard" });
-~db.users.save({ name: "Helmut" });
-~db.users.save({ name: "Angela" });
-var result = db.users.all().toArray();
-print(result);
-
-var q = db._query("FOR x IN users RETURN x");
-result = [ ];
-while (q.hasNext()) {
-  result.push(q.next());
-}
-print(result);
-~db._drop("users")
-```
-
-The following two alternatives both use a batch size and return the same
-result:
-
-```js
----
-name: executeQueryBatchSize
-description: ''
----
-~db._create("users");
-~db.users.save({ name: "Gerhard" });
-~db.users.save({ name: "Helmut" });
-~db.users.save({ name: "Angela" });
-var result = [ ];
-var q = db.users.all();
-q.execute(1);
-while (q.hasNext()) {
-  result.push(q.next());
-}
-print(result);
-
-result = [ ];
-q = db._query("FOR x IN users RETURN x", {}, { batchSize: 1 });
-while (q.hasNext()) {
-  result.push(q.next());
-}
-print(result);
-~db._drop("users")
-```
-
-### Cursors
-
-Once the query has been executed, the query results are available in a cursor.
-The cursor can return all its results at once using the `toArray()` method.
-This is a shortcut that you can use if you want to access the full result
-set without iterating over it yourself.
-
-```js
----
-name: 05_workWithAQL_statements3
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ 1, 2 ] RETURN i * 2" } );
-~var cursor = stmt.execute();
-cursor.toArray();
-```
-
-Cursors can also be used to iterate over the result set document-by-document.
-To do so, use the `hasNext()` and `next()` methods of the cursor:
-
-```js
----
-name: 05_workWithAQL_statements4
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ 1, 2 ] RETURN i * 2" } );
-~var c = stmt.execute();
-while (c.hasNext()) {
-  require("@arangodb").print(c.next());
-}
-```
-
-Please note that you can iterate over the results of a cursor only once, and that
-the cursor will be empty when you have fully iterated over it. To iterate over
-the results again, the query needs to be re-executed.
-
-Additionally, the iteration can be done in a forward-only fashion. There is no
-backwards iteration or random access to elements in a cursor.
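-
-A small sketch of this forward-only behavior (illustrative, not part of the
-rendered example set):
-
-```js
-var stmt = db._createStatement({ "query": "FOR i IN [ 1, 2, 3 ] RETURN i" });
-var cursor = stmt.execute();
-cursor.toArray();   // [ 1, 2, 3 ] - this fully iterates the cursor
-cursor.hasNext();   // false, the cursor is now exhausted
-cursor = stmt.execute();  // execute the statement again to iterate anew
-```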
-
-### ArangoStatement parameters binding
-
-To execute an AQL query using bind parameters, you need to create a statement first
-and then bind the parameters to it before execution:
-
-```js
----
-name: 05_workWithAQL_statements5
-description: ''
----
-var stmt = db._createStatement( { "query": "FOR i IN [ @one, @two ] RETURN i * 2" } );
-stmt.bind("one", 1);
-stmt.bind("two", 2);
-cursor = stmt.execute();
-```
-
-The cursor results can then be dumped or iterated over as usual, e.g.:
-
-```js
----
-name: 05_workWithAQL_statements6
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ @one, @two ] RETURN i * 2" } );
-~stmt.bind("one", 1);
-~stmt.bind("two", 2);
-~var cursor = stmt.execute();
-cursor.toArray();
-```
-
-or
-
-```js
----
-name: 05_workWithAQL_statements7
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ @one, @two ] RETURN i * 2" } );
-~stmt.bind("one", 1);
-~stmt.bind("two", 2);
-~var cursor = stmt.execute();
-while (cursor.hasNext()) {
-  require("@arangodb").print(cursor.next());
-}
-```
-
-Please note that bind parameters can also be passed into the `_createStatement()`
-method directly, making it a bit more convenient:
-
-```js
----
-name: 05_workWithAQL_statements8
-description: ''
----
-stmt = db._createStatement({
-  "query": "FOR i IN [ @one, @two ] RETURN i * 2",
-  "bindVars": {
-    "one": 1,
-    "two": 2
-  }
-});
-```
-
-### Counting with a cursor
-
-Cursors also optionally provide the total number of results. By default, they do not.
-To make the server return the total number of results, you may set the `count` attribute to
-`true` when creating a statement:
-
-```js
----
-name: 05_workWithAQL_statements9
-description: ''
----
-stmt = db._createStatement( {
-  "query": "FOR i IN [ 1, 2, 3, 4 ] RETURN i",
-  "count": true } );
-```
-
-After executing this query, you can use the `count` method of the cursor to get the
-total number of results from the result set:
-
-```js
----
-name: 05_workWithAQL_statements10
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ 1, 2, 3, 4 ] RETURN i", "count": true } );
-var cursor = stmt.execute();
-cursor.count();
-```
-
-Please note that the `count` method returns nothing if you did not specify the `count`
-attribute when creating the query.
-
-This is intentional so that the server may apply optimizations when executing the query and
-construct the result set incrementally. Incremental creation of the result sets
-is not possible if all of the results need to be shipped to the client anyway. Therefore, the client
-has the choice to specify `count` and retrieve the total number of results for a query (and
-disable potential incremental result set creation on the server), or to not retrieve the total
-number of results and allow the server to apply optimizations.
-
-Please note that at the moment the server will always create the full result set for each query, so
-specifying or omitting the `count` attribute currently does not have any impact on query execution.
-This may change in the future. Future versions of ArangoDB may create result sets incrementally
-on the server side and may be able to apply optimizations if a result set is not fully fetched by
-a client.
-
-### Using cursors to obtain additional information on internal timings
-
-Cursors can also optionally provide statistics of the internal execution phases. By default, they do not.
-To get to know how long parsing, optimization, instantiation, and execution took,
-make the server return this information by setting the `profile` attribute to
-`true` when creating a statement:
-
-```js
----
-name: 06_workWithAQL_statements11
-description: ''
----
-stmt = db._createStatement({
-  query: "FOR i IN [ 1, 2, 3, 4 ] RETURN i",
-  options: {"profile": true}});
-```
-
-After executing this query, you can use the `getExtra()` method of the cursor to get the
-produced statistics:
-
-```js
----
-name: 06_workWithAQL_statements12
-description: ''
----
-~var stmt = db._createStatement( { "query": "FOR i IN [ 1, 2, 3, 4 ] RETURN i", options: {"profile": true}} );
-var cursor = stmt.execute();
-cursor.getExtra();
-```
-
-## Query validation with `db._parse()`
-
-The `_parse()` method of the `db` object can be used to parse and validate a
-query syntactically, without actually executing it.
-
-```js
----
-name: 06_workWithAQL_statements13
-description: ''
----
-db._parse( "FOR i IN [ 1, 2 ] RETURN i" );
-```
diff --git a/site/content/3.10/arangograph/_index.md b/site/content/3.10/arangograph/_index.md
deleted file mode 100644
index 9ba6efedf4..0000000000
--- a/site/content/3.10/arangograph/_index.md
+++ /dev/null
@@ -1,38 +0,0 @@
----
-title: ArangoGraph Insights Platform
-menuTitle: ArangoGraph
-weight: 65
-description: >-
-  The ArangoGraph Insights Platform provides the entire functionality of
-  ArangoDB as a service, without the need to run or manage databases yourself
-aliases:
-  - arangograph/changelog
---- 
-The [ArangoGraph Insights Platform](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic),
-formerly called Oasis, provides ArangoDB databases as a service (DBaaS).
-It enables you to use the entire functionality of an ArangoDB cluster
-deployment without the need to run or manage the system yourself.
-
-The ArangoGraph Insights Platform...
-
-- runs your databases in data centers of the cloud provider
-  of your choice: Google Cloud Platform (GCP), Amazon Web Services (AWS),
-  Microsoft Azure. This optimizes performance and reduces cost.
-
-- ensures that your databases are always available and
-  healthy by monitoring them 24/7.
-
-- ensures that your databases are kept up to date by
-  installing new versions without service interruption.
-
-- ensures that your data is safe by providing encryption &
-  audit logs and making frequent data backups.
-
-- guarantees that your data always remains your property and
-  access to it is protected with industry standard safeguards.
-
-For more information, see
-[dashboard.arangodb.cloud](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic).
-
-For a quick start guide, see
-[Use ArangoDB in the Cloud](../get-started/set-up-a-cloud-instance.md).
diff --git a/site/content/3.10/arangograph/api/_index.md b/site/content/3.10/arangograph/api/_index.md
deleted file mode 100644
index ee4f21371f..0000000000
--- a/site/content/3.10/arangograph/api/_index.md
+++ /dev/null
@@ -1,37 +0,0 @@
----
-title: The ArangoGraph API
-menuTitle: ArangoGraph API
-weight: 60
-description: >-
-  Interface to control all resources inside ArangoGraph in a scriptable manner
-aliases:
-  - arangograph-api
----
-The [ArangoGraph Insights Platform](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic)
-comes with its own API. This API enables you to control all
-resources inside ArangoGraph in a scriptable manner. 
Typical use cases are spinning -up ArangoGraph deployments during continuous integration and infrastructure as code. - -The ArangoGraph API… - -- is a well-specified API that uses - [Protocol Buffers](https://developers.google.com/protocol-buffers/) - as interface definition and [gRPC](https://grpc.io/) as - underlying protocol. - -- allows for automatic generation of clients for a large list of languages. - A Go client is available out of the box. - -- uses API keys for authentication. API keys impersonate a user and inherit - the permissions of that user. - -- is also available as a command-line tool called [oasisctl](../oasisctl/_index.md). - -- is also available as a - [Terraform plugin](https://github.com/arangodb-managed/terraform-provider-oasis/). - This plugin makes integration of ArangoGraph in infrastructure as code projects - very simple. To learn more, refer to the [plugin documentation](https://registry.terraform.io/providers/arangodb-managed/oasis/latest/docs). - -Also see: -- [github.com/arangodb-managed/apis](https://github.com/arangodb-managed/apis/) -- [API definitions](https://arangodb-managed.github.io/apis/index.html) diff --git a/site/content/3.10/arangograph/api/get-started.md b/site/content/3.10/arangograph/api/get-started.md deleted file mode 100644 index ee72c989a8..0000000000 --- a/site/content/3.10/arangograph/api/get-started.md +++ /dev/null @@ -1,481 +0,0 @@ ---- -title: Get started with the ArangoGraph API and Oasisctl -menuTitle: Get started with Oasisctl -weight: 10 -description: >- - A tutorial that guides you through the ArangoGraph API as well as the Oasisctl - command-line tool -aliases: - - ../arangograph-api/getting-started ---- -This tutorial shows you how to do the following: - -- Generate an API key and authenticate with Oasisctl -- View information related to your organizations, projects, and deployments -- Configure, create and delete a deployment - -With Oasisctl the general command structure is to execute commands such as: - -``` -oasisctl list deployments -``` - -This command lists all deployments available to the authenticated user and we -will explore it in more detail later. Most commands also have associated -`--flags` that are required or provide additional options, this aligns with the -interaction method for many command line utilities. If you aren’t already -familiar with this, follow along as there are many examples in this guide that -will familiarize you with this command structure and using flags, along with -how to use OasisCtl to access the ArangoGraph API. - -Note: A good rule of thumb for all variables, resource names, and identifiers -is to **assume they are all case sensitive**, when being used with Oasisctl. - -## API Authentication - -### Generating an API Key - -The first step to using the ArangoGraph API is to generate an API key. To generate a -key you will need to be signed into your account at -[dashboard.arangodb.cloud](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic). -Once you are signed in, hover over the profile icon in the top right corner. - -![Profile Icon](../../../images/arangograph-my-account-hover.png) - -Click _My API keys_. - -This will bring you to your API key management screen. From this screen you can -create, reject, and delete API keys. - -Click the _New API key_ button. - -![Blank API Screen](../../../images/arangograph-my-api-keys.png) - -The pop-up box that follows has a few options for customizing the access level -of this API key. 
- -The options you have available include: - -- Limit access to 1 organization or all organizations this user has access to -- Set an expiration time, specified in number of hours -- Limit key to read-only access - -Once you have configured the API key access options, you will be presented with -your API key ID and API key secret. It is very important that you capture the -API key secret before clicking the close button. There is no way to retrieve -the API key secret after closing this pop-up window. - -![API Secret Key](../../../images/arangograph-api-key-secret.png) - -Once you have securely stored your API key ID and secret, click close. - -That is all there is to setting up API access to your ArangoGraph organizations. - -### Authenticating with Oasisctl - -Now that you have API access it is time to login with Oasisctl. - -Running the Oasisctl utility without any arguments is the equivalent of -including the --help flag. This shows all of the top level commands available -and you can continue exploring each command by typing the command name -followed by the --help flag to see the options available for that command. - -Let’s start with doing that for the login command: - -```bash -oasisctl login --help -``` - -You should see an output similar to this: - -![login help output](../../../images/oasisctl-login-help.png) - -This shows two additional flags are available, aside from the help flag. - -- `--key-id` -- `--key-secret` - -These require the values we received when creating the API key. Once you run -this command you will receive an authentication token that can be used for the -remainder of the session. - -```bash -oasisctl login \ - --key-id cncApiKeyId \ - --key-secret 873-secret-key-id -``` - -Upon successful login you should receive an authentication token: - -![On Successful Login](../../../images/oasisctl-login-success.png) - -Depending on your environment, you could instead store this token for easier -access. For example: - -With Linux: - -```bash -export OASIS_TOKEN=$(oasisctl login --key-id cncApiKeyId --key-secret 873-secret-key-id) -``` - -Or Windows Powershell: - -```powershell -setx OASIS_TOKEN (oasisctl login --key-id cncApiKeyId --key-secret 873-secret-key-id) -``` - -In the coming sections you will see how to authenticate with this token when -using other commands that require authentication. - -## Viewing and Managing Organizations and Deployments - -### Format - -This section covers the basics of retrieving information from the ArangoGraph API. -Depending on the data you are requesting from the ArangoGraph API, being able to read -it in the command line can start to become difficult. To make text easier to -read for humans and your applications, Oasisctl offers two options for -formatting the data received: - -- Table -- JSON - -You can define the format of the data by supplying the `--format` flag along -with your preferred format, like so: - -```bash -oasisctl --format json -``` - -### Viewing Information with the List Command - -This section will cover the two main functions of retrieving data with the -ArangoGraph API. These are: - -- `list` - List resources -- `get` - Get information - -Before you can jump right into making new deployments you need to be aware of -what resources you have available. This is where the list command comes in. -List serves as a way to retrieve general information, you can see all of the -available list options by accessing its help output. 
- -```bash -oasisctl list --help -``` - -This should output a screen similar to: - -![List help output](../../../images/oasisctl-list-help.png) - -As you can see you can get information on anything you would need about your -ArangoGraph organizations, deployments, and access control. To start, let’s take a -look at a few examples of listing information and then getting more details on -our results. - -### List Organizations - -One of the first pieces of information you may be interested in is the -organizations you have access to. This is useful to know because most commands -require an explicit declaration of the organization you are interacting with. -To find this, use list to list your available organizations: - -```bash -oasisctl list organizations --format json -``` - -Once you have your available organizations you can refer to your desired -organization using its name or id. - -![List organizations output](../../../images/oasisctl-list-org.png) - -Note: You may also notice the url attribute, this is for internal use only and -should not be treated as a publicly accessible path. - -### List Projects - -Once you have the organization name that you wish to interact with, the next -step is to list the available projects within that organization. Do this by -following the same command structure as before and instead exchange -organizations for projects, this time providing the desired organization name -with the `--organization-id` flag. - -```bash -oasisctl list projects \ - --organization-id "ArangoGraph Organization" \ - --format json -``` - -This will return information on all projects that the authenticated user has -access to. - -![List projects output](../../../images/oasisctl-list-projects.png) - -### List Deployments - -Things start getting a bit more interesting with information related to -deployments. Now that you have obtained an organization iD and a project ID, -you can list all of the associated deployments for that project. - -```bash -oasisctl list deployments \ - --organization-id "ArangoGraph Organization" \ - --project-id "Getting Started with ArangoGraph" \ - --format json - ``` - -![List deployments output](../../../images/oasisctl-list-deployments.png) - -This provides some basic details for all of the deployments associated with the -project. Namely, it provides a deployment ID which we can use to start making -modifications to the deployment or to get more detailed information, with the -`get` command. - -### Using the Get Command - -In Oasisctl, you use the get command to obtain more detailed information about -any of your available resources. It follows the same command structure as the -previous commands but typically requires a bit more information. For example, -to get more information on a specific deployment means you need to know at -least: - -- Organization ID -- Project ID -- Deployment ID - -To get more information about our example deployment we would need to execute -the following command: - -```bash -oasisctl get deployment \ - --organization-id "ArangoGraph Organization" \ - --project-id "Getting Started with ArangoGraph" \ - --deployment-id "abc123DeploymentID" \ - --format json -``` - -This returns quite a bit more information about the deployment including more -detailed server information, the endpoint URL where you can access the web interface, -and optionally the root user password. 
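-
-Because every command accepts `--format json`, the output also lends itself to
-scripting. The following is a hypothetical snippet that extracts the ID of the
-first deployment with `jq` (the exact field names are an assumption here, so
-inspect the JSON output first):
-
-```bash
-# Hypothetical: requires jq and assumes the JSON output is an array of
-# objects that each have an "id" field.
-deployment_id=$(oasisctl list deployments \
-  --organization-id "ArangoGraph Organization" \
-  --project-id "Getting Started with ArangoGraph" \
-  --format json | jq -r '.[0].id')
-echo "$deployment_id"
-```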
- -![Get deployment details](../../../images/oasisctl-get-deployment.png) - -### Node Size ID - -We won’t be exploring every flag available for creating a deployment but it is -a good idea to explore the concept of the node size ID value. This is an -indicator that is unique to each provider (Google, Azure, AWS) and indicates -the CPU and memory. Depending on the provider and region this can also -determine the available disk sizes for your deployment. In other words, it is -pretty important to know which `node-size-id` your deployment will be using. - -The command you execute will determine on the available providers and regions -for your organization but here is an example command that lists the available -options in the US West region for the Google Cloud Platform: - -```bash -oasisctl list nodesizes \ - --organization-id "ArangoGraph Organization" \ - --provider-id "Google Cloud Platform" \ - --region-id gcp-us-west2 -``` - -The output you will see will be similar to this: - -![List node size id](../../../images/oasisctl-list-node-size-id.png) - -It is important to note that you can scale up with more disk size but you are -unable to scale down your deployment disk size. The only way to revert back to -a lower disk size is to destroy and recreate your deployment. - -Once you have decided what your starting deployment needs are you can reference -your decision with the Id value for the corresponding configuration. So, for -our example, we will be choosing the c4-a4 configuration. The availability and -options are different for each provider and region, so be sure to confirm the -node size options before creating a new deployment. - -### Challenge - -You can use this combination of listing and getting to obtain all of the -information you want for your ArangoGraph organizations. We only explored a few of -the commands available but you can explore them all within the utility by -utilizing the `--help` flag or you can see all of the available options -in the [documentation](../oasisctl/options.md). - -Something that might be useful practice before moving on is getting the rest -of the information that you need to create a deployment. Here are a list of -items that won’t have defaults available when you attempt to create your -first deployment and you will need to supply: - -- CA Certificate ID (name) -- IP Allowlist ID (id) (optional) -- Node Size ID (id) -- Node Disk Size (GB disk size dependent on Node Size ID) -- Organization ID (name) -- Project ID (name) -- Region ID (name) - -Try looking up that information to get more familiar with how to find -information with Oasisctl. When in doubt use the `--help` flag with any -command. - -## Creating Resources - -Now that you have seen how to obtain information about your available -resources, it’s time to start using those skills to start creating your own -deployment. To create resources with Oasisctl you use the create command. -To see all the possible options you can start with the following command: - -```bash -oasisctl create --help -``` - -![Create command help output](../../../images/oasisctl-create-help.png) - -### Create a Deployment - -To take a look at all of the options available when creating a deployment the -best place to start is with our trusty help command. - -```bash -oasisctl create deployment --help -``` - -![Create deployment help output](../../../images/oasisctl-create-deployment-help.png) - -As you can see there are a lot of default options but also a few that require -some knowledge of our pre-existing resources. 
Attempting to create a deployment
-without one of the required options will return an error indicating which value
-is missing or invalid.
-
-Once you have collected all of the necessary information, creating a deployment
-is simply a matter of supplying the values along with the appropriate flags.
-This command will create a deployment:
-
-```bash
-oasisctl create deployment \
-  --region-id gcp-us-west2 \
-  --node-size-id c4-a4 \
-  --node-disk-size 10 \
-  --version 3.9.2 \
-  --cacertificate-id OasisCert \
-  --organization-id "ArangoGraph Organization" \
-  --project-id "Getting Started with ArangoGraph" \
-  --name "First Oasisctl Deployment" \
-  --description "The first deployment created using the awesome Oasisctl utility!"
-```
-
-If everything went according to plan, you should see similar output:
-
-![Deployment created successfully](../../../images/oasisctl-create-first-deployment-success.png)
-
-### Wait on Deployment Status
-
-When you create a deployment, it begins the process of _bootstrapping_, which is
-getting the deployment ready for use. This should happen quickly; to see if
-it is ready for use, you can run the wait command using the ID of the newly
-created deployment, shown at the top of the information you received above.
-
-```bash
-oasisctl wait deployment \
-  --deployment-id hmkuedzw9oavvjmjdo0i
-```
-
-Once you receive a response of _Deployment Ready_, your deployment is indeed
-ready to use. You can get some new details by running the get command.
-
-```bash
-oasisctl get deployment \
-  --organization-id "ArangoGraph Organization" \
-  --deployment-id hmkuedzw9oavvjmjdo0i
-```
-
-![Get deployment bootstrap status](../../../images/oasisctl-get-first-deployment-bootstrapped.png)
-
-Once the deployment is ready, you will get two new pieces of information: the
-endpoint URL, and Bootstrapped-At, which indicates the time the deployment
-became available. If you would like to log in to the web interface to verify
-that your server is in fact up and running, you need to supply the
-`--show-root-password` flag along with the get command; this flag does not
-take a value.
-
-### The Update Command
-
-The inevitable time comes when something about your deployment must change, and
-this is where the update command comes in. You can use update to change a
-number of things, including the groups, policies, and roles for user access
-control. You can also update some of your deployment information or, for our
-situation, add an IP Allowlist if you didn’t add one during creation.
-
-There are, of course, many options available and it is always recommended to
-start with the --help flag to read about all of them.
-
-### Update a Deployment
-
-This section will show an example of how to update a deployment to use a
-pre-existing allowlist. To add an IP Allowlist after the fact, we are really
-just updating the IP Allowlist value, which is currently empty. In order to
-update the IP Allowlist of a deployment, you must create an allowlist and then
-you can simply reference its ID like so:
-
-```bash
-oasisctl update deployment \
-  --deployment-id hmkuedzw9oavvjmjdo0i \
-  --ipallowlist-id abc123AllowlistID
-```
-
-You should receive a response with the deployment information and an indication
-that the deployment was updated at the top.
-
-You can use the update command to update everything about your deployments as
-well. 
If you run: - -```bash -oasisctl update deployment --help -``` - -You will see the full list of options available that will allow you to scale -your deployment as needed. - -![Update deployment help output](../../../images/oasisctl-update-deployment-help.png) - -## Delete a Deployment - -There may come a day where you need to delete a resource. The process for this -follows right along with the conventions for the other commands detailed -throughout this guide. - -### The Delete Command - -For the final example in this guide we will delete the deployment that has -been created. This only requires the deployment ID and the permissions to -delete the deployment. - -```bash -oasisctl delete deployment \ - --deployment-id hmkuedzw9oavvjmjdo0i -``` - -Once the deployment has been deleted you can confirm it is gone by listing -your deployments. - -```bash -oasisctl list deployments \ - --organization-id "ArangoGraph Organization" \ - --format json -``` - -## Next Steps - -As promised, this guide covered the basics of using Oasisctl with the ArangoDB -API. While we primarily focused on viewing and managing deployments there is -also a lot more to explore, including: - -- Organization Invites Management -- Backups -- API Key Management -- Certificate Management -- User Access Control - -You can check out all these features and further details on the ones discussed -in this guide in the documentation. diff --git a/site/content/3.10/arangograph/backups.md b/site/content/3.10/arangograph/backups.md deleted file mode 100644 index e4adcd0a0e..0000000000 --- a/site/content/3.10/arangograph/backups.md +++ /dev/null @@ -1,172 +0,0 @@ ---- -title: Backups in ArangoGraph -menuTitle: Backups -weight: 50 -description: >- - You can manually create backups or use a backup policy to schedule periodic - backups, and both ways allow you to store your backups in multiple regions simultaneously ---- -## How to create backups - -To backup data in ArangoGraph for an ArangoDB installation, navigate to the -**Backups** section of your deployment created previously. - -![Backup ArangoDB](../../images/arangograph-backup-section.png) - -There are two ways to create backups. Create periodic backups using a -**Backup policy**, or create a backup manually. -Both ways allow you to create [backups in multiple regions](#multi-region-backups) -as well. - -### Periodic backups - -Periodic backups are created at a given schedule. To see when the new backup is -due, observe the schedule section. - -![Backup Policy schedule](../../images/arangograph-backup-policy-schedule.png) - -When a new deployment is created, a default **Backup policy** is created for it -as well. This policy creates backups every two hours. To edit this policy -(or any policy), highlight it in the row above and hit the pencil icon. - -![Edit Backup Policy](../../images/arangograph-edit-backup-policy.png) - -These backups are not automatically uploaded. To enable this, use the -**Upload backup to storage** option and choose a retention period that -specifies how long backups are retained after creation. - -If the **Upload backup to storage** option is enabled for a backup policy, -you can then create backups in different regions than the default one. -The regions where the default backup is copied are shown in the -**Additional regions** column in the **Policies** section. - -### Manual backups - -It's also possible to create a backup on demand. To do this, click **Back up now**. 
-
-![Back up Now](../../images/arangograph-back-up-now.png)
-
-![Back up Now Dialog](../../images/arangograph-back-up-now-dialog.png)
-
-If you want to manually copy a backup to a different region than the default
-one, first ensure that the **Upload backup to storage** option is enabled.
-Then, highlight the backup row and use the
-**Copy backup to a different region** button from the **Actions** column.
-
-The source backup ID from
-which the copy is created is displayed in the **Copied from Backup** column.
-
-![Copy backup to a different region](../../images/arangograph-copy-backup-different-region.png)
-
-![Multiple Backups](../../images/arangograph-multiple-backups.png)
-
-### Uploading backups
-
-By default, a backup is not uploaded to the cloud; instead, it remains on the
-servers of the deployment. To make a backup that is resilient against server
-(disk) failures, upload the backup to cloud storage.
-
-When the **Upload backup to cloud storage** option is enabled, the backup is
-preserved for a long time and does not occupy any disk space on the servers.
-This also allows copying the backup to different regions, which can be
-configured in the **Multiple region backup** section.
-
-Uploaded backups are
-required for [cloning](#how-to-clone-deployments-using-backups).
-
-#### Best practices for uploading backups
-
-When utilizing the **Upload backup to cloud storage** feature, a recommended
-approach is to implement a backup strategy that balances granularity and storage
-efficiency.
-
-One effective strategy involves combining backup intervals and
-retention periods. For instance, consider the following example:
-
-1. Perform a backup every 4 hours with a retention period of 24 hours. This
-   provides frequent snapshots of your data, allowing you to recover recent
-   changes.
-2. Perform a backup every day with a retention period of a week. Daily backups
-   offer a broader time range for recovery, enabling you to restore data from
-   any point within the past week.
-3. Perform a backup every week with a retention period of a month. Weekly
-   backups enable you to restore data from any week within the past month.
-4. Perform a backup every month with a retention period of a year. Monthly
-   backups provide a long-term perspective, enabling you to restore data from
-   any month within the past year.
-
-This backup strategy offers good granularity, providing multiple recovery
-options for different timeframes. By implementing this approach, you have a
-total number of backups that is considerably lower than with other
-alternatives, such as having hourly backups with a retention period of a year.
-
-## Multi-region backups
-
-Using the multi-region backup feature, you can store backups in multiple regions
-simultaneously, either manually or automatically as part of a **Backup policy**.
-If the region in which a backup was created becomes unavailable, the backup is
-still available in other regions, significantly improving reliability.
-
-Multiple region backup is only available when the
-**Upload backup to cloud storage** option is enabled.
-
-![Multiple Region Backup](../../images/arangograph-multi-region-backup.png)
-
-## How to restore backups
-
-To restore a database from a backup, highlight the desired backup and click the restore icon.
-
-{{< warning >}}
-All current data will be lost when restoring. To make sure that new data that
-has been inserted after the backup creation is also restored, create a new
-backup before using the **Restore Backup** feature.
-
-During the restore, the deployment is temporarily unavailable.
-{{< /warning >}}
-
-![Restore From Backup](../../images/arangograph-restore-from-backup.png)
-
-![Restore From Backup Dialog](../../images/arangograph-restore-from-backup-dialog.png)
-
-![Restore From Backup Status Pending](../../images/arangograph-restore-from-backup-status-pending.png)
-
-![Restore From Backup Status Restored](../../images/arangograph-restore-from-backup-status-restored.png)
-
-## How to clone deployments using backups
-
-Creating a deployment from a backup allows you to duplicate an existing
-deployment with all its data, for example, to create a test environment or to
-move to a different cloud provider or region within ArangoGraph.
-
-{{< info >}}
-This feature is only available if the backup you wish to clone has been
-uploaded to cloud storage.
-{{< /info >}}
-
-{{< info >}}
-The cloned deployment will have the exact same features as the previous
-deployment, including node size and model. The cloud provider and the region
-can stay the same, or you can select a different one.
-To restore a deployment as quickly as possible, it is recommended to create the
-new deployment in the same region where the backup resides to avoid cross-region
-data transfer.
-The data contained in the backup will be restored to this new deployment.
-
-The *root password* for this deployment will be different.
-{{< /info >}}
-
-1. Highlight the backup you wish to clone from and hit **Clone backup to new deployment**.
-
-   ![ArangoGraph Clone Deployment From Backup](../../images/arangograph-clone-deployment-from-backup.png)
-
-2. Choose whether the clone should be created using the current provider and in
-   the same region as the backup, or using a different provider, a different region,
-   or both.
-
-   ![ArangoGraph Clone Deployment Select Region](../../images/arangograph-clone-deployment-select.png)
-
-3. You are taken to the new deployment, which is being bootstrapped.
-
-   ![ArangoGraph Cloned Deployment](../../images/arangograph-cloned-deployment.png)
-
-This feature is also available through [oasisctl](oasisctl/_index.md).
diff --git a/site/content/3.10/arangograph/data-loader/_index.md b/site/content/3.10/arangograph/data-loader/_index.md
deleted file mode 100644
index 38f96ab442..0000000000
--- a/site/content/3.10/arangograph/data-loader/_index.md
+++ /dev/null
@@ -1,70 +0,0 @@
----
-title: Load your data into ArangoGraph
-menuTitle: Data Loader
-weight: 22
-description: >-
-  Load your data into ArangoGraph and transform it into richly-connected graph
-  structures, without needing to write any code or deploy any infrastructure
----
-
-ArangoGraph provides different ways of loading your data into the platform,
-based on your migration use case.
-
-## Transform data into a graph
-
-The ArangoGraph Data Loader allows you to transform existing data from CSV file
-formats into data that can be analyzed by the ArangoGraph platform.
-
-You provide your data in CSV format, a common format used for exports of data
-from various systems. Then, using a no-code editor, you can model the schema of
-this data and the relationships between them. This allows you to ingest your
-existing datasets into your ArangoGraph database, without the need for any
-development effort.
-
-You can get started in a few easy steps.
-
-{{< tabs "data-loader-steps" >}}
-
-{{< tab "1. Create database" >}}
-Choose an existing database or create a new one and enter a name for your new graph.
-{{< /tab >}}
-
-{{< tab "2. 
Add files" >}}
-Drag and drop your data files in CSV format.
-{{< /tab >}}
-
-{{< tab "3. Design your graph" >}}
-Model your graph schema by adding nodes and connecting them via edges.
-{{< /tab >}}
-
-{{< tab "4. Import data" >}}
-Once you are ready, save and start the import. The resulting graph is an
-[EnterpriseGraph](../../graphs/enterprisegraphs/_index.md) with its
-corresponding collections, available in your ArangoDB web interface.
-{{< /tab >}}
-
-{{< /tabs >}}
-
-Follow this [working example](../data-loader/example.md) to see how easy it is
-to transform existing data into a graph.
-
-## Import data to the cloud
-
-To import data from various files into collections **without creating a graph**,
-get the ArangoDB client tools for your operating system from the
-[download page](https://arangodb.com/download-major/).
-
-- To import data to ArangoGraph from an existing ArangoDB instance, see
-  [arangodump](../../components/tools/arangodump/) and
-  [arangorestore](../../components/tools/arangorestore/).
-- To import pre-existing data in JSON, CSV, or TSV format, see
-  [arangoimport](../../components/tools/arangoimport/).
-
-## How to access the Data Loader
-
-1. If you do not have a deployment yet, [create a deployment](../deployments/_index.md#how-to-create-a-new-deployment) first.
-2. Open the deployment you want to load data into.
-3. In the **Load Data** section, click the **Load your data** button.
-4. Select your migration use case.
-
-![ArangoGraph Data Loader Migration Use Cases](../../../images/arangograph-data-loader-migration-use-cases.png)
\ No newline at end of file
diff --git a/site/content/3.10/arangograph/data-loader/add-files.md b/site/content/3.10/arangograph/data-loader/add-files.md
deleted file mode 100644
index 114b588e40..0000000000
--- a/site/content/3.10/arangograph/data-loader/add-files.md
+++ /dev/null
@@ -1,59 +0,0 @@
----
-title: Add files into Data Loader
-menuTitle: Add files
-weight: 5
-description: >-
-  Provide your set of files in CSV format containing the data to be imported
----
-The Data Loader allows you to upload your data files in CSV format into
-ArangoGraph and then use these data sources to design a graph using the
-built-in graph designer.
-
-## Upload your files
-
-You can upload your CSV files in the following ways:
-
-- Drag and drop your files in the designated area.
-- Click the **Browse files** button and select the files you want to add.
-
-![ArangoGraph Data Loader Upload Files](../../../images/arangograph-data-loader-upload-files.png)
-
-You can either upload several files as a batch or add them individually.
-You can also add more files later on.
-After a file has been uploaded, you can expand it to preview both the header and
-the first row of data within the file.
-
-If you upload CSV files that do not contain any fields, they are not available
-for further manipulation.
-
-Once the files are uploaded, you can start [designing your graph](../data-loader/design-graph.md).
-
-### File formatting limitations
-
-Ensure that the files you upload are correctly formatted. Otherwise, errors may
-occur, the upload may fail, or the data may not be correctly mapped.
-
-The following restrictions and limitations apply:
-
-- The only supported file format is CSV. If you submit an invalid file format,
-  the upload of that specific file is prevented.
-- It is required that all CSV files have a header row. If you upload a file
-  without a header, the first row of data is treated as the header.
To avoid
-  losing the first row of the data, make sure to include headers in your files.
-- The CSV file should have unique header names. It is not possible to have two
-  columns with the same name within the same file.
-
-For more details, see the [File validation](../data-loader/import.md#file-validation) section.
-
-### Upload limits
-
-Note that there is a cumulative file upload limit of 1GB. This means that the
-combined size of all files you upload should not exceed 1GB. If the total size
-of the uploaded files surpasses this limit, the upload may not be successful.
-
-## Delete files
-
-You can remove uploaded files by clicking the **Delete file** button in the
-**Your files** panel. Please keep in mind that in order to delete a file,
-you must first remove all of its graph associations.
\ No newline at end of file
diff --git a/site/content/3.10/arangograph/data-loader/design-graph.md b/site/content/3.10/arangograph/data-loader/design-graph.md
deleted file mode 100644
index b1c5eaf3af..0000000000
--- a/site/content/3.10/arangograph/data-loader/design-graph.md
+++ /dev/null
@@ -1,68 +0,0 @@
----
-title: Design your graph
-menuTitle: Design graph
-weight: 10
-description: >-
-  Design your graph database schema using the integrated graph modeler in the Data Loader
----
-
-Based on the data you have uploaded, you can start designing your graph.
-The graph designer allows you to create a schema using nodes and edges.
-Once this is done, you can save and start the import. The resulting
-[EnterpriseGraph](../../graphs/enterprisegraphs/_index.md) and the
-corresponding collections are created in your ArangoDB database instance.
-
-## How to add a node
-
-Nodes are the main objects in your data model and include the attributes of the
-objects.
-
-1. To create a new node, click the **Add node** button.
-2. In the graph designer, click on the newly created node to view the **Node details**.
-3. In the **Node details** panel, fill in the following fields:
-   - For **Node label**, enter a name you want to use for the node.
-   - For **File**, select a file from the list to associate it with the node.
-   - For **Primary Identifier**, select a field from the list. This is used to
-     reference the nodes when you define relations with edges.
-   - For **File Headers**, select one or more attributes from the list.
-
-![ArangoGraph Data Loader Add Node](../../../images/arangograph-data-loader-add-node.png)
-
-## How to connect nodes
-
-Nodes can be connected by edges to express and categorize the relations between
-them. A relation always has a direction, going from one node to another. You can
-define this direction in the graph designer by dragging your cursor from one
-particular node to another.
-
-To connect two nodes, you can use the **Connect node(s)** button. Click on any
-node to self-reference it or drag it to connect it to another node. Alternatively,
-when you select a node, a plus sign appears, allowing you to directly add a
-new node with an edge.
-
-{{< tip >}}
-To quickly recenter your elements on the canvas, you can use the **Center View**
-button located in the bottom right corner. This brings your nodes and edges back
-into focus.
-{{< /tip >}}
-
-The edge needs to be associated with a file and must have a label. Note that a
-node and an edge cannot have the same label.
-
-Follow the steps below to add details to an edge.
-
-1. Click on an edge in the graph designer.
-2. 
In the **Edit Edge** panel, fill in the following fields:
-   - For **Edge label**, enter a name you want to use for the edge.
-   - For **Relation file**, select a file from the list to associate it with the edge.
-   - To define how the relation points from one node to another, select the
-     corresponding relation file header for both the origin file (`_from`) and the
-     destination file (`_to`).
-   - For **File Headers**, select one or more attributes from the list.
-
-![ArangoGraph Data Loader Edit Edge](../../../images/arangograph-data-loader-edit-edge.png)
-
-## How to delete elements
-
-To remove a node or an edge, simply select it in the graph designer and click the
-**Delete** icon.
\ No newline at end of file
diff --git a/site/content/3.10/arangograph/data-loader/example.md b/site/content/3.10/arangograph/data-loader/example.md
deleted file mode 100644
index 46fdd1b38e..0000000000
--- a/site/content/3.10/arangograph/data-loader/example.md
+++ /dev/null
@@ -1,103 +0,0 @@
----
-title: Data Loader Example
-menuTitle: Example
-weight: 20
-description: >-
-  Follow this complete working example to see how easy it is to transform existing
-  data into a graph and get insights from the connected entities
----
-To transform your data into a graph, you need to have CSV files with entities
-representing the nodes and a corresponding CSV file representing the edges.
-
-This example uses a sample dataset of two files, `airports.csv` and `flights.csv`.
-These files are used to create a graph showing flights arriving at and departing
-from various cities.
-You can download the files from [GitHub](https://github.com/arangodb/example-datasets/tree/master/Data%20Loader).
-
-The `airports.csv` file contains rows of airport entries, which become the nodes
-in your graph. The `flights.csv` file contains rows of flight entries, which
-become the edges connecting the nodes.
-
-The whole process can be broken down into these steps:
-
-1. **Database and graph setup**: Begin by choosing an existing database or
-   creating a new one, and enter a name for your new graph.
-2. **Add files**: Upload the CSV files to the Data Loader web interface. You can
-   simply drag and drop them or upload them through the file browser window.
-3. **Design graph**: Design your graph schema by adding nodes and edges and
-   mapping data from the uploaded files to them. This creates the corresponding
-   documents and collections for your graph.
-4. **Import data**: Import the data and start using your newly created
-   [EnterpriseGraph](../../graphs/enterprisegraphs/_index.md) and its
-   corresponding collections.
-
-## Step 1: Create a database and choose the graph name
-
-Start by creating a new database and adding a name for your graph.
-
-![Data Loader Example Step 1](../../../images/arangograph-data-loader-example-choose-names.png)
-
-## Step 2: Add files
-
-Upload your CSV files to the Data Loader web interface. You can drag and drop
-them or upload them via a file browser window.
-
-![Data Loader Example Step 2](../../../images/arangograph-data-loader-example-add-files.png)
-
-See also [Add files into Data Loader](../data-loader/add-files.md).
-
-## Step 3: Design graph schema
-
-Once the files are added, you can start designing the graph schema. This example
-uses a simple graph consisting of:
-- Two nodes (`origin_airport` and `destination_airport`)
-- One directed edge going from the origin airport to the destination one,
-  representing a flight
-
-Click **Add node** to create the nodes and connect them with edges.
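-
-To make the mapping steps below concrete, this is roughly what the beginning of
-the two files looks like. The `AirportID`, `source airport`, and
-`destination airport` headers are the ones used in this example; the remaining
-column names and the data rows are merely illustrative, and the actual files on
-GitHub contain more columns and rows:
-
-```csv
-AirportID,Name,City
-GKA,Goroka Airport,Goroka
-MAG,Madang Airport,Madang
-```
-
-```csv
-source airport,destination airport
-GKA,MAG
-```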
-
-Next, for each of the nodes and edges, you need to create a mapping to the
-corresponding file and headers.
-
-For nodes, the **Node label** is going to be a node collection name, and the
-**Primary identifier** is used to populate the `_key` attribute of documents.
-You can also select any additional headers to be included as document attributes.
-
-In this example, two node collections have been created (`origin_airport` and
-`destination_airport`), and the `AirportID` header is used to create the `_key`
-attribute for documents in both node collections. The header preview makes it
-easy to select the headers you want to use.
-
-![Data Loader Example Step 3 Nodes](../../../images/arangograph-data-loader-example-map-nodes.png)
-
-For edges, the **Edge label** is going to be an edge collection name. Then, you
-need to specify how the edges connect the nodes. You can do this by selecting the
-*from* and *to* nodes to give a direction to the edge.
-In this example, the `source airport` header has been selected as a source and
-the `destination airport` header as a target for the edge.
-
-![Data Loader Example Step 3 Edges](../../../images/arangograph-data-loader-example-map-edges.png)
-
-Note that the values of the source and target for the edge correspond to the
-**Primary identifier** (`_key` attribute) of the nodes. In this case, it is the
-airport code (e.g., `GKA`) used as the `_key` in the node documents and in the
-source and destination headers to configure the edges.
-
-See also [Design your graph in the Data Loader](../data-loader/design-graph.md).
-
-## Step 4: Import and see the resulting graph
-
-After all the mapping is done, all you need to do is click
-**Save and start import**. The report provides an overview of the files
-processed and the documents created, as well as a link to your new graph.
-See also [Start import](../data-loader/import.md).
-
-![Data Loader Example Step 4 See your new graph](../../../images/arangograph-data-loader-example-data-import.png)
-
-Finally, click **See your new graph** to open the ArangoDB web interface and
-explore your new collections and graph.
-
-![Data Loader Example Step 4 Resulting graph](../../../images/arangograph-data-loader-example-resulting-graph.png)
-
-Happy graphing!
\ No newline at end of file
diff --git a/site/content/3.10/arangograph/deployments/_index.md b/site/content/3.10/arangograph/deployments/_index.md
deleted file mode 100644
index b8dd98d490..0000000000
--- a/site/content/3.10/arangograph/deployments/_index.md
+++ /dev/null
@@ -1,301 +0,0 @@
----
-title: Deployments in ArangoGraph
-menuTitle: Deployments
-weight: 20
-description: >-
-  How to create and manage deployments in ArangoGraph
----
-An ArangoGraph deployment is an ArangoDB cluster or single server, configured
-as you choose.
-
-Each deployment belongs to a project, which in turn belongs to an organization.
-You can have any number of deployments under one project.
-
-**Organizations → Projects → Deployments**
-
-![ArangoGraph Deployments](../../../images/arangograph-deployments-page.png)
-
-## How to create a new deployment
-
-1. If you do not have a project yet,
-   [create a project](../projects.md#how-to-create-a-new-project) first.
-2. In the main navigation, click __Deployments__.
-3. Click the __New deployment__ button.
-4. Select the project you want to create the deployment for.
-5. Set up your deployment. The configuration options are described below.
-
-{{< info >}}
-Deployments contain exactly **one policy**.
Within that policy, you can define
-role bindings to regulate access control on a deployment level.
-{{< /info >}}
-
-### In the **General** section
-
-- Enter the __Name__ and optionally a __Short description__ for the deployment.
-- Select the __Provider__ and __Region__ of the provider.
-  {{< warning >}}
-  Once a deployment has been created, it is not possible to change the
-  provider and region anymore.
-  {{< /warning >}}
-
-![ArangoGraph New Deployment General](../../../images/arangograph-new-deployment-general.png)
-
-### In the **Sizing** section
-
-- Choose a __Model__ for the deployment:
-
-  - __OneShard__ deployments are suitable when your data set fits in a single node.
-    They are ideal for graph use cases. This model has a fixed number of 3 nodes.
-
-  - __Sharded__ deployments are suitable when your data set is larger than a single
-    node. The data is sharded across multiple nodes. You can select the
-    __Number of nodes__ for this deployment model. The more nodes you have, the
-    higher the replication factor can be.
-
-  - __Single Server__ deployments are suitable when you want to try out ArangoDB without
-    the need for high availability or scalability. The deployment contains a
-    single server only. Your data is not replicated, and your deployment can
-    be restarted at any time.
-
-- Select a __NODE SIZE__ from the list of available options. Each option is a
-  combination of vCPUs, memory, and disk space per node.
-
-![ArangoGraph New Deployment Sizing](../../../images/arangograph-new-deployment-sizing.png)
-
-### In the **Advanced** section
-
-- Select the __DB Version__.
-  If you don't know which DB version to select, use the version selected by default.
-- Select the desired __Support Plan__. Click the link below the field to get
-  more information about the different support plans.
-- In the __Certificate__ field:
-  - The default certificate created for your project is selected automatically.
-  - If you have no default certificate, or want to use a new certificate,
-    create a new certificate by typing the desired name for it and hitting
-    enter or clicking __Create "<name>"__ when done.
-  - Or, if you already have multiple certificates, select the desired one.
-- _Optional but strongly recommended:_ In the __IP allowlist__ field, select the
-  desired one in case you want to limit access to your deployment to certain
-  IP ranges. To create an allowlist, navigate to your project and select the
-  __IP allowlists__ tab. See [How to manage IP allowlists](../projects.md#how-to-manage-ip-allowlists)
-  for details.
-  {{< security >}}
-  For any kind of production deployment, it is strongly advised to use an IP allowlist.
-  {{< /security >}}
-- Select a __Deployment Profile__. Profile options are only available on request.
-
-![ArangoGraph New Deployment Advanced](../../../images/arangograph-new-deployment-advanced.png)
-
-### In the **Summary** panel
-
-1. Review the configuration, and if you're okay with the setup, press the
-   __Create deployment__ button.
-2. You are taken to the deployment overview page.
-   **Note:** Your deployment is being bootstrapped at that point. This process
-   takes a few minutes. Once the deployment is ready, you receive a confirmation
-   email.
-
-## How to access your deployment
-
-1. In the main navigation, click the __Dashboard__ icon and then click __Projects__.
-2. In the __Projects__ page, click the project for
-   which you created a deployment earlier.
-3. 
Alternatively, you can access your deployment by clicking __Deployments__ in the
-   dashboard navigation. This page shows all deployments from all projects.
-   Click the name of the deployment you want to view.
-4. For each deployment in your project, you see the status. While your new
-   deployment is being set up, it displays the __bootstrapping__ status.
-5. Press the __View__ button to show the deployment page.
-6. When a deployment displays a status of __OK__, you can access it.
-7. Click the __Open database UI__ button or the database UI link to open
-   the dashboard of your new ArangoDB deployment.
-
-At this point, your ArangoDB deployment is available for you to use. **Have fun!**
-
-If you have disabled the [auto-login option](#auto-login-to-database-ui) to the
-database web interface, you need to follow the additional steps outlined below
-to access your deployment:
-
-1. Click the copy icon next to the root password. This copies the deployment
-   root password to your clipboard. You can also click the view icon to unmask
-   the root password to see it.
-   {{< security >}}
-   Do not use the root username/password for everyday operations. It is recommended
-   to use them only to create other user accounts with appropriate permissions.
-   {{< /security >}}
-2. Click the __Open database UI__ button or the database UI link to open
-   the dashboard of your new ArangoDB deployment.
-3. In the __username__ field type `root`, and in the __password__ field paste the
-   password that you copied earlier.
-4. Press the __Login__ button.
-5. Press the __Select DB: \_system__ button.
-
-{{< info >}}
-Each deployment is accessible on two ports:
-
-- Port `8529` is the standard port recommended for use by web browsers.
-- Port `18529` is the alternate port that is recommended for use by automated services.
-
-The difference between these ports is the certificate used. If you enable
-__Use well-known certificate__, the certificate used on port `8529` is well-known
-and automatically accepted by most web browsers. The certificate used on port
-`18529` is a self-signed certificate. For securing automated services, the use of
-a self-signed certificate is recommended. Read more on the
-[Certificates](../security-and-access-control/x-509-certificates.md) page.
-{{< /info >}}
-
-## Password settings
-
-### How to enable the automatic root user password rotation
-
-Password rotation refers to changing passwords regularly - a security best
-practice to reduce the vulnerability to password-based attacks and exploits
-by limiting how long passwords are valid. The ArangoGraph Insights Platform
-can automatically change the `root` user password of an ArangoDB deployment
-periodically to improve security.
-
-1. Navigate to the __Deployment__ for which you want to enable automatic
-   password rotation for the root user.
-2. In the __Quick start__ section, click the button with the __gear__ icon next to the
-   __ROOT PASSWORD__.
-3. In the __Password Settings__ dialog, turn the automatic password rotation on
-   and click the __Confirm__ button.
-
-   ![ArangoGraph Deployment Password Rotation](../../images/arangograph-deployment-password-rotation.png)
-4. You can expand the __Root password__ panel to see when the password was
-   last rotated. The rotation takes place every three months.
-
-### Auto login to database UI
-
-ArangoGraph provides the ability to automatically log in to your database using
-your existing ArangoGraph credentials.
This not only provides a seamless
-experience, preventing you from having to manage multiple sets of credentials,
-but also improves the overall security of your database. As your credentials
-are shared between ArangoGraph and your database, you can benefit from
-end-to-end audit traceability for a given user, as well as integration with
-ArangoGraph SSO.
-
-You can enable this feature in the **Password Settings** dialog. Please note
-that it may take a few minutes to become active.
-Once enabled, you no longer have to fill in the `root` user and password of
-your ArangoDB deployment.
-
-{{< info >}}
-If you use the auto login feature with AWS
-[private endpoints](../deployments/private-endpoints.md), it is recommended
-to switch off the `custom DNS` setting.
-{{< /info >}}
-
-This feature can be disabled at any time. You may wish to consider explicitly
-disabling this feature in the following situations:
-- Your workflow requires you to access the database UI using different accounts
-  with differing permission sets, as you cannot switch database users when
-  automatic login is enabled.
-- You need to give individuals access to a database's UI without giving them
-  any access to ArangoGraph. Note, however, that it's possible to only give an
-  ArangoGraph user database UI access, without other ArangoGraph permissions.
-
-{{< warning >}}
-When the auto login feature is enabled, users cannot edit their permissions on
-the ArangoDB database web interface, as all permissions are managed by the
-ArangoGraph platform.
-{{< /warning >}}
-
-Before getting started, make sure you are signed in to ArangoGraph as a user
-with one of the following permissions in your project:
-- `data.deployment.full-access`
-- `data.deployment.read-only-access`
-
-Organization owners have these permissions enabled by default.
-The `deployment-full-access-user` and `deployment-read-only-user` roles, which
-contain these permissions, can also be granted to other members of the
-organization. See how to create a
-[role binding](../security-and-access-control/_index.md#how-to-view-edit-or-remove-role-bindings-of-a-policy).
-
-{{< warning >}}
-This feature is only available on port `443`.
-{{< /warning >}}
-
-## How to edit a deployment
-
-You can modify a deployment's configuration, including changing the ArangoDB
-version in use and the memory size, or even switching from
-a OneShard deployment to a Sharded one if your data set no longer fits in a
-single node.
-
-{{< tip >}}
-To edit an existing deployment, you must have the necessary set of permissions
-attached to your role. Read more about [roles and permissions](../security-and-access-control/_index.md#roles).
-{{< /tip >}}
-
-1. In the main navigation, click **Deployments** and select an existing
-   deployment from the list, or click **Projects**, select a project, and then
-   select a deployment.
-2. In the **Quick start** section, click the **Edit** button.
-3. In the **General** section, you can do the following:
-   - Change the deployment name
-   - Change the deployment description
-4. In the **Sizing** section, you can do the following:
-   - Change **OneShard** deployments into **Sharded** deployments. To do so,
-     select **Sharded** in the **Model** dropdown list. You can select the
-     number of nodes for your deployment. This can also be modified later on.
-     {{< warning >}}
-     You cannot switch from **Sharded** back to **OneShard**.
-     {{< /warning >}}
-   - Change **Single Server** deployments into **OneShard** or **Sharded** deployments.
-     {{< warning >}}
-     You cannot switch from **Sharded** or **OneShard** back to **Single Server**.
-     {{< /warning >}}
-   - Scale the node size up or down.
-     {{< warning >}}
-     When scaling the size up or down in AWS deployments, the new value gets locked
-     and cannot be changed again until the cloud provider rate limit is reset.
-     {{< /warning >}}
-5. In the **Advanced** section, you can do the following:
-   - Upgrade the ArangoDB version that is currently being used. See also
-     [Upgrades and Versioning](upgrades-and-versioning.md).
-   - Select a different certificate.
-   - Add or remove an IP allowlist.
-   - Select a deployment profile.
-6. All changes are reflected in the **Summary** panel. Review the new
-   configuration and click **Save changes**.
-
-## How to connect a driver to your deployment
-
-[ArangoDB drivers](../../develop/drivers/_index.md) allow you to use your ArangoGraph
-deployment as a database system for your applications. Drivers act as interfaces
-between different programming languages and ArangoDB, enabling you to
-connect to and manipulate ArangoDB deployments from within compiled programs
-or using scripting languages.
-
-To get started, open a deployment.
-In the **Quick start** section, click the **Connecting drivers** button and
-select your programming language. The code snippets provide examples of how to
-connect to your instance.
-
-{{< tip >}}
-Note that the ArangoGraph Insights Platform runs deployments in a cluster
-configuration. To achieve the best possible availability, your client
-application has to handle connection failures by retrying operations if needed.
-{{< /tip >}}
-
-![ArangoGraph Connecting Drivers Example](../../images/arangograph-connecting-drivers-example.png)
-
-## How to delete a deployment
-
-{{< danger >}}
-Deleting a deployment deletes all its data and backups.
-This operation is **irreversible**. Please proceed with caution.
-{{< /danger >}}
-
-1. In the main navigation, in the __Projects__ section, click the project that
-   holds the deployment you wish to delete.
-2. In the __Deployments__ page, click the deployment you wish to delete.
-3. Click the __Delete/Lock__ entry in the navigation.
-4. Click the __Delete deployment__ button.
-5. In the modal dialog, confirm the deletion by entering `Delete!` into the
-   designated text field.
-6. Confirm the deletion by pressing the __Yes__ button.
-7. You are taken back to the deployments page of the project.
-   The deployment being deleted displays the __Deleting__ status until it has
-   been successfully removed.
diff --git a/site/content/3.10/arangograph/deployments/private-endpoints.md b/site/content/3.10/arangograph/deployments/private-endpoints.md
deleted file mode 100644
index 39e42514fd..0000000000
--- a/site/content/3.10/arangograph/deployments/private-endpoints.md
+++ /dev/null
@@ -1,221 +0,0 @@
----
-title: Private endpoint deployments in ArangoGraph
-menuTitle: Private endpoints
-weight: 5
-description: >-
-  Use the private endpoint feature to isolate your deployments and increase
-  security
----
-This topic describes how to create a private endpoint deployment and
-securely deploy to various cloud providers such as Google Cloud Platform (GCP),
-Microsoft Azure, and Amazon Web Services (AWS). Follow the steps outlined below
-to get started.
-
-{{< tip >}}
-Private endpoints on Microsoft Azure can be cross-region; in AWS, they should be
-located in the same region.
-{{< /tip >}}
-
-{{< info >}}
-For more information about the certificates used for private endpoints, please
-refer to the [How to manage certificates](../security-and-access-control/x-509-certificates.md)
-section.
-{{< /info >}}
-
-## Google Cloud Platform (GCP)
-
-Google Cloud Platform (GCP) offers a feature called
-[Private Service Connect](https://cloud.google.com/vpc/docs/private-service-connect)
-that allows private consumption of services across VPC networks that belong to
-different groups, teams, projects, or organizations. You can publish and consume
-services using defined IP addresses that are internal to your VPC network.
-
-In ArangoGraph, you can
-[create a regular deployment](_index.md#how-to-create-a-new-deployment)
-and change it to a private endpoint deployment afterwards.
-
-Such a deployment is not reachable from the internet anymore, other than via
-the ArangoGraph dashboard to administer it. To revert to a public deployment,
-please contact support via **Request help** in the help menu.
-
-To configure a private endpoint for GCP, you need to provide your Google project
-names. ArangoGraph then configures a **Private Endpoint Service** that automatically
-connects to private endpoints that are created for those projects.
-
-After the creation of the **Private Endpoint Service**, you should receive a
-service attachment that you need during the creation of your private endpoint(s).
-
-1. Open the deployment you want to change.
-2. In the **Quick start** section, click the **Edit** button with an ellipsis (`…`)
-   icon.
-3. Click **Change to private endpoint** in the menu.
-   ![ArangoGraph Deployment Private Endpoint Menu](../../../images/arangograph-gcp-change.png)
-4. In the configuration wizard, click **Next** to enter your configuration details.
-5. Enter one or more Google project names. You can also add them later in the summary view.
-   Click **Next**.
-   ![ArangoGraph Deployment Private Endpoint Setup 2](../../../images/arangograph-gcp-private-endpoint.png)
-6. Configure custom DNS names. This step is optional and disabled by default.
-   Note that, once enabled, this setting is immutable and cannot be reverted.
-   Click **Next** to continue.
-   {{< info >}}
-   By default, your private endpoint is available to all VPCs that connect to it
-   at `https://<deployment-id>-pe.arangodb.cloud` with the
-   [well-known certificate](../security-and-access-control/x-509-certificates.md#well-known-x509-certificates).
-   If the custom DNS is enabled, you are responsible for the DNS of your
-   private endpoints.
-   {{< /info >}}
-   ![ArangoGraph Private Endpoint Custom DNS](../../../images/arangograph-gcp-custom-dns.png)
-7. Click **Confirm Settings** to change the deployment.
-8. Back in the **Overview** page, scroll down to the **Private Endpoint** section
-   that is now displayed to see the connection status and to change the
-   configuration.
-9. ArangoGraph configures a **Private Endpoint Service**. As soon as the
-   **Service Attachment** is ready, you can use it to configure the Private
-   Service Connect in your VPC.
-
-{{< tip >}}
-When you create a private endpoint in ArangoGraph, both endpoints (the regular
-one and the new private one) are available for two hours. During this time period,
-you can switch your application to the new private endpoint. After this period,
-the old endpoint is not available anymore.
-{{< /tip >}}
-
-## Microsoft Azure
-
-Microsoft Azure offers a feature called
-[Azure Private Link](https://docs.microsoft.com/en-us/azure/private-link)
-that allows you to limit communication between different Azure servers and
-services to Microsoft's backbone network without exposure to the internet.
-It can lower network latency and increase security.
-
-If you want to connect an ArangoGraph deployment running on Azure with other
-services you run on Azure using such a tunnel, then
-[create a regular deployment](_index.md#how-to-create-a-new-deployment)
-and change it to a private endpoint deployment afterwards.
-
-The deployment is not reachable from the internet anymore, other than via
-the ArangoGraph dashboard to administer it. To revert to a public deployment,
-please contact support via **Request help** in the help menu.
-
-1. Open the deployment you want to change.
-2. In the **Quick start** section, click the **Edit** button with an ellipsis (`…`)
-   icon.
-3. Click **Change to private endpoint** in the menu.
-   ![ArangoGraph Deployment Private Endpoint Menu](../../../images/arangograph-deployment-private-endpoint-menu.png)
-4. In the configuration wizard, click **Next** to enter your configuration details.
-5. Enter one or more Azure Subscription IDs (GUIDs). They cannot be
-   changed anymore once a connection has been established.
-   Proceed by clicking **Next**.
-   ![ArangoGraph Deployment Private Endpoint Setup 2](../../../images/arangograph-deployment-private-endpoint-setup2.png)
-6. Configure custom DNS names. This step is optional and disabled by default;
-   you can also add or change the names later from the summary view.
-   Click **Next** to continue.
-   {{< info >}}
-   When using custom DNS names on private endpoints running on Azure, you need
-   to use the [self-signed certificate](../security-and-access-control/x-509-certificates.md#self-signed-x509-certificates).
-   {{< /info >}}
-7. Click **Confirm Settings** to change the deployment.
-8. Back in the **Overview** page, scroll down to the **Private Endpoint** section
-   that is now displayed to see the connection status and to change the
-   configuration.
-9. ArangoGraph configures a **Private Endpoint Service**. As soon as the **Azure alias**
-   becomes available, you can copy it and then go to your Microsoft Azure portal
-   to create Private Endpoints using this alias. The number of established
-   **Connections** increases, and you can view the connection details by
-   clicking it.
-
-{{< tip >}}
-When you create a private endpoint in ArangoGraph, both endpoints (the regular
-one and the new private one) are available for two hours. During this time period,
-you can switch your application to the new private endpoint. After this period,
-the old endpoint is not available anymore.
-{{< /tip >}}
-
-## Amazon Web Services (AWS)
-
-AWS offers a feature called [AWS PrivateLink](https://aws.amazon.com/privatelink)
-that enables you to privately connect your Virtual Private Cloud (VPC) to
-services, without exposure to the internet. You can control the specific API
-endpoints, sites, and services that are reachable from your VPC.
-
-Amazon VPC allows you to launch AWS resources into a
-virtual network that you have defined. It closely resembles a traditional
-network that you would normally operate, with the benefits of using the AWS
-scalable infrastructure.
-
-In ArangoGraph, you can
-[create a regular deployment](_index.md#how-to-create-a-new-deployment) and change it
-to a private endpoint deployment afterwards.
-
-The ArangoDB private endpoint deployment is not exposed to the public internet
-anymore, other than via the ArangoGraph dashboard to administer it. To revert
-it to a public deployment, please contact the support team via **Request help**
-in the help menu.
-
-To configure a private endpoint for AWS, you need to provide the AWS principals related
-to your VPC. The ArangoGraph Insights Platform configures a **Private Endpoint Service**
-that automatically connects to private endpoints that are created in those principals.
-
-1. Open the deployment you want to change.
-2. In the **Quick start** section, click the **Edit** button with an ellipsis (`…`)
-   icon.
-3. Click **Change to private endpoint** in the menu.
-   ![ArangoGraph Deployment AWS Change to Private Endpoint](../../../images/arangograph-aws-change-to-private-endpoint.png)
-4. In the configuration wizard, click **Next** to enter your configuration details.
-5. Click **Add Principal** to start configuring the AWS principal(s).
-   You need to enter a valid account, which is your 12-digit AWS account ID.
-   Adding usernames or role names is optional. You can also
-   skip this step and add them later from the summary view.
-   {{< info >}}
-   Principals cannot be changed anymore once a connection has been established.
-   {{< /info >}}
-   {{< warning >}}
-   To verify your endpoint service in AWS, you must use the same principal as
-   configured in ArangoGraph. Otherwise, the service name cannot be verified.
-   {{< /warning >}}
-   ![ArangoGraph AWS Private Endpoint Configure Principals](../../../images/arangograph-aws-endpoint-configure-principals.png)
-6. Configure custom DNS names. This step is optional and disabled by default;
-   you can also add or change the names later from the summary view.
-   Click **Next** to continue.
-   {{< info >}}
-   By default, your private endpoint is available to all VPCs that connect to it
-   at `https://<deployment-id>-pe.arangodb.cloud` with the well-known certificate.
-   If the custom DNS is enabled, you are responsible for the DNS of your
-   private endpoints.
-   {{< /info >}}
-   ![ArangoGraph AWS Private Endpoint Alternate DNS](../../../images/arangograph-aws-private-endpoint-dns.png)
-7. Confirm that you want to use a private endpoint for your deployment by
-   clicking **Confirm Settings**.
-8. Back in the **Overview** page, scroll down to the **Private Endpoint** section
-   that is now displayed to see the connection status and change the
-   configuration, if needed.
-   ![ArangoGraph AWS Private Endpoint Overview](../../../images/arangograph-aws-private-endpoint-overview.png)
-   {{< info >}}
-   Note that
-   [Availability Zones](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-availability-zones)
-   are independently mapped for each AWS account. The physical location of a
-   zone may differ from one account to another. To coordinate
-   Availability Zones across AWS accounts, you must use the
-   [Availability Zone ID](https://docs.aws.amazon.com/ram/latest/userguide/working-with-az-ids.html).
-   {{< /info >}}
-
-   {{< tip >}}
-   To learn more or request help from the ArangoGraph support team, click **Help**
-   in the top right corner of the **Private Endpoint** section.
-   {{< /tip >}}
-9. ArangoGraph configures a **Private Endpoint Service**. As soon as this is available,
-   you can use it in the AWS portal to create an interface endpoint to connect
-   to your endpoint service.
For more details, see
-   [How to connect to an endpoint](https://docs.aws.amazon.com/vpc/latest/privatelink/create-endpoint-service.html#share-endpoint-service).
-
-{{< tip >}}
-To establish connectivity and enable traffic flow, make sure you add a route
-from the originating machine to the interface endpoint.
-{{< /tip >}}
-
-{{< tip >}}
-When you create a private endpoint in ArangoGraph, both endpoints (the regular
-one and the new private one) are available for two hours. During this time period,
-you can switch your application to the new private endpoint. After this period,
-the old endpoint is not available anymore.
-{{< /tip >}}
diff --git a/site/content/3.10/arangograph/migrate-to-the-cloud.md b/site/content/3.10/arangograph/migrate-to-the-cloud.md
deleted file mode 100644
index 8a3f4a9802..0000000000
--- a/site/content/3.10/arangograph/migrate-to-the-cloud.md
+++ /dev/null
@@ -1,259 +0,0 @@
----
-title: Cloud Migration Tool
-menuTitle: Migrate to the cloud
-weight: 30
-description: >-
-  Migrating data from bare metal servers to the cloud with minimal downtime
-draft: true
----
-The `arangosync-migration` tool allows you to easily move from on-premises to
-the cloud while ensuring a smooth transition with minimal downtime.
-Start the cloud migration, let the tool do the job and, at the same time,
-keep your local cluster up and running.
-
-Some of the key benefits of the cloud migration tool include:
-- Safety comes first - pre-checks and potential failures are carefully handled.
-- Your data is secure and fully encrypted.
-- Ease of use with a live migration while your local cluster is still in use.
-- Get access to what a cloud-based, fully managed service has to offer:
-  high availability and reliability, elastic scalability, and much more.
-
-## Downloading the tool
-
-The `arangosync-migration` tool is available to download for the following
-operating systems:
-
-**Linux**
-- [AMD64 (x86_64) architecture](https://download.arangodb.com/arangosync-migration/linux/amd64/arangosync-migration)
-- [ARM64 (AArch64) architecture](https://download.arangodb.com/arangosync-migration/linux/arm64/arangosync-migration)
-
-**macOS / Darwin**
-- [AMD64 (x86_64) architecture](https://download.arangodb.com/arangosync-migration/darwin/amd64/arangosync-migration)
-- [ARM64 (AArch64) architecture](https://download.arangodb.com/arangosync-migration/darwin/arm64/arangosync-migration)
-
-**Windows**
-- [AMD64 (x86_64) architecture](https://download.arangodb.com/arangosync-migration/windows/amd64/arangosync-migration.exe)
-- [ARM64 (AArch64) architecture](https://download.arangodb.com/arangosync-migration/windows/arm64/arangosync-migration.exe)
-
-For macOS as well as other Unix-based operating systems, run the following
-command to make sure you can execute the binary:
-
-```bash
-chmod 755 ./arangosync-migration
-```
-
-## Prerequisites
-
-Before getting started, make sure the following prerequisites are in place:
-
-- Go to the [ArangoGraph Insights Platform](https://dashboard.arangodb.cloud/home)
-  and sign in. If you don’t have an account yet, sign up to create one.
-
-- Generate an ArangoGraph API key and API secret. See a detailed guide on
-  [how to create an API key](api/set-up-a-connection.md#creating-an-api-key).
-
-{{< info >}}
-The cloud migration tool is only available for clusters.
-{{< /info >}}
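-
-The commands used throughout this guide read your ArangoGraph credentials from
-environment variables (see the
-[Supported environment variables](#supported-environment-variables) section
-below). As a minimal sketch, you can export them once per shell session; the
-values below are placeholders:
-
-```bash
-# Placeholder values - substitute the API key and secret generated above and
-# the ID of your target ArangoGraph deployment.
-export ARANGO_GRAPH_API_KEY="<api-key-id>"
-export ARANGO_GRAPH_API_SECRET="<api-key-secret>"
-export ARANGO_GRAPH_DEPLOYMENT_ID="<deployment-id>"
-```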
-
-### Setting up the target deployment in ArangoGraph
-
-Continue by [creating a new ArangoGraph deployment](deployments/_index.md#how-to-create-a-new-deployment)
-or choosing an existing one.
-
-The target deployment in ArangoGraph requires specific configuration rules to be
-set up before the migration can start:
-
-- **Configuration settings**: The target deployment must be compatible with the
-  source data cluster. This includes the ArangoDB version that is being used,
-  the DB-Server count, and disk space.
-- **Deployment region and cloud provider**: Choose the region closest to your
-  data cluster. This factor can speed up your migration to the cloud.
-
-After setting up your ArangoGraph deployment, wait for a few minutes for it to
-become fully operational.
-
-{{< info >}}
-Note that Developer mode deployments are not supported.
-{{< /info >}}
-
-## Running the migration tool
-
-The `arangosync-migration` tool provides a set of commands that allow you to:
-- start the migration process
-- check whether your source and target clusters are fully compatible
-- get the current status of the migration process
-- stop or abort the migration process
-- switch the local cluster to read-only mode
-
-### Starting the migration process
-
-To start the migration process, run the following command:
-
-```bash
-arangosync-migration start
-```
-The `start` command runs some pre-checks. Among other things, it measures
-the disk space that is occupied by your ArangoDB cluster. If you are using the
-same data volume for ArangoDB servers and other data as well, the measurements
-can be incorrect. Provide the `--source.ignore-metrics` option to overcome this.
-
-You also have the option of passing `--check-only` to run the checks without
-starting the actual migration. If specified, this checks whether your local
-cluster and target deployment are compatible without sending any data to
-ArangoGraph.
-
-Once the migration starts, the local cluster enters monitoring mode, and the
-synchronization status is displayed in real time. If you don't want to see the
-status, you can terminate this process, as the underlying agent process
-continues to work. If something goes wrong, restarting the same command restores
-the replication state.
-
-To restart the migration, first `stop` or `stop --abort` the migration. Then,
-start it again using the `start` command.
-
-{{< warning >}}
-Starting the migration creates a full copy of all data from the source cluster
-to the target deployment in ArangoGraph. All data that previously existed in the
-target deployment will be lost.
-{{< /warning >}}
-
-### During the migration
-
-The following takes place during an active migration:
-- The source data cluster remains usable.
-- The target deployment in ArangoGraph is switched to read-only mode.
-- Your root user password is not copied to the target deployment in ArangoGraph.
-  To get your root password, select the target deployment from the ArangoGraph
-  Dashboard and go to the **Overview** tab. All other users are fully synchronized.
-
-{{< warning >}}
-The migration tool increases the CPU and memory usage of the server you are
-running it on. Depending on your ArangoDB usage pattern, it may take a lot of CPU
-to handle the replication. You can stop the migration process anytime
-if you see any problems.
-{{< /warning >}}
-
-```bash
-./arangosync-migration start \
-    --source.endpoint=$COORDINATOR_ENDPOINT \
-    --source.jwt-secret=/path-to/jwt-secret.file \
-    --arango-graph.api-key=$ARANGO_GRAPH_API_KEY \
-    --arango-graph.api-secret=$ARANGO_GRAPH_API_SECRET \
-    --arango-graph.deployment-id=$ARANGO_GRAPH_DEPLOYMENT_ID
-```
-
-### How long does it take?
-
-The total time required to complete the migration depends on how much data you
-have and how often write operations are executed during the process.
-
-You can also track the progress by checking the **Migration status** section of
-your target deployment in the ArangoGraph dashboard.
-
-![ArangoGraph Cloud Migration Progress](../../images/arangograph-migration-agent.png)
-
-### Getting the current status
-
-To print the current status of the migration, run the following command:
-
-```bash
-./arangosync-migration status \
-    --arango-graph.api-key=$ARANGO_GRAPH_API_KEY \
-    --arango-graph.api-secret=$ARANGO_GRAPH_API_SECRET \
-    --arango-graph.deployment-id=$ARANGO_GRAPH_DEPLOYMENT_ID
-```
-
-You can also add the `--watch` option to start monitoring the status in real time.
-
-### Stopping the migration process
-
-The `arangosync-migration stop` command stops the migration and terminates
-the migration agent process.
-
-If replication is running normally, the command waits until all shards are
-in sync. The local cluster is then switched into read-only mode.
-After all shards are in sync and the migration has stopped, the target deployment
-is switched into the mode specified in the `--source.server-mode` option. If no
-option is specified, it defaults to the read/write mode.
-
-```bash
-./arangosync-migration stop \
-    --arango-graph.api-key=$ARANGO_GRAPH_API_KEY \
-    --arango-graph.api-secret=$ARANGO_GRAPH_API_SECRET \
-    --arango-graph.deployment-id=$ARANGO_GRAPH_DEPLOYMENT_ID
-```
-
-The additional `--abort` option is supported. If specified, the `stop` command
-no longer checks whether both deployments are in sync and stops all
-migration-related processes as soon as possible.
-
-### Switching the local cluster to read-only mode
-
-The `arangosync-migration set-server-mode` command allows you to switch
-[read-only mode](../develop/http-api/administration.md#set-the-server-mode-to-read-only-or-default)
-on and off for your local cluster.
-
-In read-only mode, all write operations fail with error code
-`1004` (ERROR_READ_ONLY).
-Creating or dropping databases and collections also fails with
-error code `11` (ERROR_FORBIDDEN).
-
-```bash
-./arangosync-migration set-server-mode \
-    --source.endpoint=$COORDINATOR_ENDPOINT \
-    --source.jwt-secret=/path-to/jwt-secret.file \
-    --source.server-mode=readonly
-```
-The `--source.server-mode` option allows you to specify the desired server mode.
-Allowed values are `readonly` or `default`.
-
-### Supported environment variables
-
-The `arangosync-migration` tool supports the following environment variables:
-
-- `$ARANGO_GRAPH_API_KEY`
-- `$ARANGO_GRAPH_API_SECRET`
-- `$ARANGO_GRAPH_DEPLOYMENT_ID`
-
-Using these environment variables is highly recommended to ensure a secure way
-of providing sensitive data to the application.
-
-### Restrictions and limitations
-
-When running the migration, ensure that your target deployment has at least the
-same amount of resources (CPU, RAM) as your cluster. Otherwise, the
-migration process might get stuck or require manual intervention. This is closely
-connected to the type of data you have and how it is distributed between shards
-and collections.
-
-In general, the most important parameters are:
-- Total number of leader shards
-- The amount of data in bytes per collection
-
-Both parameters can be retrieved from the ArangoDB Web Interface.
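-
-As a convenience, you can also collect both figures with a few calls to the
-standard collection HTTP API. The following is a sketch only: it assumes `jq`
-is installed, that `$COORDINATOR_ENDPOINT` points at a Coordinator of the
-source cluster as in the commands above, and that you authenticate as the
-`root` user with a `$PASSWORD` placeholder variable (adjust the authentication
-to your setup). The `numberOfShards` and `figures.documentsSize` fields are
-part of the regular collection API, but verify them against your server version:
-
-```bash
-for name in $(curl -s -u "root:$PASSWORD" "$COORDINATOR_ENDPOINT/_api/collection?excludeSystem=true" \
-    | jq -r '.result[].name'); do
-  # Number of shards configured for the collection (each shard has one leader)
-  shards=$(curl -s -u "root:$PASSWORD" "$COORDINATOR_ENDPOINT/_api/collection/$name/properties" \
-    | jq '.numberOfShards')
-  # Approximate size of the stored documents in bytes
-  size=$(curl -s -u "root:$PASSWORD" "$COORDINATOR_ENDPOINT/_api/collection/$name/figures" \
-    | jq '.figures.documentsSize')
-  echo "$name: $shards shard(s), $size bytes"
-done
-```
-
-The total number of leader shards is the sum of `numberOfShards` over all
-collections, as each shard has exactly one leader.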
-
-The `arangosync-migration` tool supports migrating large datasets of up to
-5 TB of data and 3800 leader shards, as well as collections as big as 250 GB.
-
-If you have any questions, please
-[reach out to us](https://www.arangodb.com/contact).
-
-## Cloud migration workflow for minimal downtime
-
-1. Download and start the `arangosync-migration` tool. The target deployment
-   is switched into read-only mode automatically.
-2. Wait until all shards are in sync. You can use the `status` or the `start`
-   command with the same parameters to track that.
-3. Optionally, when all shards are in sync, you can switch your applications
-   to use the endpoint of the ArangoGraph deployment, but note that it stays in
-   read-only mode until the migration process is fully completed.
-4. Stop the migration using the `stop` subcommand. The following steps are executed:
-   - The source data cluster is switched into read-only mode.
-   - It waits until all shards are synchronized.
-   - The target deployment is switched into default read/write mode.
-
-   {{< info >}}
-   If you switched the source data cluster into read-only mode,
-   you can switch it back to default (read/write) mode using the
-   `set-server-mode` subcommand.
-   {{< /info >}}
diff --git a/site/content/3.10/arangograph/monitoring-and-metrics.md b/site/content/3.10/arangograph/monitoring-and-metrics.md
deleted file mode 100644
index 2b9ede4b4a..0000000000
--- a/site/content/3.10/arangograph/monitoring-and-metrics.md
+++ /dev/null
@@ -1,137 +0,0 @@
----
-title: Monitoring & Metrics in ArangoGraph
-menuTitle: Monitoring & Metrics
-weight: 40
-description: >-
-  ArangoGraph provides various built-in tools and integrations to help you
-  monitor your deployment
----
-The ArangoGraph Insights Platform provides integrated charts, metrics, and logs
-to help you monitor your deployment. This allows you to track your deployment's
-performance, resource utilization, and overall status.
-
-The key features include:
-- **Built-in monitoring**: Get immediate access to monitoring capabilities for
-  your deployments without any additional setup.
-- **Chart-based metrics representation**: Visualize the usage of the DB-Servers
-  and Coordinators over a selected timeframe.
-- **Integration with Prometheus and Grafana**: Connect your metrics to Prometheus
-  and Grafana for in-depth visualization and analysis.
-
-To get started, select an existing deployment from within a project and
-click **Monitoring** in the navigation.
-
-![ArangoGraph Monitoring tab](../../images/arangograph-monitoring-tab.png)
-
-## Built-in monitoring and metrics
-
-### In the **Servers** section
-
-The **Servers** section offers an overview of the DB-Servers, Coordinators,
-and Agents used in your deployment. It provides essential details such as each
-server's ID and type, the running ArangoDB version, as well as their memory,
-CPU, and disk usage.
-
-If you need to restart a server, you can do so by using the
-**Gracefully restart this server** action button. This shuts down all services
-normally, allowing ongoing operations to finish gracefully before the restart
-occurs.
-
-Additionally, you can access detailed logs via the **Logs** button. This allows
-you to apply filters to obtain logs from all server types or select specific ones
-(e.g., only Coordinators or only DB-Servers) within a timeframe. To download the
-logs, click the **Save** button.
-
-![ArangoGraph Monitoring Servers](../../images/arangograph-monitoring-servers.png)
-
-### In the **Metrics** section
-
-The **Metrics** section displays a chart-based representation depicting the
-resource utilization of DB-Servers and Coordinators within a specified timeframe.
-
-You can select one or more DB-Servers and choose **CPU**, **Memory**, or **Disk**
-to visualize their respective usage. The search box enables you to easily find
-a server by its ID, which is particularly useful when you have a large number
-of servers and need to quickly find a particular one among many.
-
-Similarly, you can repeat the process for Coordinators to see the **CPU** and
-**Memory** usage.
-
-![Arangograph Monitoring Metrics Chart](../../images/arangograph-monitoring-metrics-chart.png)
-
-## Connect with Prometheus and Grafana
-
-The ArangoGraph Insights Platform provides metrics for each deployment in a
-[Prometheus](https://prometheus.io/)-compatible format.
-You can use these metrics to gather detailed insights into the current
-and previous states of your deployment.
-Once metrics are collected by Prometheus, you can inspect them using tools
-such as [Grafana](https://grafana.com/oss/grafana/).
-
-![ArangoGraph Connect Metrics Section](../../images/arangograph-connect-metrics-section.png)
-
-### Metrics tokens
-
-The **Metrics tokens** section allows you to create a new metrics token,
-which is required for connecting to Prometheus.
-
-1. To create a metrics token, click **New metrics token**.
-2. For **Name**, enter a name for the metrics token.
-3. Optionally, you can also enter a **Short description**.
-4. Select the **Lifetime** of the metrics token.
-5. Click **Create**.
-
-![ArangoGraph Metrics Tokens](../../images/arangograph-metrics-token.png)
-
-### How to connect Prometheus
-
-1. In the **Metrics** section, click **Connect Prometheus**.
-2. Create the `prometheus.yml` file with the following content:
-   ```yaml
-   global:
-     scrape_interval: 60s
-   scrape_configs:
-     - job_name: 'deployment'
-       bearer_token: '' # paste a metrics token (see above) here
-       scheme: 'https'
-       static_configs:
-         - targets: ['6775e7d48152.arangodb.cloud:8829']
-       tls_config:
-         insecure_skip_verify: true
-   ```
-3. Start Prometheus with the following command:
-   ```sh
-   docker run -d \
-     -p 9090:9090 -p 3000:3000 --name prometheus \
-     -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml:ro \
-     prom/prometheus
-   ```
-   {{< info >}}
-   This command also opens port 3000 for Grafana. In a production environment,
-   this port is not needed and should not be left open.
-   {{< /info >}}
-
-### How to connect Grafana
-
-1. Start Grafana with the following command:
-   ```sh
-   docker run -d \
-     --network container:prometheus \
-     grafana/grafana
-   ```
-2. Go to `localhost:3000` and log in with the following credentials:
-   - For username, enter *admin*.
-   - For password, enter *admin*.
-
-   {{< tip >}}
-   After the initial login, make sure to change your password.
-   {{< /tip >}}
-
-3. To add a data source, click **Add your first data source** and then do the following:
-   - Select **Prometheus**.
-   - For **HTTP URL**, enter `http://localhost:9090`.
-   - Click **Save & Test**.
-4. To add a dashboard, open the menu and click **Create** and then **Import**.
-5. Download the [Grafana dashboard for ArangoGraph](https://github.com/arangodb-managed/grafana-dashboards).
-6. Copy the contents of the `main.json` file into the **Import via panel json** field in Grafana.
-7. Click **Load**.
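-
-If no metrics show up, it can help to query the metrics endpoint directly,
-outside of Prometheus and Grafana. The following is a minimal sketch that
-assumes the scrape target from the configuration above, a metrics token in a
-hypothetical `$METRICS_TOKEN` variable, and Prometheus' default `/metrics`
-path:
-
-```sh
-# --insecure mirrors the insecure_skip_verify setting above.
-curl --insecure \
-  -H "Authorization: Bearer $METRICS_TOKEN" \
-  "https://6775e7d48152.arangodb.cloud:8829/metrics" | head
-```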
diff --git a/site/content/3.10/arangograph/my-account.md b/site/content/3.10/arangograph/my-account.md
deleted file mode 100644
index e79415060a..0000000000
--- a/site/content/3.10/arangograph/my-account.md
+++ /dev/null
@@ -1,171 +0,0 @@
----
-title: My Account in ArangoGraph
-menuTitle: My Account
-weight: 35
-description: >-
-  How to manage your user account, your organizations, and your API keys in ArangoGraph
----
-You can access information related to your account via the __User Toolbar__.
-The toolbar is in the top right corner of the ArangoGraph dashboard and
-accessible from every view. There are two elements:
-
-- __Question mark icon__: Help
-- __User icon__: My Account
-
-![ArangoGraph My Account](../../images/arangograph-my-account.png)
-
-## Overview
-
-### How to view my account
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __Overview__ in the __My account__ section.
-3. The __Overview__ displays your name, email address, company, and when the
-   account was created.
-
-### How to edit the profile of my account
-
-1. Hover over or click the user icon in the __User Toolbar__ in the top right corner.
-2. Click __Overview__ in the __My account__ section.
-3. Click the __Edit__ button.
-4. Change your personal information and __Save__.
-
-![ArangoGraph My Account Info](../../images/arangograph-my-account-info.png)
-
-## Organizations
-
-### How to view my organizations
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __My organizations__ in the __My account__ section.
-3. Your organizations are listed in a table.
-   Click the organization name or the eye icon in the __Actions__ column to
-   jump to the organization overview.
-
-### How to create a new organization
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __My organizations__ in the __My account__ section.
-3. Click the __New organization__ button.
-4. Enter a name and a description for the new organization and click the
-   __Create__ button.
-
-{{< info >}}
-The free-to-try tier is limited to a single organization.
-{{< /info >}}
-
-### How to delete an organization
-
-{{< danger >}}
-Removing an organization implies the deletion of projects and deployments.
-This operation cannot be undone and **all deployment data will be lost**.
-Please proceed with caution.
-{{< /danger >}}
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __My organizations__ in the __My account__ section.
-3. Click the __recycle bin__ icon in the __Actions__ column.
-4. Enter `Delete!` to confirm and click __Yes__.
-
-{{< info >}}
-If you are no longer a member of any organization, then a new organization is
-created for you when you log in again.
-{{< /info >}}
-
-![ArangoGraph New Organization](../../images/arangograph-new-org.png)
-
-## Invites
-
-Invitations are requests to join organizations. You can accept or reject
-pending invites.
-
-### How to create invites
-
-See [Users and Groups: How to add a new member to the organization](organizations/users-and-groups.md#how-to-add-a-new-member-to-the-organization)
-
-### How to respond to my invites
-
-#### I am not a member of an organization yet
-
-1. Once invited, you will receive an email asking to join your
-   ArangoGraph organization.
-   ![ArangoGraph Organization Invite Email](../../images/arangograph-org-invite-email.png)
-2. Click the __View my organization invite__ link in the email.
-   You will be asked to log in or to create a new account.
-3. To sign up for a new account, click the __Start Free__ button or the
-   __Sign up__ link in the header navigation.
-   ![ArangoGraph Homepage](../../images/arangograph-homepage.png)
-4. After successfully signing up, you will receive a verification email.
-5. Click the __Verify my email address__ link in the email. It takes you back
-   to the ArangoGraph Insights Platform site.
-   ![ArangoGraph Organization Invite Email Verify](../../images/arangograph-org-invite-email-verify.png)
-6. After successfully logging in, you can accept or reject the invite to
-   join your organization.
-   ![ArangoGraph Organization Invite](../../images/arangograph-org-invite.png)
-7. After accepting the invite, you become a member of your organization and
-   are granted access to the organization and its related projects and
-   deployments.
-
-#### I am already a member of an organization
-
-1. Once invited, you will receive an email asking to join your
-   ArangoGraph organization, as well as a notification in the ArangoGraph dashboard.
-2. Click the __View my organization invites__ link in the email, or hover over the
-   user icon in the top right corner of the dashboard and click
-   __My organization invites__.
-   ![ArangoGraph Organization Invite Notification](../../images/arangograph-org-invite-notification.png)
-3. On the __Invites__ tab of the __My account__ view, you can accept or reject
-   pending invitations, as well as see past invitations that you accepted or
-   rejected. Click the button with a checkmark icon to join the organization.
-   ![ArangoGraph Organization Invites Accept](../../images/arangograph-org-invites-accept.png)
-
-## API Keys
-
-API keys are authentication tokens intended to be used for scripting.
-They allow a script to authenticate on behalf of a user.
-
-An API key consists of a key and a secret. You need both to complete
-authentication.
-
-### How to view my API keys
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __My API keys__ in the __My account__ section.
-3. Information about the API keys is listed in the __My API keys__ section.
-
-![ArangoGraph My API keys](../../images/arangograph-my-api-keys.png)
-
-### How to create a new API key
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __My API keys__ in the __My account__ section.
-3. Click the __New API key__ button.
-4. Optionally limit the API key to a specific organization.
-5. Optionally, use the __Time to live__ field to specify after how many hours
-   the API key should expire.
-6. Optionally limit the API key to read-only APIs.
-7. Click the __Create__ button.
-8. Copy the API key ID and Secret, then click the __Close__ button.
-
-{{< security >}}
-The secret is only shown once at creation time.
-You have to store it in a safe place.
-{{< /security >}}
-
-![ArangoGraph New API key](../../images/arangograph-new-api-key.png)
-
-![ArangoGraph API key Secret](../../images/arangograph-api-key-secret.png)
-
-### How to revoke or delete an API key
-
-1. Hover over or click the user icon of the __User Toolbar__ in the top right corner.
-2. Click __My API keys__ in the __My account__ section.
-3. Click an icon in the __Actions__ column:
-   - __Counter-clockwise arrow__ icon: Revoke API key
-   - __Recycle bin__ icon: Delete API key
-4. Click the __Yes__ button to confirm.
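-
-For scripting, the key ID and secret are typically exchanged for an access
-token on the command line. The following is a minimal sketch using the
-`oasisctl` CLI; the environment variables holding the key ID and secret are
-hypothetical names for the values copied at creation time:
-
-```sh
-# Exchange the API key ID and secret for a short-lived access token.
-export OASIS_TOKEN=$(oasisctl login \
-  --key-id "$ARANGO_GRAPH_KEY_ID" \
-  --key-secret "$ARANGO_GRAPH_KEY_SECRET")
-
-# Subsequent oasisctl commands pick up OASIS_TOKEN from the environment:
-oasisctl list organizations
-```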
-
-{{% comment %}}
-TODO: Copy to clipboard button
-Access token that should expire after 1 hour unless renewed, might get removed as it's confusing.
-{{% /comment %}}
diff --git a/site/content/3.10/arangograph/notebooks.md b/site/content/3.10/arangograph/notebooks.md
deleted file mode 100644
index b581dc44d8..0000000000
--- a/site/content/3.10/arangograph/notebooks.md
+++ /dev/null
@@ -1,170 +0,0 @@
----
-title: ArangoGraph Notebooks
-menuTitle: Notebooks
-weight: 25
-description: >-
-  How to create and manage colocated Jupyter Notebooks within ArangoGraph
----
-{{< info >}}
-This documentation describes the beta version of the Notebooks feature and is
-subject to change. The beta version is free for all.
-{{< /info >}}
-
-The ArangoGraph Notebook is a JupyterLab notebook embedded in the ArangoGraph
-Insights Platform. The notebook integrates seamlessly with the platform,
-automatically connecting to ArangoGraph services, including ArangoDB and the
-ArangoML platform services. This makes it much easier to leverage these
-resources without having to download any data locally or to remember user IDs,
-passwords, and endpoint URLs.
-
-![ArangoGraph Notebooks Architecture](../../images/arangograph-notebooks-architecture.png)
-
-The ArangoGraph Notebook has built-in [ArangoGraph Magic Commands](#arangograph-magic-commands)
-that answer questions like:
-- What ArangoDB database am I connected to at the moment?
-- What data does the ArangoDB instance contain?
-- How can I access certain documents?
-- How do I create a graph?
-
-The ArangoGraph Notebook also pre-installs [python-arango](https://docs.python-arango.com/en/main/)
-and ArangoML connectors
-to [PyG](https://github.com/arangoml/pyg-adapter),
-[DGL](https://github.com/arangoml/dgl-adapter), and
-[CuGraph](https://github.com/arangoml/cugraph-adapter), as well as the
-[FastGraphML](https://github.com/arangoml/fastgraphml)
-library, so you can start accessing data in ArangoDB right away and develop
-GraphML models using your favorite GraphML libraries with GPUs.
-
-## How to create a new notebook
-
-1. Open the deployment in which you want to create the notebook.
-2. Go to the **Data science** section and click the **Create Notebook** button.
-3. Enter a name and optionally a description for your new notebook.
-4. Select a configuration model from the dropdown menu. Click **Save**.
-5. The notebook's phase is set to **Initializing**. Once the phase changes to
-   **Running**, the notebook's endpoint is accessible.
-6. Click the **Open notebook** button to access your notebook.
-7. To access your notebook, you need to be signed into ArangoGraph as a user with
-   the `notebook.notebook.execute` permission in your project. Organization
-   owners have this permission enabled by default. The `notebook-executor` role,
-   which contains the permission, can also be granted to other members of the
-   organization via roles. See how to create a
-   [role binding](security-and-access-control/_index.md#how-to-view-edit-or-remove-role-bindings-of-a-policy).
-
-{{< info >}}
-Depending on the tier your organization belongs to, different limitations apply:
-- On-Demand and Committed: you can create up to three notebooks per deployment.
-- Free-to-try: you can only create one notebook per deployment.
-{{< /info >}}
-
-![Notebooks](../../images/arangograph-notebooks.png)
-
-{{< info >}}
-Notebooks in beta version have a fixed configuration of 10 GB of disk size.
-{{< /info >}}
-
-## How to edit a notebook
-
-1. Select the notebook that you want to change from the **Notebooks** tab.
-2. Click **Edit notebook**. You can modify its name and description.
-3. To pause a notebook, click the **Pause notebook** button. You can resume it
-   at any time. The notebook's phase is updated accordingly.
-
-## How to delete a notebook
-
-1. Select the notebook that you want to remove from the **Notebooks** tab.
-2. Click the **Delete notebook** button.
-
-## Getting Started notebook
-
-To get a better understanding of how to interact with your ArangoDB database
-cluster, use the ArangoGraph Getting Started template.
-The ArangoGraph Notebook automatically connects to the ArangoDB service
-endpoint, so you can immediately start interacting with it.
-
-1. Log in to the notebook you have created by using your deployment's root password.
-2. Select the `GettingStarted.ipynb` template from the file browser.
-
-## ArangoGraph Magic Commands
-
-Below is a list of the available magic commands.
-Single-line commands have a `%` prefix and multi-line commands have a `%%` prefix.
-
-**Database Commands**
-
-- `%listDatabases` - lists the databases on the database server.
-- `%whichDatabase` - returns the database name you are connected to.
-- `%createDatabase databaseName` - creates a database.
-- `%selectDatabase databaseName` - selects a database as the current database.
-- `%useDatabase databasename` - uses a database as the current database;
-  alias for `%selectDatabase`.
-- `%getDatabase databaseName` - gets a database. Used for assigning a database,
-  e.g. `studentDB` = `%getDatabase student_database`.
-- `%deleteDatabase databaseName` - deletes the database.
-
-**Graph Commands**
-
-- `%listGraphs` - lists the graphs defined in the currently selected database.
-- `%whichGraph` - returns the graph name that is currently selected.
-- `%createGraph graphName` - creates a named graph.
-- `%selectGraph graphName` - selects the graph as the current graph.
-- `%useGraph graphName` - uses the graph as the current graph;
-  alias for `%selectGraph`.
-- `%getGraph graphName` - gets the graph for variable assignment,
-  e.g. `studentGraph` = `%getGraph student-graph`.
-- `%deleteGraph graphName` - deletes a graph.
-
-**Collection Commands**
-
-- `%listCollections` - lists the collections in the currently selected database.
-- `%whichCollection` - returns the collection name that is currently selected.
-- `%createCollection collectionName` - creates a collection.
-- `%selectCollection collectionName` - selects a collection as the current collection.
-- `%useCollection collectionName` - uses the collection as the current collection;
-  alias for `%selectCollection`.
-- `%getCollection collectionName` - gets a collection for variable assignment,
-  e.g. `student` = `%getCollection Student`.
-- `%createEdgeCollection` - creates an edge collection.
-- `%createVertexCollection` - creates a vertex collection.
-- `%createEdgeDefinition` - creates an edge definition.
-- `%deleteCollection collectionName` - deletes the collection.
-- `%truncateCollection collectionName` - truncates the collection.
-- `%sampleCollection collectionName` - returns a random document from the collection.
-  If no collection is specified, then it uses the currently selected collection.
-
-**Document Commands**
-
-- `%insertDocument jsonDocument` - inserts the document into the currently selected collection.
-- `%replaceDocument jsonDocument` - replaces the document in the currently selected collection.
-- `%updateDocument jsonDocument` - updates the document in the currently selected collection.
-- `%deleteDocument jsonDocument` - deletes the document from the currently selected collection.
-- `%%importBulk jsonDocumentArray` - imports an array of documents into the currently selected collection.
-
-**AQL Commands**
-
-- `%aql single-line_aql_query` - executes a single-line AQL query.
-- `%%aqlm multi-line_aql_query` - executes a multi-line AQL query.
-
-**Variables**
-
-- `_endpoint` - the endpoint (URL) of the ArangoDB server.
-- `_system` - the system database used for creating, listing, and deleting databases.
-- `_db` - the selected (current) database. To select a different database, use `%selectDatabase`.
-- `_graph` - the selected (current) graph. To select a different graph, use `%selectGraph`.
-- `_collection` - the selected (current) collection. To select a different collection, use `%selectCollection`.
-- `_user` - the current user.
-
-You can use these variables directly, for example, `_db.collections()` to list
-collections or `_system.databases` to list databases.
-
-You can also create your own variable assignments, such as:
-
-- `schoolDB` = `%getDatabase schoolDB`
-- `school_graph` = `%getGraph school_graph`
-- `student` = `%getCollection Student`
-
-**Reset environment**
-
-In the event that any of the above variables have been unintentionally changed,
-you can revert all of them to the default state with `reset_environment()`.
diff --git a/site/content/3.10/arangograph/organizations/_index.md b/site/content/3.10/arangograph/organizations/_index.md
deleted file mode 100644
index 85ee2c7656..0000000000
--- a/site/content/3.10/arangograph/organizations/_index.md
+++ /dev/null
@@ -1,111 +0,0 @@
----
-title: Organizations in ArangoGraph
-menuTitle: Organizations
-weight: 10
-description: >-
-  How to manage organizations and what type of packages ArangoGraph offers
----
-An ArangoGraph organization is a container for projects. An organization
-typically represents a (commercial) entity such as a company, a company division,
-an institution, or a non-profit organization.
-
-**Organizations → Projects → Deployments**
-
-Users can be members of one or more organizations. However, you can only be a
-member of one _Free-to-try_ tier organization at a time.
-
-## How to switch between my organizations
-
-1. The first entry in the main navigation (with a double arrow icon) indicates
-   the current organization.
-2. Click it to bring up a dropdown menu to select another organization of which
-   you are a member.
-3. The overview opens for the selected organization, showing the number of
-   projects, the tier, and when it was created.
-
-![ArangoGraph Organization Switcher](../../../images/arangograph-organization-switcher.png)
-
-![ArangoGraph Organization Overview](../../../images/arangograph-organization-overview.png)
-
-## ArangoGraph Packages
-
-With the ArangoGraph Insights Platform, your organization can choose one of the
-following packages.
-
-### Free Trial
-
-ArangoGraph comes with a free-to-try tier that lets you test ArangoGraph for
-free for 14 days. You can get started quickly, without needing to enter a
-credit card.
-
-The free trial gives you access to:
-- One small deployment (4GB) in a region of your choice for 14 days
-- Local backups
-- One ArangoGraph Notebook for learning and data science
-
-After the trial period, your deployment will be deleted automatically.
-
-### On-Demand
-
-Add a payment method to gain access to ArangoGraph's full feature set.
-Pay monthly via a credit card for what you actually use.
-
-This package unlocks all ArangoGraph functionality, including:
-- Multiple and larger deployments
-- Backups to cloud storage, with multi-region support
-- Enhanced security features such as Private Endpoints
-
-### Committed
-
-Commit up-front for a year and pay via the Sales team. This package provides
-the same flexibility as On-Demand, but at a lower price.
-
-In addition, you gain access to:
-- 24/7 Premium Support
-- ArangoDB Professional Services Engagements
-- Ability to transact via the AWS and GCP marketplaces
-
-To take advantage of this, you need to get in touch with the ArangoDB
-team. [Contact us](https://www.arangodb.com/contact/) for more details.
-
-## How to unlock all features
-
-You can unlock all features in ArangoGraph at any time by adding your billing
-details and a payment method. As soon as you have added a payment method, all
-ArangoGraph functionalities are immediately unlocked. From that point on, your
-deployments no longer expire and you can create more and larger deployments.
-
-See [Billing: How to add billing details / payment methods](billing.md)
-
-![ArangoGraph Billing](../../../images/arangograph-billing.png)
-
-## How to create a new organization
-
-See [My Account: How to create a new organization](../my-account.md#how-to-create-a-new-organization)
-
-## How to restrict access to an organization
-
-You can restrict access to an organization by specifying which authentication
-providers are accepted for users trying to access it. For more information,
-refer to the [Access Control](../security-and-access-control/_index.md#restricting-access-to-organizations) section.
-
-## How to delete the current organization
-
-{{< danger >}}
-Removing an organization implies the deletion of projects and deployments.
-This operation cannot be undone and **all deployment data will be lost**.
-Please proceed with caution.
-{{< /danger >}}
-
-1. Click **Overview** in the **Organization** section of the main navigation.
-2. Open the **Danger zone** tab.
-3. Click the **Delete organization** button.
-4. Enter `Delete!` to confirm and click **Yes**.
-
-{{< info >}}
-If you are no longer a member of any organization, then a new organization is
-created for you when you log in again.
-{{< /info >}}
-
-{{< tip >}}
-If the organization has a locked resource (a project or a deployment), you need
-to [unlock](../security-and-access-control/_index.md#locked-resources)
-that resource first to be able to delete the organization.
-{{< /tip >}}
diff --git a/site/content/3.10/arangograph/organizations/billing.md b/site/content/3.10/arangograph/organizations/billing.md
deleted file mode 100644
index 9b892b5500..0000000000
--- a/site/content/3.10/arangograph/organizations/billing.md
+++ /dev/null
@@ -1,36 +0,0 @@
----
-title: Billing in ArangoGraph
-menuTitle: Billing
-weight: 10
-description: >-
-  How to manage billing details and payment methods in ArangoGraph
----
-## How to add billing details
-
-1. In the main navigation menu, click the **Organization** icon.
-2. Click **Billing** in the **Organization** section.
-3. In the **Billing Details** section, click **Edit**.
-4. Enter your company name, billing address, and EU VAT identification number (if applicable).
-5. Optionally, enter the email address(es) to which invoices should be emailed
-   automatically.
-6. Click **Save**.
-
-![ArangoGraph Billing Details](../../../images/arangograph-billing-details.png)
-
-## How to add a payment method
-
-1. In the main navigation menu, click the **Organization** icon.
-2. Click **Billing** in the **Organization** section.
-3. In the **Payment methods** section, click **Add**.
-4. Fill out the form with your credit card details. Currently, a credit card is
-   the only available payment method.
-5. Click **Save**.
-
-![ArangoGraph Payment Method](../../../images/arangograph-add-payment-method-credit-card.png)
-
-{{% comment %}}
-TODO: Need screenshot with invoice
-
-### How to view invoices
-
-
-{{% /comment %}}
diff --git a/site/content/3.10/arangograph/organizations/credits-and-usage.md b/site/content/3.10/arangograph/organizations/credits-and-usage.md
deleted file mode 100644
index 34dafb8488..0000000000
--- a/site/content/3.10/arangograph/organizations/credits-and-usage.md
+++ /dev/null
@@ -1,147 +0,0 @@
----
-title: Credits & Usage in ArangoGraph
-menuTitle: Credits & Usage
-weight: 15
-description: >-
-  Credits give you access to a flexible prepaid model, so you can allocate them
-  across multiple deployments as needed
----
-{{< info >}}
-Credits are only available if your organization has signed up for
-ArangoGraph's [Committed](../organizations/_index.md#committed) package.
-{{< /info >}}
-
-The ArangoGraph credit model is a versatile prepaid model that allows you to
-purchase credits and use them in a flexible way, based on what you have running
-in ArangoGraph.
-
-Instead of purchasing a particular deployment for a year, you can purchase a
-number of ArangoGraph credits that expire a year after purchase. These credits
-are then consumed over that time period, based on the deployments you run
-in ArangoGraph.
-
-For example, a OneShard (three nodes) A64 deployment consumes more credits per
-hour than a smaller deployment such as A8. If you are running multiple deployments,
-such as pre-production environments or deployments for different use cases, each
-of them consumes from the same credit balance. However, if you are not running
-any deployments and do not have any backup storage, then none of your credits
-are consumed.
-
-{{< tip >}}
-To purchase credits for your organization, you need to get in touch with the
-ArangoDB team. [Contact us](https://www.arangodb.com/contact/) for more details.
-{{< /tip >}}
-
-There are a number of benefits that ArangoGraph credits provide:
-- **Adaptability**: The prepaid credit model allows you to adapt your usage to
-  changing project requirements or fluctuating workloads. By enabling the use of
-  credits for various instance types and sizes, you can easily adjust your
-  resource allocation.
-- **Efficient handling of resources**: With the ability to purchase credits in
-  advance, you can better align your needs in terms of resources and costs.
-  You can purchase credits in bulk and then allocate them as needed.
-- **Workload Optimization**: By having a clear view of credit consumption and
-  remaining balance, you can identify inefficiencies to further optimize your
-  infrastructure, resulting in cost savings and better performance.
-
-## How to view the credit usage
-
-1. In the main navigation, click the **Organization** icon.
-2. Click **Credits & Usage** in the **Organization** section.
-3. In the **Credits & Usage** page, you can:
-   - See the remaining credit balance.
-   - Track your total credit balance.
-   - See a projection of when you will run out of credits, based on the last 30 days of usage.
-   - Get a detailed consumption report in PDF format that shows:
-     - The number of credits you had at the start of the month.
-     - The number of credits consumed in the month.
-     - The number of credits remaining.
-     - The number of credits consumed for each deployment.
-
-![ArangoGraph Credits and Usage](../../../images/arangograph-credits-and-usage.png)
-
-## FAQs
-
-### Are there any configuration constraints for using the credits?
-
-No. Credits are designed to be used completely flexibly. You can use all of your
-credits for multiple small deployments (e.g. A8s) or you can use them for a single
-large deployment (e.g. A256), or even multiple large deployments, as long as you
-have enough credits remaining.
-
-### What is the flexibility of moving up or down in configuration size of the infrastructure?
-
-You can move up in configuration size at any point by editing your deployment
-within ArangoGraph. This is possible once every 6 hours to allow for in-place
-disk expansion.
-
-### Is there a limit to how many deployments I can use my credits on?
-
-There is no specific limit to the number of deployments you can use your credits
-on. The credit model is designed to provide you with the flexibility to allocate
-credits across multiple deployments as needed. This enables you to effectively
-manage and distribute your resources according to your specific requirements and
-priorities. However, it is essential to monitor your credit consumption to ensure
-that you have sufficient credits to cover your deployments.
-
-### Do the credits I purchase expire?
-
-Yes, credits expire 1 year after purchase. You should ensure that you consume
-all of these credits within the year.
-
-### Can I make multiple purchases of credits within a year?
-
-As an organization's usage of ArangoGraph grows, particularly in the initial
-phases of application development and early production release, it is common
-to purchase a smaller credit package that is later supplemented by a larger
-credit package part-way through the initial credit expiry term.
-In this case, all sets of credits are available for ArangoGraph consumption
-as a single credit balance. The credits with the earlier expiry date are consumed
-first to avoid credit expiry where possible.
-
-### Can I purchase a specific number of credits (e.g. 3361, 4185)?
-
-ArangoGraph offers a variety of predefined credit packages designed to
-accommodate different needs and stages of the application lifecycle.
-For any credit purchasing needs, please [contact us](https://www.arangodb.com/contact/)
-and we are happy to help find an appropriate package for you.
-
-### How quickly will the credits I purchase be consumed?
-
-The rate at which your purchased credits are consumed depends on several
-factors, including the type and size of instances you deploy, the amount of
-resources used, and the duration of usage. Each machine size has an hourly credit
-consumption rate, and the overall rate of credit consumption increases for
-larger sizes or for more machines/deployments. Credits are also consumed for
-any variable usage charges such as outbound network traffic and backup storage.
-
-### How can I see how many credits I have remaining?
-
-All details about credits, including how many credits have been purchased,
-how many remain, and how they are being consumed, are available on the
-**Credits & Usage** page within the ArangoGraph web interface.
-
-### I have a large sharded deployment, how do I know how many credits it will consume?
-
-If you are using credits, you can see how many credits your
-configured deployment will consume when [creating](../deployments/_index.md#how-to-create-a-new-deployment)
-or [editing a deployment](../deployments/_index.md#how-to-edit-a-deployment).
-
-You can download a detailed consumption report in the
-[**Credits & Usage** section](#how-to-view-the-credit-usage). It shows you the
-number of credits consumed by any deployment you are creating or editing.
-
-All users can see the credit price of each node size in the **Pricing** section.
-
-### What happens if I run out of credits?
-
-If you run out of credits, your access to ArangoGraph's services and resources
-will be temporarily suspended until you purchase additional credits.
-
-### Can I buy credits for a short time period (e.g. 2 months)?
-
-No, you cannot buy credits with an expiry of less than 12 months.
-If you require credits for a shorter time frame, such as 2 months, you can still
-purchase one of the standard credit packages and consume the credits as needed
-during that time. You may opt for a smaller credit package that aligns with your
-expected usage during the desired period, rather than the full year's expected usage.
-Although the credits will have a longer expiration period, this allows you to have
-the flexibility of utilizing the remaining credits for any future needs.
\ No newline at end of file
diff --git a/site/content/3.10/arangograph/organizations/users-and-groups.md b/site/content/3.10/arangograph/organizations/users-and-groups.md
deleted file mode 100644
index abed36697b..0000000000
--- a/site/content/3.10/arangograph/organizations/users-and-groups.md
+++ /dev/null
@@ -1,125 +0,0 @@
----
-title: Users and Groups in ArangoGraph
-menuTitle: Users & Groups
-weight: 5
-description: >-
-  How to manage individual members and user groups in ArangoGraph
----
-## Users, groups & members
-
-When you use ArangoGraph, you are logged in as a user.
-A user has properties such as a name and email address.
-Most importantly, a user serves as the identity of a person.
-
-A user is a member of one or more organizations in ArangoGraph.
-You can become a member of an organization in the following ways:
-
-- Create a new organization. You become the first member and owner of that
-  organization.
-- Be invited to join an organization. Once accepted (by the invited user), this
-  user becomes a member of the organization.
-
-If the number of members of an organization becomes large, it helps to group
-users. In ArangoGraph, a group is part of an organization and contains a list
-of users. All users of a group must be members of the owning organization.
-
-In the **People** section of the dashboard, you can manage users, groups, and
-invites for the organization.
-
-To edit the permissions of members, see [Access Control](../security-and-access-control/_index.md).
-
-## Members
-
-Members are a list of users that can access an organization.
-
-![ArangoGraph Member Access Control](../../../images/arangograph-access-control-members.png)
-
-### How to add a new member to the organization
-
-1. In the main navigation, click the __Organization__ icon.
-2. Click __Members__ in the __People__ section.
-3. Optionally, click the __Invites__ entry.
-4. Click the __Invite new member__ button.
-5. In the form that appears, enter the email address of the person you want to
-   invite.
-6. Click the __Create__ button.
-7. An email with an organization invite will now be sent to the specified
-   email address.
-8. After accepting the invite, the person is added to the organization
-   [members](#members).
-
-![ArangoGraph Organization Invites](../../../images/arangograph-new-invite.png)
-
-### How to respond to an organization invite
-
-See [My Account: How to respond to my invites](../my-account.md#how-to-respond-to-my-invites)
-
-### How to remove a member from the organization
-
-1. Click __Members__ in the __People__ section of the main navigation.
-2. Delete a member by pressing the __recycle bin__ icon in the __Actions__ column.
-3. Confirm the deletion in the dialog that pops up.
-
-{{< info >}}
-You cannot delete members who are organization owners.
-{{< /info >}}
-
-### How to make a member an organization owner
-
-1. Click __Members__ in the __People__ section of the main navigation.
-2. You can convert a member to an organization owner by pressing the __Key__ icon
-   in the __Actions__ column.
-3. You can convert a member back to a normal user by pressing the __User__ icon
-   in the __Actions__ column.
-
-## Groups
-
-A group is a defined set of members. Groups can be bound to roles. These
-bindings contribute to the respective organization, project, or deployment policy.
-
-![ArangoGraph Groups](../../../images/arangograph-groups.png)
-
-### How to create a new group
-
-1. Click __Groups__ in the __People__ section of the main navigation.
-2. Press the __New group__ button.
-3. Enter a name and optionally a description for your new group.
-4. Select the members you want to be part of the group.
-5. Press the __Create__ button.
-
-![ArangoGraph New Group](../../../images/arangograph-new-group.png)
-
-### How to view, edit or remove a group
-
-1. Click __Groups__ in the __People__ section of the main navigation.
-2. Click an icon in the __Actions__ column:
-   - __Eye__: View group
-   - __Pencil__: Edit group
-   - __Recycle bin__: Delete group
-
-You can also click a group name to view it. There are buttons to __Edit__ and
-__Delete__ the currently viewed group.
-
-![ArangoGraph Group](../../../images/arangograph-group.png)
-
-{{< info >}}
-The groups __Organization members__ and __Organization owners__ are virtual groups
-and cannot be changed. They always reflect the current set of organization
-members and owners.
-{{< /info >}}
-
-## Invites
-
-### How to create a new organization invite
-
-See [How to add a new member to the organization](#how-to-add-a-new-member-to-the-organization)
-
-### How to view the status of invitations
-
-1. Click __Invites__ in the __People__ section of the main navigation.
-2. The created invites are displayed, grouped by status: __Pending__,
-   __Accepted__, and __Rejected__.
-3. You may delete pending invites by clicking the __recycle bin__ icon in the
-   __Actions__ column.
-
-![ArangoGraph Organization Invites](../../../images/arangograph-org-invites.png)
diff --git a/site/content/3.10/arangograph/projects.md b/site/content/3.10/arangograph/projects.md
deleted file mode 100644
index f4efd27833..0000000000
--- a/site/content/3.10/arangograph/projects.md
+++ /dev/null
@@ -1,82 +0,0 @@
----
-title: Projects in ArangoGraph
-menuTitle: Projects
-weight: 15
-description: >-
-  How to manage projects and IP allowlists in ArangoGraph
----
-ArangoGraph projects can represent organizational units such as teams,
-product groups, or environments (e.g. staging vs. production). You can have any
-number of projects under one organization.
-
-**Organizations → Projects → Deployments**
-
-Projects are a container for related deployments, certificates & IP allowlists.
-Projects also come with their own policy for access control. You can have any
-number of deployments under one project.
-
-![ArangoGraph Projects Overview](../../images/arangograph-projects-overview.png)
-
-## How to create a new project
-
-1. In the main navigation, click the __Dashboard__ icon.
-2. Click __Projects__ in the __Dashboard__ section.
-3. Click the __New project__ button.
-4. Enter a name and optionally a description for your new project.
-5. Click the __Create__ button.
-6. You will be taken to the project page.
-7. To change the name or description, click either of them at the top of the page.
-
-![ArangoGraph New Project](../../images/arangograph-new-project.png)
-
-![ArangoGraph Project Summary](../../images/arangograph-project.png)
-
-{{< info >}}
-Projects contain exactly **one policy**. Within that policy, you can define
-role bindings to regulate access control on a project level.
-{{< /info >}}
-
-## How to create a new deployment
-
-See [Deployments: How to create a new deployment](deployments/_index.md#how-to-create-a-new-deployment)
-
-## How to delete a project
-
-{{< danger >}}
-Deleting a project deletes the contained deployments, certificates & IP allowlists.
-This operation is **irreversible**.
-{{< /danger >}}
-
-1. Click __Projects__ in the __Dashboard__ section of the main navigation.
-2. Click the __recycle bin__ icon in the __Actions__ column of the project to be deleted.
-3. Enter `Delete!` to confirm and click __Yes__.
-
-{{< tip >}}
-If the project has a locked deployment, you need to [unlock](security-and-access-control/_index.md#locked-resources)
-it first to be able to delete the project.
-{{< /tip >}}
-
-## How to manage IP allowlists
-
-IP allowlists let you limit access to your deployment to certain IP ranges.
-Using an allowlist is optional, but strongly recommended.
-
-You can create an allowlist as part of a project.
-
-1. Click a project name in the __Projects__ section of the main navigation.
-2. Click the __Security__ entry.
-3. In the __IP allowlists__ section, click:
-   - The __New IP allowlist__ button to create a new allowlist.
-     When creating or editing a list, you can add comments
-     in the __Allowed CIDR ranges (1 per line)__ section.
-     Everything after `//` or `#` is considered a comment until the end of the
-     line (see the example after this list).
-   - A name or the __eye__ icon in the __Actions__ column to view the allowlist.
-   - The __pencil__ icon to edit the allowlist.
-     You can also view the allowlist and click the __Edit__ button.
-   - The __recycle bin__ icon to delete the allowlist.
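-
-For illustration, an allowlist could look like the following. The CIDR ranges
-are made-up example values; substitute the ranges your clients actually use:
-
-```
-10.2.0.0/16        # internal office network
-203.0.113.42/32    // a single workstation
-198.51.100.0/24    # CI runners
-```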
-
-## How to manage role bindings
-
-See:
-- [Access Control: How to view, edit or remove role bindings of a policy](security-and-access-control/_index.md#how-to-view-edit-or-remove-role-bindings-of-a-policy)
-- [Access Control: How to add a role binding to a policy](security-and-access-control/_index.md#how-to-add-a-role-binding-to-a-policy)
diff --git a/site/content/3.10/arangograph/security-and-access-control/_index.md b/site/content/3.10/arangograph/security-and-access-control/_index.md
deleted file mode 100644
index 27742b57b3..0000000000
--- a/site/content/3.10/arangograph/security-and-access-control/_index.md
+++ /dev/null
@@ -1,698 +0,0 @@
----
-title: Security and access control in ArangoGraph
-menuTitle: Security and Access Control
-weight: 45
-description: >-
-  This guide explains which access control concepts are available in
-  ArangoGraph and how to use them
----
-The ArangoGraph Insights Platform has a structured set of resources that are
-subject to security and access control:
-
-- Organizations
-- Projects
-- Deployments
-
-For each of these resources, you can perform various operations.
-For example, you can create a project in an organization and create a deployment
-inside a project.
-
-## Locked resources
-
-In ArangoGraph, you can lock resources to prevent accidental deletion. When
-a resource is locked, it cannot be deleted and must be unlocked first.
-
-The hierarchical structure of the resources (organization-project-deployment)
-is used in the locking functionality: if a child resource is locked
-(for example, a deployment), you cannot delete the parent project without
-unlocking that deployment first.
-
-{{< info >}}
-If you lock a backup policy of a deployment, or an IP allowlist, CA certificate,
-or IAM provider of a project, it is still possible to delete
-the corresponding parent resource without unlocking those resources first.
-{{< /info >}}
-
-## Policy
-
-Various actions in ArangoGraph require different permissions, which can be
-granted to users via **roles**.
-
-The association of a member with a role is called a **role binding**.
-All role bindings of a resource comprise a **policy**.
-
-Roles can be bound on an organization, project, and deployment level (listed
-from the highest to the lowest level, with lower levels inheriting permissions
-from their parents). This means that there is a unique policy per resource (an
-organization, a project, or a deployment).
-
-For example, an organization has exactly one policy,
-which binds roles to members of the organization. These bindings are used to
-give the users permissions to perform operations in this organization.
-This is useful when, as an organization owner, you need to extend the permissions
-for an organization member.
-
-{{< info >}}
-Permissions linked to predefined roles vary between organization owners and
-organization members. If you need to extend permissions for an organization
-member, you can create a new role binding. The complete list of roles and
-their respective permissions for both organization owners and members can be
-viewed on the **Policy** page of an organization within the ArangoGraph dashboard.
-{{< /info >}}
-
-### How to view, edit, or remove role bindings of a policy
-
-Decide whether you want to edit the policy for an organization, a project,
-or a deployment:
-
-- **Organization**: In the main navigation, click the __Organization__ icon and
-  then click __Policy__.
-- **Project**: In the main navigation, click the __Dashboard__ icon, then click
-  __Projects__, click the name of the desired project, and finally click __Policy__.
-- **Deployment**: In the main navigation, click the __Dashboard__ icon, then
-  click __Deployments__, click the name of the desired deployment, and finally
-  click __Policy__.
-
-To delete a role binding, click the **Recycle Bin** icon in the **Actions** column.
-
-{{< info >}}
-Currently, you cannot edit a role binding, you can only delete it.
-{{< /info >}}
-
-![ArangoGraph Project Policy](../../../images/arangograph-policy-page.png)
-
-### How to add a role binding to a policy
-
-1. Navigate to the **Policy** tab of an organization, a project, or a deployment.
-2. Click the **New role binding** button.
-3. Select one or more users and/or groups.
-4. Select one or more roles you want to assign to the specified members.
-5. Click **Create**.
-
-![ArangoGraph New Role Binding](../../../images/arangograph-new-policy-role-binding.png)
-
-## Roles
-
-Operations on resources in ArangoGraph require zero (just authentication) or
-more permissions. Since permissions are numerous and fine-grained, it is not
-practical to assign them directly to users. Instead, ArangoGraph uses **roles**.
-
-A role is a set of permissions. Roles can be bound to groups (preferably)
-or individual users. You can create such bindings for the respective organization,
-project, or deployment policy.
-
-There are predefined roles, but you can also create custom ones.
-
-![ArangoGraph Roles](../../../images/arangograph-access-control-roles.png)
-
-### Predefined roles
-
-Predefined roles are created by ArangoGraph and group related permissions together.
-An example of a predefined role is `deployment-viewer`. This role
-contains all permissions needed to view deployments in a project.
-
-Predefined roles cannot be deleted. Note that permissions linked to predefined
-roles vary between organization owners and organization members.
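-
-To inspect roles from the command line, you can use `oasisctl`. The following
-is a minimal sketch only: `<organization-id>` is a placeholder, it assumes you
-are already authenticated (see the API keys section of this documentation),
-and the exact flags may differ between oasisctl versions.
-
-```sh
-# List all roles (predefined and custom) of an organization.
-oasisctl list roles --organization-id <organization-id>
-
-# Show the permissions of one role, e.g. the predefined deployment-viewer.
-oasisctl get role --organization-id <organization-id> --role-id deployment-viewer
-```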
-
-{{% comment %}}
-Command to generate below list with (Git)Bash:
-
-export OASIS_TOKEN=''
-./oasisctl list roles --organization-id <organization-id> --format json | jq -r '.[] | select(.predefined == true) | "**\(.description)** (`\(.id)`):\n\(.permissions | split(", ") | map("- `\(.)`\n") | join(""))"'
-{{% /comment %}}
-
-{{< details summary="List of predefined roles and their permissions" >}}
-
-{{< info >}}
-The roles below are described following this pattern:
-
-**Role description** (`role ID`):
-- `Permission`
-{{< /info >}}
-
-**Audit Log Admin** (`auditlog-admin`):
-- `audit.auditlog.create`
-- `audit.auditlog.delete`
-- `audit.auditlog.get`
-- `audit.auditlog.list`
-- `audit.auditlog.set-default`
-- `audit.auditlog.test-https-post-destination`
-- `audit.auditlog.update`
-
-**Audit Log Archive Admin** (`auditlog-archive-admin`):
-- `audit.auditlogarchive.delete`
-- `audit.auditlogarchive.get`
-- `audit.auditlogarchive.list`
-
-**Audit Log Archive Viewer** (`auditlog-archive-viewer`):
-- `audit.auditlogarchive.get`
-- `audit.auditlogarchive.list`
-
-**Audit Log Attachment Admin** (`auditlog-attachment-admin`):
-- `audit.auditlogattachment.create`
-- `audit.auditlogattachment.delete`
-- `audit.auditlogattachment.get`
-
-**Audit Log Attachment Viewer** (`auditlog-attachment-viewer`):
-- `audit.auditlogattachment.get`
-
-**Audit Log Event Admin** (`auditlog-event-admin`):
-- `audit.auditlogevent.delete`
-- `audit.auditlogevents.get`
-
-**Audit Log Event Viewer** (`auditlog-event-viewer`):
-- `audit.auditlogevents.get`
-
-**Audit Log Viewer** (`auditlog-viewer`):
-- `audit.auditlog.get`
-- `audit.auditlog.list`
-
-**Backup Administrator** (`backup-admin`):
-- `backup.backup.copy`
-- `backup.backup.create`
-- `backup.backup.delete`
-- `backup.backup.download`
-- `backup.backup.get`
-- `backup.backup.list`
-- `backup.backup.restore`
-- `backup.backup.update`
-- `backup.feature.get`
-- `data.deployment.restore-backup`
-
-**Backup Viewer** (`backup-viewer`):
-- `backup.backup.get`
-- `backup.backup.list`
-- `backup.feature.get`
-
-**Backup Policy Administrator** (`backuppolicy-admin`):
-- `backup.backuppolicy.create`
-- `backup.backuppolicy.delete`
-- `backup.backuppolicy.get`
-- `backup.backuppolicy.list`
-- `backup.backuppolicy.update`
-- `backup.feature.get`
-
-**Backup Policy Viewer** (`backuppolicy-viewer`):
-- `backup.backuppolicy.get`
-- `backup.backuppolicy.list`
-- `backup.feature.get`
-
-**Billing Administrator** (`billing-admin`):
-- `billing.config.get`
-- `billing.config.set`
-- `billing.invoice.get`
-- `billing.invoice.get-preliminary`
-- `billing.invoice.get-statistics`
-- `billing.invoice.list`
-- `billing.organization.get`
-- `billing.paymentmethod.create`
-- `billing.paymentmethod.delete`
-- `billing.paymentmethod.get`
-- `billing.paymentmethod.get-default`
-- `billing.paymentmethod.list`
-- `billing.paymentmethod.set-default`
-- `billing.paymentmethod.update`
-- `billing.paymentprovider.list`
-
-**Billing Viewer** (`billing-viewer`):
-- `billing.config.get`
-- `billing.invoice.get`
-- `billing.invoice.get-preliminary`
-- `billing.invoice.get-statistics`
-- `billing.invoice.list`
-- `billing.organization.get`
-- `billing.paymentmethod.get`
-- `billing.paymentmethod.get-default`
-- `billing.paymentmethod.list`
-- `billing.paymentprovider.list`
-
-**CA Certificate Administrator** (`cacertificate-admin`):
-- `crypto.cacertificate.create`
-- `crypto.cacertificate.delete`
-- `crypto.cacertificate.get`
-- `crypto.cacertificate.list`
-- `crypto.cacertificate.set-default`
-- `crypto.cacertificate.update`
-
-**CA Certificate Viewer** (`cacertificate-viewer`):
-- `crypto.cacertificate.get`
-- `crypto.cacertificate.list`
-
-**Dataloader Administrator** (`dataloader-admin`):
-- `dataloader.deployment.import`
-
-**Deployment Administrator** (`deployment-admin`):
-- `data.cpusize.list`
-- `data.deployment.create`
-- `data.deployment.create-test-database`
-- `data.deployment.delete`
-- `data.deployment.get`
-- `data.deployment.list`
-- `data.deployment.pause`
-- `data.deployment.rebalance-shards`
-- `data.deployment.resume`
-- `data.deployment.rotate-server`
-- `data.deployment.update`
-- `data.deployment.update-scheduled-root-password-rotation`
-- `data.deploymentfeatures.get`
-- `data.deploymentmodel.list`
-- `data.deploymentprice.calculate`
-- `data.diskperformance.list`
-- `data.limits.get`
-- `data.nodesize.list`
-- `data.presets.list`
-- `monitoring.logs.get`
-- `monitoring.metrics.get`
-- `notification.deployment-notification.list`
-- `notification.deployment-notification.mark-as-read`
-- `notification.deployment-notification.mark-as-unread`
-
-**Deployment Content Administrator** (`deployment-content-admin`):
-- `data.cpusize.list`
-- `data.deployment.create-test-database`
-- `data.deployment.get`
-- `data.deployment.list`
-- `data.deploymentcredentials.get`
-- `data.deploymentfeatures.get`
-- `data.deploymentmodel.list`
-- `data.deploymentprice.calculate`
-- `data.diskperformance.list`
-- `data.limits.get`
-- `data.nodesize.list`
-- `data.presets.list`
-- `monitoring.logs.get`
-- `monitoring.metrics.get`
-- `notification.deployment-notification.list`
-- `notification.deployment-notification.mark-as-read`
-- `notification.deployment-notification.mark-as-unread`
-
-**Deployment Full Access User** (`deployment-full-access-user`):
-- `data.deployment.full-access`
-
-**Deployment Read Only User** (`deployment-read-only-user`):
-- `data.deployment.read-only-access`
-
-**Deployment Viewer** (`deployment-viewer`):
-- `data.cpusize.list`
-- `data.deployment.get`
-- `data.deployment.list`
-- `data.deploymentfeatures.get`
-- `data.deploymentmodel.list`
-- `data.deploymentprice.calculate`
-- `data.diskperformance.list`
-- `data.limits.get`
-- `data.nodesize.list`
-- `data.presets.list`
-- `monitoring.metrics.get`
-- `notification.deployment-notification.list`
-- `notification.deployment-notification.mark-as-read`
-- `notification.deployment-notification.mark-as-unread`
-
-**Deployment Profile Viewer** (`deploymentprofile-viewer`):
-- `deploymentprofile.deploymentprofile.list`
-
-**Example Datasets Viewer** (`exampledataset-viewer`):
-- `example.exampledataset.get`
-- `example.exampledataset.list`
-
-**Example Dataset Installation Administrator** (`exampledatasetinstallation-admin`):
-- `example.exampledatasetinstallation.create`
-- `example.exampledatasetinstallation.delete`
-- `example.exampledatasetinstallation.get`
-- `example.exampledatasetinstallation.list`
-- `example.exampledatasetinstallation.update`
-
-**Example Dataset Installation Viewer** (`exampledatasetinstallation-viewer`):
-- `example.exampledatasetinstallation.get`
-- `example.exampledatasetinstallation.list`
-
-**Group Administrator** (`group-admin`):
-- `iam.group.create`
-- `iam.group.delete`
-- `iam.group.get`
-- `iam.group.list`
-- `iam.group.update`
-
-**Group Viewer** (`group-viewer`):
-- `iam.group.get`
-- `iam.group.list`
-
-**IAM provider Administrator** (`iamprovider-admin`):
-- `security.iamprovider.create`
-- `security.iamprovider.delete`
-- `security.iamprovider.get`
-- `security.iamprovider.list`
-- `security.iamprovider.set-default`
-- `security.iamprovider.update`
-
-**IAM provider Viewer** (`iamprovider-viewer`):
-- `security.iamprovider.get`
-- `security.iamprovider.list`
-
-**IP allowlist Administrator** (`ipwhitelist-admin`):
-- `security.ipallowlist.create`
-- `security.ipallowlist.delete`
-- `security.ipallowlist.get`
-- `security.ipallowlist.list`
-- `security.ipallowlist.update`
-
-**IP allowlist Viewer** (`ipwhitelist-viewer`):
-- `security.ipallowlist.get`
-- `security.ipallowlist.list`
-
-**Metrics Administrator** (`metrics-admin`):
-- `metrics.endpoint.get`
-- `metrics.token.create`
-- `metrics.token.delete`
-- `metrics.token.get`
-- `metrics.token.list`
-- `metrics.token.revoke`
-- `metrics.token.update`
-
-**Migration Administrator** (`migration-admin`):
-- `replication.deploymentmigration.create`
-- `replication.deploymentmigration.delete`
-- `replication.deploymentmigration.get`
-
-**MLServices Admin** (`mlservices-admin`):
-- `ml.mlservices.get`
-
-**Notebook Administrator** (`notebook-admin`):
-- `notebook.model.list`
-- `notebook.notebook.create`
-- `notebook.notebook.delete`
-- `notebook.notebook.get`
-- `notebook.notebook.list`
-- `notebook.notebook.pause`
-- `notebook.notebook.resume`
-- `notebook.notebook.update`
-
-**Notebook Executor** (`notebook-executor`):
-- `notebook.notebook.execute`
-
-**Notebook Viewer** (`notebook-viewer`):
-- `notebook.model.list`
-- `notebook.notebook.get`
-- `notebook.notebook.list`
-
-**Organization Administrator** (`organization-admin`):
-- `billing.organization.get`
-- `resourcemanager.organization-invite.create`
-- `resourcemanager.organization-invite.delete`
-- `resourcemanager.organization-invite.get`
-- `resourcemanager.organization-invite.list`
-- `resourcemanager.organization-invite.update`
-- `resourcemanager.organization.delete`
-- `resourcemanager.organization.get`
-- `resourcemanager.organization.update`
-
-**Organization Viewer** (`organization-viewer`):
-- `billing.organization.get`
-- `resourcemanager.organization-invite.get`
-- `resourcemanager.organization-invite.list`
-- `resourcemanager.organization.get`
-
-**Policy Administrator** (`policy-admin`):
-- `iam.policy.get`
-- `iam.policy.update`
-
-**Policy Viewer** (`policy-viewer`):
-- `iam.policy.get`
-
-**Prepaid Deployment Viewer** (`prepaid-deployment-viewer`):
-- `prepaid.prepaiddeployment.get`
-- `prepaid.prepaiddeployment.list`
-
-**Private Endpoint Service Administrator** (`privateendpointservice-admin`):
-- `network.privateendpointservice.create`
-- `network.privateendpointservice.get`
-- `network.privateendpointservice.get-by-deployment-id`
-- `network.privateendpointservice.get-feature`
-- `network.privateendpointservice.update`
-
-**Private Endpoint Service Viewer** (`privateendpointservice-viewer`):
-- `network.privateendpointservice.get`
-- `network.privateendpointservice.get-by-deployment-id`
-- `network.privateendpointservice.get-feature`
-
-**Project Administrator** (`project-admin`):
-- `resourcemanager.project.create`
-- `resourcemanager.project.delete`
-- `resourcemanager.project.get`
-- `resourcemanager.project.list`
-- `resourcemanager.project.update`
-
-**Project Viewer** (`project-viewer`):
-- `resourcemanager.project.get`
-- `resourcemanager.project.list`
-
-**Replication Administrator** (`replication-admin`):
-- `replication.deployment.clone-from-backup`
-- `replication.deploymentreplication.get`
-- `replication.deploymentreplication.update`
-- `replication.migration-forwarder.upgrade-connection`
-
-**Role Administrator** (`role-admin`):
-- `iam.role.create`
-- `iam.role.delete`
-- `iam.role.get`
-- `iam.role.list`
-- `iam.role.update`
-
-**Role Viewer** (`role-viewer`):
-- `iam.role.get`
-- `iam.role.list`
-
-**SCIM Administrator** (`scim-admin`):
-- `scim.user.add`
-- `scim.user.delete`
-- `scim.user.get`
-- `scim.user.list`
-- `scim.user.update`
-
-**User Administrator** (`user-admin`):
-- `iam.user.get-personal-data`
-- `iam.user.update`
-
-{{< /details >}}
-
-### How to create a custom role
-
-1. In the main navigation menu, click **Access Control**.
-2. On the **Roles** tab, click **New role**.
-3. Enter a name and optionally a description for the new role.
-4. Select the required permissions.
-5. Click **Create**.
-
-![ArangoGraph New Role](../../../images/arangograph-create-role.png)
-
-### How to view, edit or remove a custom role
-
-1. In the main navigation menu, click **Access Control**.
-2. On the **Roles** tab, click:
-   - A role name or the **eye** icon in the **Actions** column to view the role.
-   - The **pencil** icon in the **Actions** column to edit the role.
-     You can also view a role and click the **Edit** button in the detail view.
-   - The **recycle bin** icon to delete the role.
-     You can also view a role and click the **Delete** button in the detail view.
-
-## Permissions
-
-Each operation on a resource requires zero or more **permissions**
-(zero meaning that authentication alone is sufficient).
-A permission is a constant string such as `resourcemanager.project.create`,
-following this schema: `<api>.<kind>.<verb>`.
-
-Permissions are solely defined by the ArangoGraph API.
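-
-The full list of permissions known to the ArangoGraph API can be retrieved
-with [Oasisctl](../oasisctl/_index.md); the table below was compiled from the
-output of this command:
-
-```
-# Prints all permissions known to the ArangoGraph API
-oasisctl list permissions
-```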
-
-{{% comment %}}
-Retrieved with the below command, with manual adjustments:
-oasisctl list permissions
-
-Note that if the tier is "internal", there is an `internal-dashboard` API that should be excluded from the list below!
-{{% /comment %}}
-
-| API | Kind | Verbs
-|:--------------------|:-----------------------------|:-------------------------------------------
-| `audit` | `auditlogarchive` | `delete`, `get`, `list`
-| `audit` | `auditlogattachment` | `create`, `delete`, `get`
-| `audit` | `auditlogevents` | `get`
-| `audit` | `auditlogevent` | `delete`
-| `audit` | `auditlog` | `create`, `delete`, `get`, `list`, `set-default`, `test-https-post-destination`, `update`
-| `backup` | `backuppolicy` | `create`, `delete`, `get`, `list`, `update`
-| `backup` | `backup` | `copy`, `create`, `delete`, `download`, `get`, `list`, `restore`, `update`
-| `backup` | `feature` | `get`
-| `billing` | `config` | `get`, `set`
-| `billing` | `invoice` | `get`, `get-preliminary`, `get-statistics`, `list`
-| `billing` | `organization` | `get`
-| `billing` | `paymentmethod` | `create`, `delete`, `get`, `get-default`, `list`, `set-default`, `update`
-| `billing` | `paymentprovider` | `list`
-| `crypto` | `cacertificate` | `create`, `delete`, `get`, `list`, `set-default`, `update`
-| `dataloader` | `deployment` | `import`
-| `data` | `cpusize` | `list`
-| `data` | `deploymentcredentials` | `get`
-| `data` | `deploymentfeatures` | `get`
-| `data` | `deploymentmodel` | `list`
-| `data` | `deploymentprice` | `calculate`
-| `data` | `deployment` | `create`, `create-test-database`, `delete`, `full-access`, `get`, `list`, `pause`, `read-only-access`, `rebalance-shards`, `restore-backup`, `resume`, `rotate-server`, `update`, `update-scheduled-root-password-rotation`
-| `data` | `diskperformance` | `list`
-| `data` | `limits` | `get`
-| `data` | `nodesize` | `list`
-| `data` | `presets` | `list`
-| `deploymentprofile` | `deploymentprofile` | `list`
-| `example` | `exampledatasetinstallation` | `create`, `delete`, `get`, `list`, `update`
-| `example` | `exampledataset` | `get`, `list`
-| `iam` | `group` | `create`, `delete`, `get`, `list`, `update`
-| `iam` | `policy` | `get`, `update`
-| `iam` | `role` | `create`, `delete`, `get`, `list`, `update`
-| `iam` | `user` | `get-personal-data`, `update`
-| `metrics` | `endpoint` | `get`
-| `metrics` | `token` | `create`, `delete`, `get`, `list`, `revoke`, `update`
-| `ml` | `mlservices` | `get`
-| `monitoring` | `logs` | `get`
-| `monitoring` | `metrics` | `get`
-| `network` | `privateendpointservice` | `create`, `get`, `get-by-deployment-id`, `get-feature`, `update`
-| `notebook` | `model` | `list`
-| `notebook` | `notebook` | `create`, `delete`, `execute`, `get`, `list`, `pause`, `resume`, `update`
-| `notification` | `deployment-notification` | `list`, `mark-as-read`, `mark-as-unread`
-| `prepaid` | `prepaiddeployment` | `get`, `list`
-| `replication` | `deploymentmigration` | `create`, `delete`, `get`
-| `replication` | `deploymentreplication` | `get`, `update`
-| `replication` | `deployment` | `clone-from-backup`
-| `replication` | `migration-forwarder` | `upgrade-connection`
-| `resourcemanager` | `organization-invite` | `create`, `delete`, `get`, `list`, `update`
-| `resourcemanager` | `organization` | `delete`, `get`, `update`
-| `resourcemanager` | `project` | `create`, `delete`, `get`, `list`, `update`
-| `scim` | `user` | `add`, `delete`, `get`, `list`, `update`
-| `security` | `iamprovider` | `create`, `delete`, `get`, `list`, `set-default`, `update`
-| `security` | `ipallowlist` | `create`, `delete`, `get`, `list`, `update`
-
-### Permission inheritance
-
-Each resource (organization, project, deployment) has its own policy, but this
-does not mean that you have to repeat role bindings in all these policies.
-
-Once you assign a role to a user (or group of users) in a policy at one level,
-all the permissions of this role are inherited in lower levels -
-permissions are inherited downwards from an organization to its projects and
-from a project to its deployments.
-
-For more general permissions, which you want to be propagated to other levels,
-add a role for a user/group at the organization level.
-For example, if you bind the `deployment-viewer` role to user `John` in the
-organization policy, `John` will have the role permissions in all projects of
-that organization and all deployments of the projects.
-
-For more restrictive permissions, which you don't necessarily want to be
-propagated to other levels, add a role at the project or even deployment level.
-For example, if you bind the `deployment-viewer` role to user `John`
-in a project, `John` will have the role permissions in this project as well as
-in all of its deployments, but not in other projects of the parent organization.
-
-**Inheritance example**
-
-- Let's assume you have a group called "Deployers" which includes users who deal with deployments.
-- Then you create a role "Deployment Viewer", containing
-  `data.deployment.get` and `data.deployment.list` permissions.
-- You can now add a role binding of the "Deployers" group to the "Deployment Viewer" role.
-- If you add the binding to an organization policy, members of this group
-  will be granted the defined permissions for the organization, all its projects and all its deployments.
-- If you add the role binding to a policy of project ABC, members of this group will be granted
-  the defined permissions for project ABC only and its deployments, but not for
-  other projects and their deployments.
-- If you add the role binding to a policy of deployment X, members of this - group will be granted the defined permissions for deployment X only, and not - any other deployment of the parent project or any other project of the organization. - -The "Deployment Viewer" role is effective for the following entities depending -on which policy the binding is added to: - -Role binding added to →
Role effective on ↓ | Organization policy | Project ABC's policy | Deployment X's policy of project ABC |
-|:---:|:---:|:---:|:---:|
-Organization, its projects and deployments | ✓ | — | —
-Project ABC and its deployments | ✓ | ✓ | —
-Project DEF and its deployments | ✓ | — | —
-Deployment X of project ABC | ✓ | ✓ | ✓
-Deployment Y of project ABC | ✓ | ✓ | —
-Deployment Z of project DEF | ✓ | — | —
-
-## Restricting access to organizations
-
-To enhance security, you can implement the following restrictions via
-[Oasisctl](../oasisctl/_index.md):
-
-1. Limit the allowed authentication providers.
-2. Specify an allowed domain list.
-
-{{< info >}}
-Note that users who do not meet the restrictions will not be granted permissions for any resource in
-the organization. These users can still be members of the organization.
-{{< /info >}}
-
-Using the first option, you can limit which **authentication providers** are
-accepted for users trying to access an organization in ArangoGraph.
-The following commands are available to configure this option:
-
-- `oasisctl get organization authentication providers` - allows you to see which
-  authentication providers are enabled for accessing a specific organization
-- `oasisctl update organization authentication providers` - allows you to update
-  the list of authentication providers for an organization to which the
-  authenticated user has access
-  - `--enable-github` - if set, allow access from user accounts authenticated via GitHub
-  - `--enable-google` - if set, allow access from user accounts authenticated via Google
-  - `--enable-username-password` - if set, allow access from user accounts
-    authenticated via a username and password
-
-Using the second option, you can configure a **list of domains**, and only users
-with email addresses from the specified domains will be able to access an
-organization. The following commands are available to configure this option:
-
-- `oasisctl get organization email domain restrictions -o <organization-id>` -
-  allows you to see which domains are in the allowed list for a specific organization
-- `oasisctl update organization email domain restrictions -o <organization-id> --allowed-domain=<domain-x> --allowed-domain=<domain-y>` -
-  allows you to update the list of allowed domains for a specific organization
-- `oasisctl update organization email domain restrictions -o <organization-id> --allowed-domain=` -
-  allows you to reset the list and accept any domain for accessing a specific organization
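-
-For example, to permit only Google-authenticated accounts, an invocation could
-look like the following. The organization ID `14587062` is a hypothetical
-example, and this sketch assumes that providers whose flag is omitted remain
-disabled:
-
-```
-# 14587062 is an example organization ID
-oasisctl update organization authentication providers \
-  -o 14587062 \
-  --enable-google
-```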
-
-## Using an audit log
-
-{{< info >}}
-To enable the audit log feature, get in touch with the ArangoGraph team via **Request Help**, available in the left sidebar menu of the ArangoGraph Dashboard.
-{{< /info >}}
-
-To have a better overview of the events happening in your ArangoGraph organization,
-you can set up an audit log, which tracks and logs auditing information for you.
-The audit log is created at the organization level; you can then use the log for
-the projects belonging to that organization.
-
-***To create an audit log***
-
-1. In the main navigation menu, click **Access Control** in the **Organization** section.
-2. Open the **Audit logs** tab and click the **New audit log** button.
-3. In the dialog, fill out the following settings:
-
-   - **Name** - enter a name for your audit log.
-   - **Description** - enter an optional description for your audit log.
-   - **Destinations** - specify one or several destinations to which you want to
-     upload the audit log. If you choose **Upload to cloud**, the log will be
-     available on the **Audit logs** tab of your organization. To send the log
-     entries to your custom destination, specify a destination URL with
-     authentication parameters (the **HTTP destination** option).
-
-     {{< info >}}
-     The **Upload to cloud** option is not available for the free-to-try tier.
-     {{< /info >}}
-
-   - **Excluded topics** - select topics that will not be included in the log.
-     Please note that some are excluded by default (for example, `audit-document`).
-
-     {{< warning >}}
-     Enabling the audit log for all events will have a negative impact on performance.
-     {{< /warning >}}
-
-   - **Confirmation** - confirm that logging audit events increases the price of your deployments.
-
-   ![ArangoGraph audit log](../../../images/arangograph-audit-log.png)
-
-4. Click **Create** to add the audit log. You can now use it in the projects
-   belonging to your organization.
diff --git a/site/content/3.10/arangograph/security-and-access-control/single-sign-on/_index.md b/site/content/3.10/arangograph/security-and-access-control/single-sign-on/_index.md
deleted file mode 100644
index 1144d59ebd..0000000000
--- a/site/content/3.10/arangograph/security-and-access-control/single-sign-on/_index.md
+++ /dev/null
@@ -1,94 +0,0 @@
----
-title: Single Sign-On (SSO) in ArangoGraph
-menuTitle: Single Sign-On
-weight: 10
-description: >-
-  ArangoGraph supports **Single Sign-On** (SSO) authentication using
-  **Security Assertion Markup Language 2.0** (SAML 2.0)
----
-{{< info >}}
-To enable the Single Sign-On (SSO) feature, get in touch with the ArangoGraph
-team via **Request Help**, available in the left sidebar menu of the
-ArangoGraph Dashboard.
-{{< /info >}}
-
-## About SAML 2.0
-
-The Security Assertion Markup Language 2.0 (SAML 2.0) is an open standard created
-to provide cross-domain single sign-on (SSO). It allows you to authenticate in
-multiple web applications by using a single set of login credentials.
-
-SAML SSO works by transferring user authentication data from the identity
-provider (IdP) to the service provider (SP) through an exchange of digitally
-signed XML documents.
-
-## Configure SAML 2.0 using Okta
-
-You can enable SSO for your ArangoGraph organization using Okta as an Identity
-Provider (IdP). For more information about Okta, please refer to the
-[Okta Documentation](https://help.okta.com/en-us/Content/index.htm?cshid=csh-index).
-
-### Create the SAML app integration in Okta
-
-1. Sign in to your Okta account and select **Applications** from the left sidebar menu.
-2. Click **Create App Integration**.
-3. In the **Create a new app integration** dialog, select **SAML 2.0**.
-
-   ![ArangoGraph Create Okta App Integration](../../../../images/arangograph-okta-create-integration.png)
-4. In the **General Settings**, specify a name for your integration and click **Next**.
-
-   ![ArangoGraph Okta Integration Name](../../../../images/arangograph-okta-integration-name.png)
-5. Configure the SAML settings:
-   - For **Single sign-on URL**, use `https://auth.arangodb.com/login/callback?connection=ORG_ID`
-   - For **Audience URI (SP Entity ID)**, use `urn:auth0:arangodb:ORG_ID`
-
-   ![ArangoGraph Okta SAML General Settings](../../../../images/arangograph-okta-saml-general-settings.png)
-
-6. Replace **ORG_ID** with your organization identifier from the
-   ArangoGraph Dashboard. To find your organization ID, go to the **User Toolbar**
-   in the top right corner, which is accessible from every view of the Dashboard,
-   and click **My organizations**.
-
-   If, for example, your organization ID is 14587062, here are the values you
-   would use when configuring the SAML settings:
-   - `https://auth.arangodb.com/login/callback?connection=14587062`
-   - `urn:auth0:arangodb:14587062`
-
-   ![ArangoGraph Organization ID](../../../../images/arangograph-organization-id.png)
-7. In the **Attribute Statements** section, add custom attributes as seen in the image below:
-   - email: `user.email`
-   - given_name: `user.firstName`
-   - family_name: `user.lastName`
-   - picture: `user.profileUrl`
-
-   This step consists of a mapping between the ArangoGraph attribute names and
-   Okta attribute names. The values of these attributes are automatically filled
-   in based on the users list that is defined in Okta.
-
-   ![ArangoGraph Okta SAML Attributes](../../../../images/arangograph-okta-saml-attributes.png)
-8. Click **Next**.
-9. In the **Configure feedback** section, select **I'm an Okta customer adding an internal app**.
-10. Click **Finish**. The SAML app integration is now created.
-
-### SAML Setup
-
-After creating the app integration, you must perform the SAML setup to finalize
-the SSO configuration.
-
-1. Go to the **SAML Signing Certificates** section, displayed under the **Sign On** tab.
-2. Click **View SAML setup instructions**.
-
-   ![ArangoGraph Okta SAML Setup](../../../../images/arangograph-okta-saml-setup.png)
-3. The setup instructions include the following items:
-   - **Identity Provider Single Sign-On URL**
-   - **Identity Provider Issuer**
-   - **X.509 Certificate**
-4. Copy the IdP settings, download the certificate using the
-   **Download X.509 certificate** button, and share them with the ArangoGraph
-   team via an ArangoGraph Support Ticket in order to complete the SSO
-   configuration.
-
-{{< info >}}
-If you would like to enable SCIM provisioning in addition to the SSO SAML
-configuration, please refer to the [SCIM](scim-provisioning.md) documentation.
-{{< /info >}}
diff --git a/site/content/3.10/arangograph/security-and-access-control/single-sign-on/scim-provisioning.md b/site/content/3.10/arangograph/security-and-access-control/single-sign-on/scim-provisioning.md
deleted file mode 100644
index 8cf40b8009..0000000000
--- a/site/content/3.10/arangograph/security-and-access-control/single-sign-on/scim-provisioning.md
+++ /dev/null
@@ -1,76 +0,0 @@
----
-title: SCIM Provisioning
-menuTitle: SCIM Provisioning
-weight: 5
-description: >-
-  How to enable SCIM provisioning with Okta for your ArangoGraph project
----
-ArangoGraph provides support to control and manage member access in
-ArangoGraph organizations with
-**System for Cross-domain Identity Management** (SCIM) provisioning.
-This enables you to propagate any user access changes to ArangoGraph by using
-the dedicated API.
-
-{{< info >}}
-To enable the SCIM feature, get in touch with the ArangoGraph team via
-**Request Help**, available in the left sidebar menu of the ArangoGraph Dashboard.
-{{< /info >}}
-
-## About SCIM
-
-[SCIM](https://www.rfc-editor.org/rfc/rfc7644), or the System
-for Cross-domain Identity Management [specification](http://www.simplecloud.info/),
-is an open standard designed to manage user identity information.
-SCIM provides a defined schema for representing users, and a RESTful
-API to run CRUD operations on these user resources.
-
-The SCIM specification expects the following operations so that the SSO system
-can sync the information about user resources in real time:
-
-- `GET /Users` - List all users.
-- `GET /Users/:user_id` - Get details for a given user ID.
-- `POST /Users` - Invite a new user to ArangoGraph.
-- `PUT /Users/:user_id` - Update a given user ID.
-- `DELETE /Users/:user_id` - Delete a specified user ID.
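-
-As an illustration, a `GET /Users` request can be issued with `curl`, using an
-ArangoGraph API key and secret (placeholders below) as Basic Auth credentials
-and the SCIM base URL from the Okta setup described later on this page:
-
-```
-# <api-key-id> and <api-key-secret> are placeholders for an ArangoGraph API key
-curl -u "<api-key-id>:<api-key-secret>" \
-  https://dashboard.arangodb.cloud/api/scim/v1/Users
-```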
-
-ArangoGraph organization administrators can generate an API key for a specific organization.
-The API key consists of a key ID and a key secret. Using this key ID and secret as the
-Basic Authentication credentials (Basic Auth) in SCIM provisioning, you can access the APIs and
-manage the user resources.
-
-To learn how to generate a new API key in the ArangoGraph Dashboard, see the
-[API Keys](../../my-account.md#api-keys) section.
-
-{{< info >}}
-When creating an API key, you must select an organization from the
-list.
-{{< /info >}}
-
-## Enable SCIM provisioning in Okta
-
-To enable SCIM provisioning, you first need to create an SSO integration that
-supports the SCIM provisioning feature.
-
-1. To enable SCIM provisioning for your integration, go to the **General** tab.
-2. In the **App Settings** section, select **Enable SCIM provisioning**.
-3. Navigate to the **Provisioning** tab. The SCIM connection settings are
-   displayed under **Settings > Integration**.
-4. Fill in the following fields:
-   - For **SCIM connector base URL**, use `https://dashboard.arangodb.cloud/api/scim/v1`
-   - For **Unique identifier field for users**, use `userName`
-5. For **Supported provisioning actions**, enable the following:
-   - **Import New Users and Profile Updates**
-   - **Push New Users**
-   - **Push Profile Updates**
-6. From the **Authentication Mode** menu, select the **Basic Auth** option.
-   To authenticate using this mode, you need to provide the username and password
-   for the account that handles the SCIM actions - in this case ArangoGraph.
-7. Go to the ArangoGraph Dashboard and create a new API key ID and Secret.
-
-   ![ArangoGraph Create new API key](../../../../images/arangograph-okta-api-key.png)
-
-   Make sure to select one organization from the list and do not set any
-   value in the **Time to live** field. For more information,
-   see [How to create a new API key](../../my-account.md#how-to-create-a-new-api-key).
-8. Use these authentication tokens as username and password when using the
-   **Basic Auth** mode and click **Save**.
diff --git a/site/content/3.10/arangograph/security-and-access-control/x-509-certificates.md b/site/content/3.10/arangograph/security-and-access-control/x-509-certificates.md
deleted file mode 100644
index 1ef13ef4e0..0000000000
--- a/site/content/3.10/arangograph/security-and-access-control/x-509-certificates.md
+++ /dev/null
@@ -1,179 +0,0 @@
----
-title: X.509 Certificates in ArangoGraph
-menuTitle: X.509 Certificates
-weight: 5
-description: >-
-  X.509 certificates in ArangoGraph are used for encrypted remote administration.
-  The communication with and between the servers of an ArangoGraph deployment is
-  encrypted using the TLS protocol
----
-X.509 certificates are digital certificates that are used to verify the
-authenticity of a website, user, or organization using a public key infrastructure
-(PKI). They are used in various applications, including SSL/TLS encryption,
-which is the basis for HTTPS - the primary protocol for securing communication
-and data transfer over a network.
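-
-For a hands-on look at the fields described next, you can dump any PEM-encoded
-certificate with the `openssl` tool (the file name is a placeholder):
-
-```
-# certificate.pem is a placeholder for any PEM-encoded certificate file
-openssl x509 -in certificate.pem -noout -subject -issuer -dates
-```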
-
-The X.509 certificate format is a standard defined by the
-[International Telecommunication Union (ITU)](https://www.itu.int/en/Pages/default.aspx)
-and contains information such as the name of the certificate holder, the public
-key associated with the certificate, the certificate's issuer, and the
-certificate's expiration date. An X.509 certificate can be signed by a
-certificate authority (CA) or self-signed.
-
-ArangoGraph uses:
-- **well-known X.509 certificates** created by
-  [Let's Encrypt](https://letsencrypt.org/)
-- **self-signed X.509 certificates** created by the ArangoGraph platform
-
-## Certificate chains
-
-A certificate chain, also called the chain of trust, is a hierarchical structure
-that links together a series of digital certificates. The trust in the chain is
-established by verifying the identity of the issuer of each certificate in the
-chain. The root of the chain is a trusted third party, such as a certificate
-authority (CA). The CA issues a certificate to an organization, which in turn
-can issue certificates to servers and other entities.
-
-For example, when you visit a website with an SSL/TLS certificate, the browser
-checks the chain of trust to verify the authenticity of the digital certificate.
-The browser checks to see if the root certificate is trusted, and if it is, it
-trusts the chain of certificates that lead to the end-entity certificate.
-If any of the certificates in the chain are invalid, expired, or revoked, the
-browser does not trust the digital certificate.
-
-## X.509 certificates in ArangoGraph
-
-Each ArangoGraph deployment is accessible on different port numbers:
-- default ports `8529` and `443`
-- high port `18529`
-
-Each ArangoGraph Notebook is accessible on different port numbers:
-- default ports `8840` and `443`
-- high port `18840`
-
-Metrics are accessible on different port numbers:
-- default ports `8829` and `443`
-- high port `18829`
-
-The distinction between these port numbers is in the certificate used for the
-TLS connection.
-
-{{< info >}}
-The default ports (`8529` and `443`) always serve the well-known certificate.
-The [auto login to database UI](../deployments/_index.md#auto-login-to-database-ui)
-feature is only available on the `443` port and is enabled by default.
-{{< /info >}}
-
-### Well-known X.509 certificates
-
-**Well-known X.509 certificates** created by
-[Let's Encrypt](https://letsencrypt.org/) are used on the
-default ports, `8529` and `443`.
-
-This type of certificate has a lifetime of 5 years and is rotated automatically.
-It is recommended to use well-known certificates, as this eases accessing a
-deployment in your browser.
-
-{{< info >}}
-The well-known certificate is a wildcard certificate and cannot contain
-Subject Alternative Names (SANs). To include a SAN field, which is needed
-for private endpoints running on Azure, please use the self-signed certificate
-option.
-{{< /info >}}
-
-### Self-signed X.509 certificates
-
-**Self-signed X.509 certificates** are used on the high ports, such as `18529`.
-This type of certificate has a lifetime of 1 year, and it is created by the
-ArangoGraph platform. It is also rotated automatically before the expiration
-date.
-
-{{< info >}}
-Unless you switch off the **Use well-known certificate** option in the
-certificate generation, both the default and high ports serve the same
-self-signed certificate.
-{{< /info >}}
-
-### Subject Alternative Name (SAN)
-
-The Subject Alternative Name (SAN) is an extension to the X.509 specification
-that allows you to specify additional host names for a single SSL certificate.
-
-When using [private endpoints](../deployments/private-endpoints.md),
-you can specify custom domain names. Note that these are added **only** to
-the self-signed certificate as Subject Alternative Names (SANs).
-
-## How to create a new certificate
-
-1. Click a project name in the **Projects** section of the main navigation.
-2. Click **Security**.
-3. In the **Certificates** section, click:
-   - The **New certificate** button to create a new certificate.
-   - A name or the **eye** icon in the **Actions** column to view a certificate.
-     The dialog that opens provides commands for installing and uninstalling
-     the certificate through a console.
-   - The **pencil** icon to edit a certificate.
-     You can also view a certificate and click the **Edit** button.
-   - The **tag** icon to make the certificate the new default.
-   - The **recycle bin** icon to delete a certificate.
-
-![ArangoGraph Create New Certificate](../../../images/arangograph-new-certificate.png)
-
-## How to install a certificate
-
-Certificates that have the **Use well-known certificate** option enabled do
-not need any installation and are supported by almost all web browsers
-automatically.
-
-When creating a self-signed certificate that has the **Use well-known certificate**
-option disabled, the certificate needs to be installed on your local machine as
-well. This operation varies between operating systems. To install a self-signed
-certificate on your local machine, open the certificate and follow the
-installation instructions.
-
-![ArangoGraph Certificates](../../../images/arangograph-cert-page-with-cert-present.png)
-
-![ArangoGraph Certificate Install Instructions](../../../images/arangograph-cert-install-instructions.png)
-
-You can also extract the information from all certificates in the chain using the
-`openssl` tool.
-
-- For **well-known certificates**, run the following command:
-  ```
-  openssl s_client -showcerts -servername <123456abcdef>.arangodb.cloud -connect <123456abcdef>.arangodb.cloud:8529 </dev/null
-  ```
-- For **self-signed certificates**, run the following command:
-  ```
-  openssl s_client -showcerts -servername <123456abcdef>.arangodb.cloud -connect <123456abcdef>.arangodb.cloud:18529 </dev/null
-  ```
-
-Note that `<123456abcdef>` is a placeholder that needs to be replaced with the
-unique ID that is part of your ArangoGraph deployment endpoint URL.
-
-## How to connect to your application
-
-[ArangoDB drivers](../../develop/drivers/_index.md), also called connectors, allow you to
-easily connect ArangoGraph deployments to your application.
-
-1. Navigate to **Deployments** and click the **View** button to show the
-   deployment page.
-2. In the **Quick start** section, click the **Connecting drivers** button.
-3. Select your programming language, for example Go, Java, or Python.
-4. Follow the examples to connect a driver to your deployment. They include
-   code examples on how to use certificates in your application.
-
-![ArangoGraph Connecting Drivers](../../../images/arangograph-connecting-drivers.png)
-
-## Certificate Rotation
-
-Every certificate has a self-signed root certificate that is going to expire.
-When certificates that are used in existing deployments are about to expire,
-an automatic rotation of the certificates is triggered. This means that the
-certificate is cloned (all existing settings are copied over to a new certificate)
-and all affected deployments then start using the cloned certificate.
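-
-To check when the certificate currently served by a deployment expires, one
-option is to combine `openssl s_client` with `openssl x509`, reusing the
-`<123456abcdef>` placeholder from above (this example targets the high port):
-
-```
-# Replace <123456abcdef> with your deployment's unique ID; 18529 is the high port
-openssl s_client -connect <123456abcdef>.arangodb.cloud:18529 </dev/null 2>/dev/null \
-  | openssl x509 -noout -enddate
-```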
-
-Based on the type of certificate used, you may also need to install the new
-certificate on your local machine. For example, self-signed certificates require
-installation. To prevent any downtime, it is recommended to manually create a
-new certificate and apply the required changes prior to the expiration date.
diff --git a/site/content/3.10/components/arangodb-server/_index.md b/site/content/3.10/components/arangodb-server/_index.md
deleted file mode 100644
index 82da2f3a5f..0000000000
--- a/site/content/3.10/components/arangodb-server/_index.md
+++ /dev/null
@@ -1,21 +0,0 @@
----
-title: ArangoDB Server
-menuTitle: ArangoDB Server
-weight: 170
-description: >-
-  The ArangoDB daemon (arangod) is the central server binary that can run in
-  different modes for a variety of setups like single server and clusters
----
-The ArangoDB server is the core component of ArangoDB. The executable file to
-run it is named `arangod`. The `d` stands for daemon. A daemon is a long-running
-background process that answers requests for services.
-
-The server process serves the various client connections to the server via the
-TCP/HTTP protocol. It also provides a [web interface](../web-interface/_index.md).
-
-_arangod_ can run in different modes for a variety of setups like single server
-and clusters. The available feature set differs between the
-[Community Edition](../../about-arangodb/features/community-edition.md)
-and the [Enterprise Edition](../../about-arangodb/features/enterprise-edition.md).
-
-See [Administration](../../operations/administration/_index.md) for server configuration
-and [Deploy](../../deploy/_index.md) for operation mode details.
diff --git a/site/content/3.10/components/arangodb-server/ldap.md b/site/content/3.10/components/arangodb-server/ldap.md
deleted file mode 100644
index b773edf61e..0000000000
--- a/site/content/3.10/components/arangodb-server/ldap.md
+++ /dev/null
@@ -1,563 +0,0 @@
----
-title: ArangoDB Server LDAP Options
-menuTitle: LDAP
-weight: 10
-description: >-
-  LDAP authentication options in the ArangoDB server
----
-{{< tag "ArangoDB Enterprise Edition" "ArangoGraph" >}}
-
-## Basic Concepts
-
-The basic idea is that one can keep the user authentication setup for
-an ArangoDB instance (single or cluster) outside of ArangoDB in an LDAP
-server. A crucial feature of this is that one can add and withdraw users
-and permissions by only changing the LDAP server and in particular
-without touching the ArangoDB instance. Changes are effective in
-ArangoDB within a few minutes.
-
-Since there are many different possible LDAP setups, we must support a
-variety of possibilities for authentication and authorization. Here is
-a short overview:
-
-To map ArangoDB user names to LDAP users there are two authentication
-methods called "simple" and "search". In the "simple" method the LDAP bind
-user is derived from the ArangoDB user name by prepending a prefix and
-appending a suffix. For example, a user "alice" could be mapped to the
-distinguished name `uid=alice,dc=arangodb,dc=com` to perform the LDAP
-bind and authentication.
-See [Simple authentication method](#simple-authentication-method)
-below for details and configuration options.
-
-In the "search" method there are two phases. In phase 1, a generic
-read-only admin LDAP user account is used to bind to the LDAP server
-first and search for an LDAP user matching the ArangoDB user name. In
-phase 2, the actual authentication is then performed against the LDAP
-user that was found in phase 1. Both methods are sensible and are
-recommended for production use.
-See [Search authentication method](#search-authentication-method)
-below for details and configuration options.
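-
-As a preview of the "simple" method, the mapping of `alice` to
-`uid=alice,dc=arangodb,dc=com` mentioned above corresponds to a prefix/suffix
-configuration along the following lines (the options are described in
-[Simple authentication method](#simple-authentication-method) below):
-
-```
-# Illustrative values matching the "alice" example above
---ldap.enabled=true \
---ldap.server=ldap.arangodb.com \
---ldap.basedn=dc=arangodb,dc=com \
---ldap.prefix=uid= \
---ldap.suffix=,dc=arangodb,dc=com
-```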
-
-Once the user is authenticated, there are two methods for
-authorization: (a) "roles attribute" and (b) "roles search".
-
-In method (a) ArangoDB acquires a list of roles the authenticated LDAP
-user has from the LDAP server. The actual access rights to databases
-and collections for these roles are configured in ArangoDB itself.
-Users effectively have the union of all access rights of all roles
-they have. This method is probably the most common one for production use
-cases. It combines the advantages of managing users and roles outside of
-ArangoDB in the LDAP server with the fine-grained access control within
-ArangoDB for the individual roles. See [Roles attribute](#roles-attribute)
-below for details about method (a) and for the associated configuration
-options.
-
-Method (b) is very similar and only differs from (a) in the way the
-actual list of roles of a user is derived from the LDAP server.
-See [Roles search](#roles-search) below for details about method (b)
-and for the associated configuration options.
-
-## Fundamental options
-
-The fundamental options for specifying how to access the LDAP server are
-the following:
-
- - `--ldap.enabled`: a boolean option which must be set to
   `true` to activate the LDAP feature
 - `--ldap.server`: a string specifying the host name or IP address
   of the LDAP server
 - `--ldap.port`: an integer specifying the port the LDAP server is
   running on. The default is `389`.
 - `--ldap.basedn`: specifies the base distinguished name under which
   the search takes place (can alternatively be set via `--ldap.url`)
 - `--ldap.binddn` and `--ldap.bindpasswd`: the distinguished name and
   password for a read-only LDAP user to which ArangoDB can bind to
   search the LDAP server. Note that it is necessary to configure these
   for both the "simple" and "search" authentication methods, since
   even in the "simple" method, ArangoDB occasionally has to refresh
   the authorization information from the LDAP server,
   even if the user session persists and no new authentication is
   needed. It is, however, allowed to leave both empty, but then the
   LDAP server must be readable with anonymous access.
 - `--ldap.refresh-rate`: a floating point value in seconds. The
   default is 300, which means that ArangoDB refreshes the
   authorization information for authenticated users after at most 5
   minutes. This means that changes in the LDAP server, like removed
   users or added or removed roles for a user, are effective after
   at most 5 minutes.
-
-Note that the `--ldap.server` and `--ldap.port` options can
-alternatively be specified in the `--ldap.url` string together with
-other configuration options. For details see the [LDAP URLs](#ldap-urls)
-section below.
-
-Here is an example of how to configure the connection to the LDAP server,
-with anonymous bind:
-
-```
---ldap.enabled=true \
---ldap.server=ldap.arangodb.com \
---ldap.basedn=dc=arangodb,dc=com
-```
-
-With this configuration ArangoDB binds anonymously to the LDAP server
-on host `ldap.arangodb.com` on the default port 389 and executes all searches
-under the base distinguished name `dc=arangodb,dc=com`.
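-
-To verify that the LDAP server actually permits anonymous searches before
-wiring it into ArangoDB, you can use the OpenLDAP client tools, assuming
-`ldapsearch` is installed (`alice` is a placeholder user name):
-
-```
-# Anonymous simple bind (-x) against the example server from above
-ldapsearch -x -H ldap://ldap.arangodb.com:389 \
-  -b "dc=arangodb,dc=com" "(uid=alice)"
-```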
-
-If a dedicated user is needed to read from the LDAP server, here is an
-example configuration for it:
-
-```
---ldap.enabled=true \
---ldap.server=ldap.arangodb.com \
---ldap.basedn=dc=arangodb,dc=com \
---ldap.binddn=uid=arangoadmin,dc=arangodb,dc=com \
---ldap.bindpasswd=supersecretpassword
-```
-
-The connection is identical but the searches are executed with the
-given distinguished name in `binddn`.
-
-Note that the given user (or the anonymous one) needs at least read access on
-all user objects to find them and, in the case of the "roles search" method,
-also read access on the objects storing the roles.
-
-At this point, ArangoDB can connect to a given LDAP server but is not yet
-able to authenticate users properly with it.
-To do so, pick one of the following two authentication methods.
-
-### LDAP URLs
-
-As an alternative, you can specify the values of multiple LDAP-related
-configuration options with a single LDAP URL. Here is an example:
-
-```
---ldap.url ldap://ldap.arangodb.com:1234/dc=arangodb,dc=com?uid?sub
-```
-
-This one option has the combined effect of setting the following:
-
-```
---ldap.server=ldap.arangodb.com \
---ldap.port=1234 \
---ldap.basedn=dc=arangodb,dc=com \
---ldap.searchAttribute=uid \
---ldap.searchScope=sub
-```
-
-That is, the LDAP URL consists of the LDAP `server` and `port`, a `basedn`, a
-`searchAttribute`, and a `searchScope` which can be one of `base`, `one`, or
-`sub`. There is also the possibility to use the `ldaps` protocol as in:
-
-```
---ldap.url ldaps://ldap.arangodb.com:636/dc=arangodb,dc=com?uid?sub
-```
-
-This does exactly the same as the one above, except that it uses the
-LDAP over TLS protocol. This is a non-standard method which does not
-involve using the STARTTLS protocol. Note that this does not work in the
-Windows version! We suggest using the `ldap` protocol and STARTTLS
-as described in the next section.
-
-### TLS options
-
-{{< warning >}}
-TLS is not supported in the Windows version of ArangoDB!
-{{< /warning >}}
-
-To configure the usage of encrypted TLS to communicate with the LDAP server,
-the following options are available:
-
-- `--ldap.tls`: the main switch to activate TLS. It can either be
-  `true` (use TLS) or `false` (do not use TLS). It is switched
-  off by default. If you switch this on and do not use the `ldaps`
-  protocol via the [LDAP URL](#ldap-urls), then ArangoDB
-  uses the `STARTTLS` protocol to initiate TLS. This is the
-  recommended approach.
-- `--ldap.tls-version`: the minimal TLS version that ArangoDB should accept.
-  Available versions are `1.0`, `1.1` and `1.2`. The default is `1.2`. If
-  your LDAP server does not support version 1.2, you have to change
-  this setting.
-- `--ldap.tls-cert-check-strategy`: the strategy to validate the LDAP server
-  certificate. Available strategies are `never`, `hard`,
-  `demand`, `allow` and `try`. The default is `hard`.
-- `--ldap.tls-cacert-file`: a file path to one or more (concatenated)
-  certificate authority certificates in PEM format.
-  By default, no file path is configured. This certificate
-  is used to validate the server response.
-- `--ldap.tls-cacert-dir`: a directory path to certificate authority certificates in
-  [c_rehash](https://www.openssl.org/docs/man3.0/man1/c_rehash.html)
-  format. By default, no directory path is configured.
-
-Assuming the CA certificate that is given to the server is available at
-`/path/to/certificate.pem`, here is an example of how to configure TLS:
-
-```
---ldap.tls true \
---ldap.tls-cacert-file /path/to/certificate.pem
-```
-
-You can use TLS with any of the following authentication mechanisms.
-
-### Secondary server options (`ldap2`)
-
-The `ldap.*` options configure the primary LDAP server. It is possible to
-configure a secondary server with the `ldap2.*` options, to use it as a
-fail-over in case the primary server is not reachable, but also to
-let the primary server handle some users and the secondary one others.
-
-Instead of `--ldap.