Commit 2185551

[04b6IYYB] Fix sampling documentation for apoc.meta.* procs (#3442)
1 parent d4d4fbd commit 2185551

13 files changed (+118, -53 lines changed)

docs/asciidoc/modules/ROOT/pages/overview/apoc.meta/apoc.meta.data.adoc

+8-1
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,14 @@ apoc.meta.data(config = {} :: MAP?) :: (label :: STRING?, property :: STRING?, c
2525
|===
2626

2727
== Config parameters
28-
include::partial$usage/config/apoc.meta.data.adoc[]
28+
This procedure supports the following config parameters:
29+
30+
.Config parameters
31+
[opts=header]
32+
|===
33+
| Name | Type | Default | Description
34+
| sample | Long | 1000 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node.
35+
|===
2936

3037
== Output parameters
3138
[.procedures, opts=header]

docs/asciidoc/modules/ROOT/pages/overview/apoc.meta/apoc.meta.data.of.adoc

+10-2
@@ -25,8 +25,16 @@ apoc.meta.data.of(graph :: ANY?, config = {} :: MAP?) :: (label :: STRING?, prop
2525
|config|MAP?|{}
2626
|===
2727

28-
== Config parameters
29-
include::partial$usage/config/apoc.meta.data.of.adoc[]
28+
== Config parameters
29+
This procedure supports the following config parameters:
30+
31+
.Config parameters
32+
[opts=header]
33+
|===
34+
| Name | Type | Default | Description
35+
| sample | Long | 1000 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node.
36+
|===
37+
3038

3139
== Output parameters
3240
[.procedures, opts=header]

docs/asciidoc/modules/ROOT/pages/overview/apoc.meta/apoc.meta.graph.adoc

+4
@@ -35,6 +35,10 @@ include::partial$usage/config/apoc.meta.graph.adoc[]
3535
|relationships|LIST? OF RELATIONSHIP?
3636
|===
3737

38+
[[sampling-apoc.meta.graph]]
39+
== Sampling
40+
include::partial$usage/apoc.meta.samplingDesc.adoc[]
41+
3842
[[usage-apoc.meta.graph]]
3943
== Usage Examples
4044
include::partial$usage/apoc.meta.graph.adoc[]

docs/asciidoc/modules/ROOT/pages/overview/apoc.meta/apoc.meta.graph.of.adoc

+5-1
@@ -26,7 +26,7 @@ apoc.meta.graph.of(graph = {} :: ANY?, config = {} :: MAP?) :: (nodes :: LIST? O
2626
|===
2727

2828
== Config parameters
29-
include::partial$usage/config/apoc.meta.graph.of.adoc[]
29+
include::partial$usage/config/apoc.meta.graph.adoc[]
3030

3131
== Output parameters
3232
[.procedures, opts=header]
@@ -36,6 +36,10 @@ include::partial$usage/config/apoc.meta.graph.of.adoc[]
3636
|relationships|LIST? OF RELATIONSHIP?
3737
|===
3838

39+
[[sampling-apoc.meta.graph.of]]
40+
== Sampling
41+
include::partial$usage/apoc.meta.samplingDesc.adoc[]
42+
3943
[[usage-apoc.meta.graph.of]]
4044
== Usage Examples
4145
include::partial$usage/apoc.meta.graph.of.adoc[]

docs/asciidoc/modules/ROOT/pages/overview/apoc.meta/apoc.meta.schema.adoc

+8-1
@@ -25,7 +25,14 @@ apoc.meta.schema(config = {} :: MAP?) :: (value :: MAP?)
2525
|===
2626

2727
== Config parameters
28-
include::partial$usage/config/apoc.meta.schema.adoc[]
28+
This procedure supports the following config parameters:
29+
30+
.Config parameters
31+
[opts=header]
32+
|===
33+
| Name | Type | Default | Description
34+
| sample | Long | 1000 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node.
35+
|===
2936

3037
== Output parameters
3138
[.procedures, opts=header]

docs/asciidoc/modules/ROOT/pages/overview/apoc.meta/apoc.meta.subGraph.adoc

+5-1
@@ -25,7 +25,7 @@ apoc.meta.subGraph(config :: MAP?) :: (nodes :: LIST? OF NODE?, relationships ::
2525
|===
2626

2727
== Config parameters
28-
include::partial$usage/config/apoc.meta.subGraph.adoc[]
28+
include::partial$usage/config/apoc.meta.graph.adoc[]
2929

3030
== Output parameters
3131
[.procedures, opts=header]
@@ -35,6 +35,10 @@ include::partial$usage/config/apoc.meta.subGraph.adoc[]
3535
|relationships|LIST? OF RELATIONSHIP?
3636
|===
3737

38+
[[sampling-apoc.meta.subGraph]]
39+
== Sampling
40+
include::partial$usage/apoc.meta.samplingDesc.adoc[]
41+
3842
[[usage-apoc.meta.subGraph]]
3943
== Usage Examples
4044
include::partial$usage/apoc.meta.subGraph.adoc[]

docs/asciidoc/modules/ROOT/partials/usage/apoc.meta.nodes.count.adoc

+4-4
@@ -22,11 +22,11 @@ RETURN apoc.meta.nodes.count(['MyCountLabel', 'ThirdLabel']) AS count;
2222
|===
2323

2424

25-
We can return all nodes with a label `MyCountLabel` and a relationship `MY_COUNT_REL` through the config param `rel`
25+
We can count all nodes with the label `MyCountLabel` and a relationship `MY_COUNT_REL` through the config param `includeRels`:
2626

2727
[source,cypher]
2828
----
29-
RETURN apoc.meta.nodes.count(['MyCountLabel'], {rels: ['MY_COUNT_REL']}) AS count;
29+
RETURN apoc.meta.nodes.count(['MyCountLabel'], {includeRels: ['MY_COUNT_REL']}) AS count;
3030
----
3131

3232
.Results
@@ -40,7 +40,7 @@ Moreover, we can return all nodes with a `outcome` relationship `MY_COUNT_REL` (
4040

4141
[source,cypher]
4242
----
43-
RETURN apoc.meta.nodes.count(['MyCountLabel'], {rels: ['MY_COUNT_REL>']}) AS count;
43+
RETURN apoc.meta.nodes.count(['MyCountLabel'], {includeRels: ['MY_COUNT_REL>']}) AS count;
4444
----
4545

4646
.Results
@@ -54,7 +54,7 @@ otherwise with an `incoming` relationship `MY_COUNT_REL` (with the suffix `<`):
5454

5555
[source,cypher]
5656
----
57-
RETURN apoc.meta.nodes.count(['MyCountLabel'], {rels: ['MY_COUNT_REL<']}) AS count;
57+
RETURN apoc.meta.nodes.count(['MyCountLabel'], {includeRels: ['MY_COUNT_REL<']}) AS count;
5858
----
5959

6060
.Results
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
This procedure works by using the database statistics. A new node is returned for each label, and its connecting
2+
relationships are calculated based on the pairing combinations of [:R]->(:N) and (:M)-[:R]. For example, for the graph
3+
(:A)-[:R]->(:B)-[:R]->(:C), the path (:B)-[:R]->(:B) will be calculated from the combination of [:R]->(:B) and (:B)-[:R], even though no such path exists in the graph.
4+
This procedure will post-process the data by default, removing all non-existing relationships.
5+
This is done by scanning the nodes and their relationships.
6+
If the relationship is not found, it is removed from the final result.
7+
This slows down the procedure, but will produce an accurate schema.
8+
9+
See xref::overview/apoc.meta/apoc.meta.graphSample.adoc[apoc.meta.graphSample] to avoid performing any post-processing.
10+
11+
It is also possible to specify how many nodes and relationships to scan. The config parameter `sample` gives the skip count,
12+
and the `maxRels` parameter gives the max number of relationships that will be checked per node.
13+
If `sample` is set to 100, this means that every 100th node will be checked per label,
14+
and a value of 100 for `maxRels` means that for each node read, only the first 100 relationships will be read.
15+
Note that if these values are set, and the relationship is not found within those constraints,
16+
it is assumed that the relationship does not exist, and this may result in false negatives.
17+
18+
A `sample` value higher than the number of nodes for that label will result in one node being checked.
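
The two parameters described above could be passed as in the following minimal sketch. The values are illustrative only and are not recommendations. As noted above, constraints this coarse may report an existing relationship as missing.

[source,cypher]
----
// Check every 100th node per label, and read at most the
// first 100 relationships of each checked node.
CALL apoc.meta.graph({sample: 100, maxRels: 100})
YIELD nodes, relationships
RETURN nodes, relationships;
----

Lowering `sample` or raising `maxRels` scans more of the graph, which takes longer but makes false negatives less likely.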
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,22 @@
1-
The procedure support the following config parameters:
1+
This procedure supports the following config parameters:
22

33
.Config parameters
44
[opts=header]
55
|===
6-
| name | type | default | description
7-
| sample | Long | 1000 | number of nodes to sample per label. See "Sampling" section below.
6+
| Name | Type | Default | Description
7+
| includeLabels | List<String> | [] | Node labels to include. Default is to include all node labels.
8+
| includeRels | List<String> | [] | Relationship types to include. Default is to include all relationship types.
9+
| excludeLabels | List<String> | [] | Node labels to exclude. Default is to include all node labels.
10+
| excludeRels | List<String> | [] | Relationship types to exclude. Default is to include all relationship types.
11+
| sample | Long | 1000 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node.
12+
| maxRels | Long | 100 | Number of relationships to read per sampled node.
813
|===
914

10-
include::partial$usage/config/sample.config.adoc[]
15+
.Deprecated parameters
16+
[opts=header]
17+
|===
18+
| Name | Type | Default | Description
19+
| labels | List<String> | [] | Deprecated, use `includeLabels`.
20+
| rels | List<String> | [] | Deprecated, use `includeRels`.
21+
| excludes | List<String> | [] | Deprecated, use `excludeLabels`.
22+
|===
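
As a hedged illustration of the filter parameters in the table above, and assuming this partial is the `apoc.meta.graph` config that the pages above include, a call might look like the sketch below. The labels `Person` and `Movie` and the relationship type `REVIEWED` are hypothetical and do not come from this commit.

[source,cypher]
----
// Restrict the meta graph to two (hypothetical) labels and
// leave out one (hypothetical) relationship type.
CALL apoc.meta.graph({
  includeLabels: ['Person', 'Movie'],
  excludeRels: ['REVIEWED']
})
YIELD nodes, relationships
RETURN nodes, relationships;
----

The deprecated `labels`, `rels`, and `excludes` keys are avoided here in favor of their listed replacements.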
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,9 @@
1-
The procedure support the following config parameters:
1+
This procedure supports the following config parameters:
22

33
.Config parameters
44
[opts=header]
55
|===
6-
| name | type | default | description
7-
| sample | Long | 1000 | number of nodes to sample per label. See "Sampling" section below.
8-
| maxRels | Long | 100 | number of relationships to be analyzed, by type of relationship and start and end label, in order to remove / add relationships incorrectly inserted / not inserted by the sample result.
9-
|===
10-
11-
include::partial$usage/config/sample.config.adoc[]
6+
| Name | Type | Default | Description
7+
| sample | Long | 1 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node. Defaults to read every node.
8+
| maxRels | Long | -1 | Number of relationships to read per sampled node. A value of -1 will read all.
9+
|===
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,22 @@
1-
The procedure support the following config parameters:
1+
This procedure supports the following config parameters:
22

33
.Config parameters
44
[opts=header]
55
|===
6-
| name | type | default | description
7-
| includeLabels | List<String> | [] | labels to include. Default is to include all labels
8-
| includeRels | List<String> | [] | relationship types to include. Default is to include all relationship types
9-
| excludeLabels | List<String> | [] | labels to exclude. Default is to not exclude any label
10-
| excludeRels | List<String> | [] | relationship types to exclude. Default is to not exclude any relationship type
11-
| sample | Long | 1000 | number of nodes to sample per label. See "Sampling" section below.
12-
| maxRels | Long | 100 | number of relationships to sample per relationship type
6+
| Name | Type | Default | Description
7+
| includeLabels | List<String> | [] | Node labels to include. Default is to include all node labels.
8+
| includeRels | List<String> | [] | Relationship types to include. Default is to include all relationship types.
9+
| excludeLabels | List<String> | [] | Node labels to exclude. Default is to include all node labels.
10+
| excludeRels | List<String> | [] | Relationship types to exclude. Default is to include all relationship types.
11+
| sample | Long | 1000 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node.
12+
| maxRels | Long | 100 | Number of relationships to read per sampled node.
1313
|===
1414

15-
include::partial$usage/config/sample.config.adoc[]
16-
1715
.Deprecated parameters
1816
[opts=header]
1917
|===
20-
| name | type | default | description
21-
| labels | List<String> | [] | deprecated, use `includeLabels`
22-
| rels | List<String> | [] | deprecated, use `includeRels`
23-
| excludes | List<String> | [] | deprecated, use `excludeLabels`
18+
| Name | Type | Default | Description
19+
| labels | List<String> | [] | Deprecated, use `includeLabels`.
20+
| rels | List<String> | [] | Deprecated, use `includeRels`.
21+
| excludes | List<String> | [] | Deprecated, use `excludeLabels`.
2422
|===
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,16 @@
1-
The procedure support the following config parameters:
1+
This procedure supports the following config parameters:
22

33
.Config parameters
44
[opts=header, cols="1,1,1,5"]
55
|===
6-
| name | type | default | description
7-
| rels | Set<String> | `EmptySet` | The rel types to consider in the count.
8-
We can add to the suffix `>` or `<` to the rel type name to indicate an outgoing or incoming relationship.
6+
| Name | Type | Default | Description
7+
| includeRels | List<String> | [] | Relationship types to include. Default is to include all relationship types.
8+
Add the suffix `>` or `<` to the relationship type name to indicate an outgoing or incoming relationship.
9+
|===
10+
11+
.Deprecated parameters
12+
[opts=header]
13+
|===
14+
| Name | Type | Default | Description
15+
| rels | List<String> | [] | Deprecated, use `includeRels`.
916
|===
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,22 @@
1-
The procedure support the following config parameters:
1+
This procedure supports the following config parameters:
22

33
.Config parameters
44
[opts=header]
55
|===
6-
| name | type | default | description
7-
| includeLabels | List<String> | [] | labels to include. Default is to include all labels
8-
| includeRels | List<String> | [] | relationship types to include. Default is to include all relationship types
9-
| excludeLabels | List<String> | [] | labels to exclude. Default is to not exclude any label
10-
| excludeRels | List<String> | [] | relationship types to exclude. Default is to not exclude any relationship type
11-
| sample | Long | 1000 | number of nodes to sample per label. See "Sampling" section below.
12-
| maxRels | Long | 100 | number of relationships to sample per relationship type
6+
| Name | Type | Default | Description
7+
| includeLabels | List<String> | [] | Node labels to include. Default is to include all node labels.
8+
| includeRels | List<String> | [] | Relationship types to include. Default is to include all relationship types.
9+
| excludeLabels | List<String> | [] | Node labels to exclude. Default is to include all node labels.
10+
| excludeRels | List<String> | [] | Relationship types to exclude. Default is to include all relationship types.
11+
| sample | Long | 1000 | Number of nodes to skip, e.g. a sample of 1000 will read every 1000th node.
12+
| maxRels | Long | 100 | Number of relationships to read per sampled node.
1313
|===
1414

15-
include::partial$usage/config/sample.config.adoc[]
16-
1715
.Deprecated parameters
1816
[opts=header]
1917
|===
20-
| name | type | default | description
21-
| labels | List<String> | [] | deprecated, use `includeLabels`
22-
| rels | List<String> | [] | deprecated, use `includeRels`
23-
| excludes | List<String> | [] | deprecated, use `excludeLabels`
18+
| Name | Type | Default | Description
19+
| labels | List<String> | [] | Deprecated, use `includeLabels`.
20+
| rels | List<String> | [] | Deprecated, use `includeRels`.
21+
| excludes | List<String> | [] | Deprecated, use `excludeLabels`.
2422
|===
