Commit 8275f70

Starts databricks connect
1 parent 42fa955 commit 8275f70

1 file changed: +22 -52 lines changed

assets/slides/units/databricks-connect.qmd

+22-52
@@ -22,7 +22,7 @@ Databricks <br/> Connect
 :::{.columns}
 :::{.column width="50%"}
 
-:::{.custom-subtitle}
+:::{.custom-smaller}
 :::{.incremental1}
 - Spark Connect, offers **true** remote connectivity
 - Uses **gRPC** to as the communication interface
@@ -37,22 +37,10 @@ Databricks <br/> Connect
 
 ![](assets/databricks-connect/grpc){.absolute top="264" left="900" width="670"}
 
-
 ## [Databricks Connect]{style="color:#666;"} {background-image="assets/background/slide-light.svg" background-size="1700px" background-color="white"}
 
-:::{.columns}
-:::{.column width="37%"}
-
-:::{.custom-subtitle}
-:::{.incremental1}
-- `databricks-connect` integrates with gRPC, by wrapping `pyspark`
-- `pyspark` is the most developed **Spark Connect** interface
-:::
-:::
-
-:::
-:::{.column width="60%"}
-:::
+:::{.custom-subtitle .custom-smaller}
+`databricks-connect` integrates with gRPC via `pyspark`
 :::
 
 ![](assets/databricks-connect/python.png){.absolute top="264" left="572" width="998"}
@@ -61,29 +49,23 @@ Databricks <br/> Connect
 
 ![](assets/posit-databricks.png){.absolute top="-10" left="1430" width="180"}
 
-:::{.columns}
-:::{.column width="4%"}
-:::
-:::{.column width="96%"}
-[`sparklyr` integrates with `databricks-connect` via `reticulate`]{style="font-size:54px;line-height:1;font-weight:400;color:#666;"}
-:::
+:::{.custom-subtitle}
+`sparklyr` integrates with `databricks-connect` via `reticulate`
 :::
 
-
-
 ![](assets/databricks-connect/db-connect.png){.absolute top="200" left="70" width="1500"}
 
 ## [Why not just use 'reticulate'?]{style="color:#666;"} {background-image="assets/background/slide-light.svg" background-size="1700px" background-color="white"}
 
-![](assets/posit-databricks.png){.absolute top="-10" left="1430" width="180"}
-
-[**sparklyr** extends functionality and user experience:]{style="font-size:65px;line-height:1;font-weight:400;color:#666;"}
-
+:::{.custom-subtitle}
+`sparklyr` extends functionality and user experience
+:::
 
 :::{.columns}
-:::{.column width="45%"}
-
-:::{.custom-subtitle}
+:::{.column width="20%"}
+:::
+:::{.column width="70%"}
+:::{.custom-smaller}
 :::{.incremental1}
 - `dplyr` back-end
 - `DBI` back-end
@@ -92,34 +74,19 @@ Databricks <br/> Connect
 :::
 :::
 
-:::
-:::{.column width="55%"}
-:::{.code-slim-35}
-```r
-library(sparklyr)
-sc <- spark_connect(method = "databricks_connect")
-
-trips <- tbl(sc, I("samples.nyctaxi.trips"))
-
-trips |>
-group_by(pickup_zip) |>
-summarise(
-count = n(),
-avg_distance = mean(trip_distance)
-)
-```
-:::
 :::
 :::
 
+![](assets/posit-databricks.png){.absolute top="-10" left="1430" width="180"}
+
 ## [Getting started]{style="color:#666;"} {background-image="assets/background/slide-light.svg" background-size="1700px" background-color="white"}
 
 ![](assets/posit-databricks.png){.absolute top="-10" left="1430" width="180"}
 
 :::{.columns}
 :::{.column width="42%"}
 
-:::{.custom-subtitle}
+:::{.custom-smaller}
 :::{.incremental1}
 - Python 3.10+
 - A Python environment with `databricks-connect` and its dependencies
@@ -129,14 +96,17 @@ trips |>
 
 :::
 :::{.column width="58%"}
-:::{.code-slim-35}
+:::{.custom-smaller}
+<br/>
 ```r
 install.packages("pysparklyr")
+
 library(sparklyr)
+
 sc <- spark_connect(
-cluster_id = "1026-175310-7cpsh3g8",
-method = "databricks_connect"
-)
+cluster_id = "[cluster's id]",
+method = "databricks_connect"
+)
 ```
 :::
 :::
