apache · sagarlakshmipathy · Sep 30, 2024 · Sep 30, 2024 · Sep 30, 2024 · vinishjail97
diff --git a/website/docs/glue-catalog.md b/website/docs/glue-catalog.md
@@ -99,6 +99,7 @@ From your terminal, create a glue database.
  aws glue create-database --database-input "{\"Name\":\"xtable_synced_db\"}"
  ```
 
+#### Method 1: Using Glue Crawler
 From your terminal, create a glue crawler. Modify the `<yourAccountId>`, `<yourRoleName>` 
 and `<path/to/your/data>`, with appropriate values.
 
@@ -149,6 +150,47 @@ From your terminal, run the glue crawler.
 Once the crawler succeeds, you’ll be able to query this Iceberg table from Athena,
 EMR and/or Redshift query engines.
 
+
+#### Method 2: Using XTable APIs to sync with AWS Glue Data Catalog directly
+This method applies only when the target format is Iceberg.
+
+**Pre-requisites:**
+* Download `iceberg-aws-X.X.X.jar` from the [Maven repository](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws)
+* Download `bundle-X.X.X.jar` from the [Maven repository](https://mvnrepository.com/artifact/software.amazon.awssdk/bundle)
+
+Create a `glue-sync-config.yaml` file:
+
+```yaml md title="yaml"
+sourceFormat: HUDI|DELTA # choose only one
+targetFormats:
+   - ICEBERG
+datasets:
+   -
+      tableBasePath: s3://path/to/source/data
+      tableName: table_name
+      partitionSpec: partitionpath:VALUE
+      namespace: xtable_synced_db
+```
+
+Create a `glue-sync-catalog.yaml` file:
+
+```yaml md title="yaml"
+catalogImpl: org.apache.iceberg.aws.glue.GlueCatalog
+catalogName: <catalog_name>
+catalogOptions:
+   io-impl: org.apache.iceberg.aws.s3.S3FileIO
+   warehouse: s3://path/to/source
+```
+
+Sample command to sync the table with Glue Data Catalog:
+
+```shell md title="shell"
+java -cp /path/to/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar:/path/to/iceberg-aws-1.3.1.jar:/path/to/bundle-2.23.9.jar org.apache.xtable.utilities.RunSync  --datasetConfig glue-sync-config.yaml --icebergCatalogConfig glue-sync-catalog.yaml
+```
+
+### Validating the results
+Once the sync is complete (or, if you used the Glue Crawler method, once the crawler succeeds), you can inspect the catalogued tables in Glue
+and query them in Amazon Athena as shown below:
+
 <Tabs
 groupId="table-format"
 defaultValue="hudi"
@@ -169,20 +211,14 @@ supports Hudi version 0.14.0 as mentioned [here](/docs/features-and-limitations#
 </TabItem>
 <TabItem value="delta">
 
-### Validating the results
-After the crawler runs successfully, you can inspect the catalogued tables in Glue
-and also query the table in Amazon Athena like below:
-
 ```sql
 SELECT * FROM xtable_synced_db.<table_name>;
 ```
 
 </TabItem>
 <TabItem value="iceberg">
 
-### Validating the results
-After the crawler runs successfully, you can inspect the catalogued tables in Glue
-and also query the table in Amazon Athena like below:
+
 
 ```sql
 SELECT * FROM xtable_synced_db.<table_name>;
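
The Method 2 sync command added to `glue-catalog.md` above puts three jars on one long classpath. A minimal sketch of how that command could be assembled from variables is shown here. The variable names (`JAR_DIR`, `XTABLE_JAR`, `ICEBERG_AWS_JAR`, `AWS_BUNDLE_JAR`, `CLASSPATH_ARG`, `SYNC_CMD`) are illustrative assumptions and not part of XTable; the jar versions, main class, and flags are copied from the sample command. The script only prints the command so it can be reviewed before running.

```shell
# Assumed download location; replace with the directory that holds your jars.
JAR_DIR="/path/to"

# Jar names and versions are taken from the sample command in the docs change.
XTABLE_JAR="$JAR_DIR/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar"
ICEBERG_AWS_JAR="$JAR_DIR/iceberg-aws-1.3.1.jar"
AWS_BUNDLE_JAR="$JAR_DIR/bundle-2.23.9.jar"

# Join the jars with ':' (the Java classpath separator on Linux and macOS).
CLASSPATH_ARG="$XTABLE_JAR:$ICEBERG_AWS_JAR:$AWS_BUNDLE_JAR"

# Assemble the full command as a string; this is a dry run, nothing is executed.
SYNC_CMD="java -cp $CLASSPATH_ARG org.apache.xtable.utilities.RunSync --datasetConfig glue-sync-config.yaml --icebergCatalogConfig glue-sync-catalog.yaml"

echo "$SYNC_CMD"
```

Copying the printed line into the terminal runs the sync. Keeping each jar in its own variable makes a version bump a one-line change.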

diff --git a/website/docs/snowflake.md b/website/docs/snowflake.md
@@ -8,11 +8,6 @@ title: "Snowflake"
 Currently, Snowflake supports [Iceberg tables through External Tables](https://www.snowflake.com/blog/expanding-the-data-cloud-with-apache-iceberg/)
 and also [Native Iceberg Tables](https://www.snowflake.com/blog/iceberg-tables-powering-open-standards-with-snowflake-innovations/).
 
-:::note NOTE:
-Iceberg on Snowflake is currently supported in
-[public preview](https://www.snowflake.com/blog/build-open-data-lakehouse-iceberg-tables/)
-:::
-
 ## Steps:
 These are high level steps to help you integrate Apache XTable™ (Incubating) synced Iceberg tables on Snowflake. For more additional information
 refer to the [Getting started with Iceberg tables](https://docs.snowflake.com/LIMITEDACCESS/iceberg-2023/tables-iceberg-getting-started).
@@ -47,7 +42,7 @@ TABLE_FORMAT=ICEBERG
 ENABLED=TRUE;
 ```
 
-### Create an Iceberg table from Iceberg metadata in object storage
+### Method 1: Create an Iceberg table from Iceberg metadata in object storage
 Refer to additional [examples](https://docs.snowflake.com/LIMITEDACCESS/iceberg-2023/create-iceberg-table#examples) 
 in the Snowflake Create Iceberg Table guide for more information.
 
@@ -58,4 +53,45 @@ CATALOG=<catalog_name>
 METADATA_FILE_PATH='path/to/metadata/<VERSION>.metadata.json';
 ```
 
-Once the table creation succeeds you can start using the Iceberg table as any other table in Snowflake.
+Once the table creation succeeds, you can start using the Iceberg table like any other table in Snowflake.
+
+### Method 2: Using XTable APIs to sync with Snowflake Catalog directly
+
+#### Pre-requisites:
+
+* Build Apache XTable™ (Incubating) from [source](https://github.com/apache/incubator-xtable)
+* Download `iceberg-aws-X.X.X.jar` from the [Maven repository](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws)
+* Download `bundle-X.X.X.jar` from the [Maven repository](https://mvnrepository.com/artifact/software.amazon.awssdk/bundle)
+* Download `iceberg-spark-runtime-3.X_2.12-X.X.X.jar` from [here](https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-spark-runtime-3.2_2.12/1.4.2/)
+* Download `snowflake-jdbc-X.X.X.jar` from the [Maven repository](https://mvnrepository.com/artifact/net.snowflake/snowflake-jdbc)
+
+Create a `snowflake-sync-config.yaml` file:
+
+```yaml md title="yaml"
+sourceFormat: DELTA
+targetFormats:
+  - ICEBERG
+datasets:
+  -
+    tableBasePath: s3://path/to/table
+    tableName: <table_name>
+    namespace: <db_name>.<schema_name>
+```
+
+Create a `snowflake-sync-catalog.yaml` file:
+
+```yaml md title="yaml"
+catalogImpl: org.apache.iceberg.snowflake.SnowflakeCatalog
+catalogName: <catalog_name>
+catalogOptions:
+  io-impl: org.apache.iceberg.aws.s3.S3FileIO
+  warehouse: s3://path/to/table
+  uri: jdbc:snowflake://<account-identifier>.snowflakecomputing.com
+  jdbc.user: <snowflake-username>
+  jdbc.password: <snowflake-password>
+```
+
+Sample command to sync the table with Snowflake:
+```shell md title="shell"
+java -cp /path/to/iceberg-spark-runtime-3.2_2.12-1.4.2.jar:/path/to/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar:/path/to/snowflake-jdbc-3.13.28.jar:/path/to/iceberg-aws-1.4.2.jar:/path/to/bundle-2.23.9.jar org.apache.xtable.utilities.RunSync --datasetConfig snowflake-sync-config.yaml --icebergCatalogConfig snowflake-sync-catalog.yaml
+```
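
The Snowflake method above depends on five jars being present on the classpath, and a missing jar typically only shows up as a class-loading error once `RunSync` starts. As a rough sketch, a pre-flight check along these lines could confirm the files exist first. The helper name `check_jars` and the `JAR_DIR` variable are hypothetical and not part of XTable; the jar file names are taken from the sample command.

```shell
# Print each argument that is not a readable file.
# Returns 0 when every jar is present, 1 when at least one is missing.
check_jars() {
  missing=0
  for jar in "$@"; do
    if [ ! -r "$jar" ]; then
      echo "missing: $jar"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed download location; replace with the directory that holds your jars.
JAR_DIR="/path/to"

if check_jars \
  "$JAR_DIR/iceberg-spark-runtime-3.2_2.12-1.4.2.jar" \
  "$JAR_DIR/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar" \
  "$JAR_DIR/snowflake-jdbc-3.13.28.jar" \
  "$JAR_DIR/iceberg-aws-1.4.2.jar" \
  "$JAR_DIR/bundle-2.23.9.jar"; then
  echo "all jars found"
fi
```

The check only verifies that the files are readable; it does not confirm that the jar versions are compatible with each other.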