Glue table's schema, partitions, and indexes are not updated after a new file is added to S3 and the crawler is re-run #2322
Answered by jmklix
Sach1nAgarwal asked this question in Q&A
Replies: 3 comments
-
@jmklix Do I need to delete the old table to get the updated table?
-
Answer selected by
jmklix
-
Hello! Reopening this discussion to make it searchable.
-
Suppose an S3 folder contains a set of Parquet files that all share the same metadata. After running a crawler on that folder, AWS Glue shows a single table (with its schema, partitions, and indexes) for all of the files.
Case 1:
Suppose a new .txt file (or a file of some other type, or a Parquet file with different metadata, i.e. different column names) is added to the same S3 folder. After re-running the crawler, nothing changes in the table's schema, partitions, or indexes. This does not seem correct.
Is AWS Glue behaving incorrectly here?
By contrast, if I create a new crawler for the same S3 folder after uploading the .txt file, the new crawler generates n tables, where n is the number of files in the S3 folder. That seems to be the correct behavior, but I don't see the same behavior when re-running the existing crawler as in Case 1.
@jmklix
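To make Case 1 easier to reproduce, here is a minimal sketch (not a fix) of how the existing crawler's settings and the resulting tables could be inspected with boto3. It reads the crawler's `SchemaChangePolicy` and `RecrawlPolicy`, which are the settings that govern what a re-run is allowed to change, and lists the tables in the target database so the output can be compared before and after a re-run. The helper names and the crawler and database names (`my-crawler`, `my-database`) are placeholders I made up for illustration. Whether these settings explain the behavior above is an assumption, not something confirmed in this thread.

```python
from typing import Any, Dict, List


def summarize_crawler_policies(crawler: Dict[str, Any]) -> Dict[str, str]:
    """Pull out the settings that control what a crawler re-run may change.

    `crawler` is the dict found under the "Crawler" key of a
    glue.get_crawler(Name=...) response.
    """
    schema_policy = crawler.get("SchemaChangePolicy", {})
    recrawl_policy = crawler.get("RecrawlPolicy", {})
    return {
        "update_behavior": schema_policy.get("UpdateBehavior", "<not set>"),
        "delete_behavior": schema_policy.get("DeleteBehavior", "<not set>"),
        "recrawl_behavior": recrawl_policy.get("RecrawlBehavior", "<not set>"),
    }


def list_table_names(glue_client: Any, database_name: str) -> List[str]:
    """Return the sorted names of all tables in a Glue database."""
    names: List[str] = []
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        names.extend(table["Name"] for table in page.get("TableList", []))
    return sorted(names)


def inspect(crawler_name: str, database_name: str) -> None:
    """Print the crawler's policies and the tables currently in the database."""
    import boto3  # imported here so the helpers above work without AWS access

    glue = boto3.client("glue")
    crawler = glue.get_crawler(Name=crawler_name)["Crawler"]
    print(summarize_crawler_policies(crawler))
    print(list_table_names(glue, database_name))


# Example with placeholder names (needs AWS credentials and a region):
# inspect("my-crawler", "my-database")
```

Running `inspect` once before and once after re-running the crawler would show whether any table was added or changed, and which update and recrawl behavior the crawler is currently set to.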