139: update circle ci to test spark 3 #145

Closed
wants to merge 1 commit into from

Conversation

chinwobble

resolves #139

Description

Update CircleCI to also test Spark 3.
The intention is to add integration tests to make sure that CSV seeding works with both Spark 2.4.x and Spark 3.
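
As a rough illustration of the intent, a CircleCI job pinned to a Spark 3 image might look like the sketch below. This is a minimal, hypothetical example: the job name, the primary Python image, and the tox command are assumptions for illustration, not the actual config in this PR; only the godatadriven/spark:3.0.0 image name comes from this thread.

    # Hypothetical sketch only; not the actual config added by this PR.
    version: 2
    jobs:
      integration-spark3-thrift:
        docker:
          - image: cimg/python:3.8            # primary container that runs dbt and the tests (assumed)
          - image: godatadriven/spark:3.0.0   # Spark 3 image mentioned later in this thread
            # the image may need an explicit command to start the Thrift server; omitted here
        steps:
          - checkout
          - run:
              name: Run integration tests against Spark 3
              command: tox -e integration-spark-thrift   # assumed test entry point

The existing Spark 2.4.x job would follow the same shape, with the second image swapped for a 2.4.x tag.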

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot

cla-bot bot commented Jan 20, 2021

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Benney Au.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check whether your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using: git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails
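
For convenience, the two commands from the steps above in one place (substitute your own GitHub email address for the example):

    git config --list | grep email                      # 1. check which email the git client is configured with
    git config --global user.email email@example.com    # 2. set it if it is missing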

@cla-bot

cla-bot bot commented Jan 24, 2021

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin.

CLA has not been signed by users: @chinwobble

@jtcohen6
Contributor

jtcohen6 commented Feb 2, 2021

I'm not exactly sure what's going on to cause the CI timeout, but it must be related to these error messages, copied from container godatadriven/spark:3.0.0:

21/02/02 16:57:44 INFO SparkExecuteStatementOperation: Submitting query '/* {"app": "dbt", "dbt_version": "0.18.1", "profile_name": "dbt-pytest", "target_name": "default", "node_id": "seed.dbt_test_project.base"} */

    create table analytics_210202165735162780640769.base (id bigint,name string,some_date timestamp)
    
    
    
    
    
  ' with d872ff15-cb78-45c9-a376-b3a4314e6bfe
21/02/02 16:57:44 INFO SparkExecuteStatementOperation: Running query with d872ff15-cb78-45c9-a376-b3a4314e6bfe
21/02/02 16:57:44 INFO HiveMetaStore: 9: get_database: analytics_210202165735162780640769
21/02/02 16:57:44 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_database: analytics_210202165735162780640769	
21/02/02 16:57:45 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
21/02/02 16:57:45 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
21/02/02 16:57:45 INFO HiveMetaStore: 9: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
21/02/02 16:57:45 INFO ObjectStore: ObjectStore, initialize called
21/02/02 16:57:45 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is POSTGRES
21/02/02 16:57:45 INFO ObjectStore: Initialized ObjectStore
21/02/02 16:57:45 INFO HiveMetaStore: 9: get_table : db=analytics_210202165735162780640769 tbl=base
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_table : db=analytics_210202165735162780640769 tbl=base	
21/02/02 16:57:45 INFO HiveMetaStore: 9: get_database: analytics_210202165735162780640769
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_database: analytics_210202165735162780640769	
21/02/02 16:57:45 INFO HiveMetaStore: 9: get_database: analytics_210202165735162780640769
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_database: analytics_210202165735162780640769	
21/02/02 16:57:45 INFO HiveMetaStore: 9: get_database: analytics_210202165735162780640769
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_database: analytics_210202165735162780640769	
21/02/02 16:57:45 INFO HiveMetaStore: 9: get_table : db=analytics_210202165735162780640769 tbl=base
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_table : db=analytics_210202165735162780640769 tbl=base	
21/02/02 16:57:45 INFO HiveMetaStore: 9: get_database: analytics_210202165735162780640769
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=get_database: analytics_210202165735162780640769	
21/02/02 16:57:45 WARN ShellBasedUnixGroupsMapping: got exception trying to get groups for user dbt: id: ‘dbt’: no such user
id: ‘dbt’: no such user

21/02/02 16:57:45 INFO SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=8331d230-2f89-40b1-8003-565d3fe21aac, clientType=HIVECLI]
21/02/02 16:57:45 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
21/02/02 16:57:45 INFO metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
21/02/02 16:57:45 INFO HiveMetaStore: 9: Cleaning up thread local RawStore...
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=Cleaning up thread local RawStore...	
21/02/02 16:57:45 INFO HiveMetaStore: 9: Done cleaning up thread local RawStore
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=Done cleaning up thread local RawStore	
21/02/02 16:57:45 INFO HiveMetaStore: 9: create_table: Table(tableName:base, dbName:analytics_210202165735162780640769, owner:root, createTime:1612285064, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:bigint, comment:null), FieldSchema(name:name, type:string, comment:null), FieldSchema(name:some_date, type:timestamp, comment:null)], location:file:/spark-warehouse/analytics_210202165735162780640769.db/base, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"id","type":"long","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}},{"name":"some_date","type":"timestamp","nullable":true,"metadata":{}}]}, spark.sql.sources.schema.numParts=1, spark.sql.create.version=3.0.0}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{dbt=[PrivilegeGrantInfo(privilege:INSERT, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:SELECT, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:UPDATE, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:DELETE, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true)]}, groupPrivileges:null, rolePrivileges:null))
21/02/02 16:57:45 INFO audit: ugi=dbt	ip=unknown-ip-addr	cmd=create_table: Table(tableName:base, dbName:analytics_210202165735162780640769, owner:root, createTime:1612285064, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:bigint, comment:null), FieldSchema(name:name, type:string, comment:null), FieldSchema(name:some_date, type:timestamp, comment:null)], location:file:/spark-warehouse/analytics_210202165735162780640769.db/base, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"id","type":"long","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}},{"name":"some_date","type":"timestamp","nullable":true,"metadata":{}}]}, spark.sql.sources.schema.numParts=1, spark.sql.create.version=3.0.0}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{dbt=[PrivilegeGrantInfo(privilege:INSERT, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:SELECT, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:UPDATE, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:DELETE, createTime:-1, grantor:dbt, grantorType:USER, grantOption:true)]}, groupPrivileges:null, rolePrivileges:null))	
21/02/02 16:57:45 WARN HiveConf: HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
21/02/02 16:57:45 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
21/02/02 16:57:45 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
21/02/02 16:57:45 INFO HiveMetaStore: 9: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
21/02/02 16:57:45 INFO ObjectStore: ObjectStore, initialize called
21/02/02 16:57:45 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is POSTGRES
21/02/02 16:57:45 INFO ObjectStore: Initialized ObjectStore
21/02/02 16:57:45 WARN HiveMetaStore: Location: file:/spark-warehouse/analytics_210202165735162780640769.db/base specified for non-external table:base
21/02/02 16:57:45 INFO FileUtils: Creating directory if it doesn't exist: file:/spark-warehouse/analytics_210202165735162780640769.db/base

After that point, it seems to hang indefinitely.

@github-actions
Contributor

This PR has been marked as Stale because it has been open for 180 days with no activity. If you would like the PR to remain open, please remove the stale label or comment on the PR, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Apr 12, 2022
@github-actions github-actions bot closed this Apr 19, 2022

Successfully merging this pull request may close these issues.

Load CSV files fails on date and numeric types