[Bug]: aws_dms_replication_task failed with error InvalidParameterValueException: TimestampColumnName cannot be an empty string. #27283

Closed
thaiphv opened this issue Oct 18, 2022 · 18 comments · Fixed by #28704
Labels: breaking-change, bug, service/dms

Comments

@thaiphv

thaiphv commented Oct 18, 2022

Terraform Core Version

1.3.2

AWS Provider Version

4.35.0

Affected Resource(s)

  • aws_dms_replication_task

Expected Behavior

The Terraform provider should create a task successfully.

Actual Behavior

The Terraform provider failed to create the resource and reported the error: InvalidParameterValueException: TimestampColumnName cannot be an empty string.

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

resource "aws_dms_endpoint" "source" {
  endpoint_id = "dms-source"

  database_name = var.source_db_name
  endpoint_type = "source"
  engine_name   = "sqlserver"
  username      = var.source_db_username
  password      = var.source_db_password
  server_name   = var.source_db_server_name
  port          = var.source_db_server_port
}

resource "aws_dms_endpoint" "fullload" {
  endpoint_id = "fullload-task"

  endpoint_type = "target"
  engine_name   = "s3"

  s3_settings {
    add_column_name          = true
    bucket_name              = var.bucket
    bucket_folder            = var.s3_folder
    compression_type         = "NONE"
    csv_delimiter            = ","
    csv_row_delimiter        = "\\n"
    date_partition_enabled   = false
    include_op_for_full_load = true
    rfc_4180                 = false
    service_access_role_arn  = var.dms_s3_iam_role_arn
  }
}

resource "aws_dms_replication_task" "task" {
  replication_task_id = "fullload-and-cdc"

  migration_type            = "full-load-and-cdc"
  replication_instance_arn  = var.dms_replication_instance_arn
  replication_task_settings = var.fullload_cdc_task_settings

  source_endpoint_arn = aws_dms_endpoint.source.endpoint_arn
  target_endpoint_arn = aws_dms_endpoint.fullload.endpoint_arn

  table_mappings = var.mappings
}

Steps to Reproduce

Run terraform apply

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No response

@thaiphv thaiphv added bug Addresses a defect in current functionality. needs-triage Waiting for first response or review from a maintainer. labels Oct 18, 2022
@github-actions github-actions bot added the service/dms Issues and PRs that pertain to the dms service. label Oct 18, 2022

@thaiphv
Author

thaiphv commented Oct 18, 2022

Just realised that the Terraform provider applied a whole lot of default settings to the endpoint:

{
    "ServiceAccessRoleArn": "<arn>",
    "ExternalTableDefinition": "",
    "CsvRowDelimiter": "\\n",
    "CsvDelimiter": ",",
    "BucketFolder": "<folder>",
    "BucketName": "<bucket>",
    "CompressionType": "NONE",
    "EncryptionMode": "SSE_S3",
    "ServerSideEncryptionKmsKeyId": "",
    "DataFormat": "csv",
    "EncodingType": "rle-dictionary",
    "DictPageSizeLimit": 1048576,
    "RowGroupLength": 10000,
    "DataPageSize": 1048576,
    "ParquetVersion": "parquet-1-0",
    "EnableStatistics": true,
    "IncludeOpForFullLoad": true,
    "CdcInsertsOnly": false,
    "TimestampColumnName": "",
    "ParquetTimestampInMillisecond": false,
    "CdcInsertsAndUpdates": false,
    "DatePartitionEnabled": false,
    "DatePartitionSequence": "yyyymmdd",
    "DatePartitionDelimiter": "slash",
    "UseCsvNoSupValue": false,
    "CsvNoSupValue": "",
    "PreserveTransactions": false,
    "CdcPath": "",
    "UseTaskStartTimeForFullLoadTimestamp": false,
    "CannedAclForObjects": "none",
    "AddColumnName": false,
    "CdcMaxBatchInterval": 60,
    "CdcMinFileSize": 32,
    "CsvNullValue": "NULL",
    "MaxFileSize": 1048576,
    "Rfc4180": false
}

TimestampColumnName was also set to "", even though I didn't set it in the Terraform configuration.

@thaiphv
Author

thaiphv commented Oct 18, 2022

Whereas the setting of an endpoint created by CloudFormation was a lot less verbose:

{
    "ServiceAccessRoleArn": "<arn>",
    "CsvRowDelimiter": "\\n",
    "CsvDelimiter": ",",
    "BucketFolder": "<folder>",
    "BucketName": "<bucket>",
    "CompressionType": "NONE",
    "EnableStatistics": true,
    "DatePartitionEnabled": true
}

@thaiphv
Author

thaiphv commented Oct 18, 2022

I had a chat with the AWS support team and was told that this looks like an issue with the way the aws_dms_endpoint resource is created. The Terraform provider shouldn't submit settings with default values to the API, particularly the "TimestampColumnName" setting. Even though we didn't set it, the provider set "TimestampColumnName" to an empty string when it created the resource. When we then used the endpoint with an aws_dms_replication_task resource, the API complained that the endpoint's "TimestampColumnName" setting must be non-empty.

@Mistawes

Mistawes commented Oct 18, 2022

I've been using DMS fairly regularly, but hit this same issue today.

I tried defining "timestamp_column_name" on the endpoint and setting TimestampColumnName=ts; in the endpoint's "extra_connection_attributes", but no luck.

I then rolled back to the 4.34.0 provider and re-initialized, and I'm still getting the same error. That would suggest a change on the AWS side, as DMS had been working fine for months.

@Mistawes

Hmm, I noticed that if I change from an S3 target endpoint to an Oracle one (same as the source), the task is created fine.

So it seems to be related to S3 target endpoints?

@thaiphv
Author

thaiphv commented Oct 19, 2022

Hmm, I noticed that if I change from an S3 target endpoint to an Oracle one (same as the source), the task is created fine.

So it seems to be related to S3 target endpoints?

I think so

@justinretzolk justinretzolk added good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. and removed needs-triage Waiting for first response or review from a maintainer. labels Nov 1, 2022
@ali-raza-rizvi

I had a chat with the AWS support team and was told that this looks like an issue with the way the aws_dms_endpoint resource is created. The Terraform provider shouldn't submit settings with default values to the API, particularly the "TimestampColumnName" setting. Even though we didn't set it, the provider set "TimestampColumnName" to an empty string when it created the resource. When we then used the endpoint with an aws_dms_replication_task resource, the API complained that the endpoint's "TimestampColumnName" setting must be non-empty.

Have you found any workaround for it?

@thaiphv
Author

thaiphv commented Nov 1, 2022

I had a chat with the AWS support team and was told that this looks like an issue with the way the aws_dms_endpoint resource is created. The Terraform provider shouldn't submit settings with default values to the API, particularly the "TimestampColumnName" setting. Even though we didn't set it, the provider set "TimestampColumnName" to an empty string when it created the resource. When we then used the endpoint with an aws_dms_replication_task resource, the API complained that the endpoint's "TimestampColumnName" setting must be non-empty.

Have you found any workaround for it?

Unfortunately, no. I abandoned my effort to Terraform the DMS pipelines.

@breathingdust breathingdust added the regression Pertains to a degraded workflow resulting from an upstream patch or internal enhancement. label Nov 29, 2022
@YakDriver YakDriver self-assigned this Jan 4, 2023
@YakDriver
Member

YakDriver commented Jan 4, 2023

The trouble is with the way aws_dms_endpoint includes default values. Removing default values is generally considered a breaking change. I don't see this as a regression but rather a challenge with the way we've always done it that is not working well with the way AWS would like it done.

Going forward, we can potentially remove these default values, technically a breaking change, relying on AWS for the defaults, and maybe get away with it. There is risk if someone's configuration has relied on the default values the AWS provider gives versus the values AWS would give. That risk needs to be weighed against the problems this is currently causing with the aws_dms_replication_task. (The aws_dms_s3_endpoint includes fewer default values but still seems to be having some problems with aws_dms_replication_task.)

See also #28130

@YakDriver YakDriver added breaking-change Introduces a breaking change in current functionality; usually deferred to the next major release. and removed regression Pertains to a degraded workflow resulting from an upstream patch or internal enhancement. good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. labels Jan 4, 2023
@YakDriver
Member

After looking at this more, this is resolved by using the aws_dms_s3_endpoint resource with the aws_dms_replication_task resource. We apologize for the inconvenience of switching resources but hopefully it is better than waiting for v5 due to the fix requiring breaking changes. Please let us know if you have issues using aws_dms_s3_endpoint to accomplish your task.
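
A minimal sketch of that switch, assuming the variables and the aws_dms_endpoint.source resource from the original report are already declared. The resource label "fullload" and the argument values are carried over from that report for illustration only, and this fragment is not a confirmed fix for every configuration:

```hcl
# Hypothetical replacement for the aws_dms_endpoint "fullload" target.
# aws_dms_s3_endpoint takes the S3 settings as top-level arguments
# instead of a nested s3_settings block.
resource "aws_dms_s3_endpoint" "fullload" {
  endpoint_id   = "fullload-task"
  endpoint_type = "target"

  bucket_name             = var.bucket
  bucket_folder           = var.s3_folder
  service_access_role_arn = var.dms_s3_iam_role_arn

  add_column_name          = true
  compression_type         = "NONE"
  csv_delimiter            = ","
  include_op_for_full_load = true
}

resource "aws_dms_replication_task" "task" {
  replication_task_id = "fullload-and-cdc"

  migration_type            = "full-load-and-cdc"
  replication_instance_arn  = var.dms_replication_instance_arn
  replication_task_settings = var.fullload_cdc_task_settings

  source_endpoint_arn = aws_dms_endpoint.source.endpoint_arn
  # Point the task at the dedicated S3 endpoint resource.
  target_endpoint_arn = aws_dms_s3_endpoint.fullload.endpoint_arn

  table_mappings = var.mappings
}
```

Because this is a different resource type, Terraform would plan to destroy the old aws_dms_endpoint target and create a new endpoint rather than update it in place.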

@github-actions github-actions bot added this to the v4.50.0 milestone Jan 6, 2023
@github-actions

This functionality has been released in v4.50.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
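
One way to pick up that release, sketched under the assumption that the configuration declares the provider in a required_providers block (the version constraint shown is illustrative):

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Allow v4.50.0 or any later 4.x release.
      version = "~> 4.50"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a newer provider version that satisfies it.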

@vvatlin

vvatlin commented Jan 15, 2023

Still have the same issue

aws_dms_replication_task.this["cdc_ex"]: Creating...
╷
│ Error: error creating DMS Replication Task (adv2-s3-prod): InvalidParameterValueException: TimestampColumnName cannot be an empty string.
│ 	status code: 400, request id: 9cb1b36c-6e39-4a6b-898e-d065a92ca7ce
│
│   with aws_dms_replication_task.this["cdc_ex"],
│   on main.tf line 291, in resource "aws_dms_replication_task" "this":
│  291: resource "aws_dms_replication_task" "this" {
│

@vvatlin

vvatlin commented Jan 15, 2023

@YakDriver aws provider 4.50.0

@DaanVandenreyt

DaanVandenreyt commented Jan 26, 2023

I also ran into this issue today, but was able to fix it.
Using aws_dms_endpoint as a resource, you can add your own string for the timestamp column inside the s3_settings with the timestamp_column_name argument.

Example:

resource "aws_dms_endpoint" "your_s3_endpoint" {
  endpoint_id   = "your_s3_endpoint_id"
  endpoint_type = "target"
  engine_name   = "s3"

  s3_settings {
    bucket_name             = "your_bucket_name"
    bucket_folder           = "your_bucket_folder"
    service_access_role_arn = "iam_role_arn"
    # Non-empty column name, so the API does not receive an empty TimestampColumnName.
    timestamp_column_name   = "your_timestamp_column"
  }
}

alexlopes added a commit to getninjas/terraform-aws-dms that referenced this issue Jan 30, 2023
We are adding a module for resource aws_dms_s3_endpoint as a alternative
to solve the error related to the issue below

hashicorp/terraform-provider-aws#27283

"aws_dms_replication_task failed with error InvalidParameterValueException:
TimestampColumnName cannot be an empty string."
alexlopes added a commit to getninjas/terraform-aws-dms that referenced this issue Jan 31, 2023
We are adding a module for resource aws_dms_s3_endpoint as a alternative
to solve the error related to the issue below

hashicorp/terraform-provider-aws#27283

"aws_dms_replication_task failed with error InvalidParameterValueException:
TimestampColumnName cannot be an empty string."
@alexlopes

Using aws_dms_s3_endpoint solved the error here :) I created a new module specific for S3.

@DaanVandenreyt

I saw and tried that resource as well, but had difficulties with its outputs. For some reason, when using aws_dms_s3_endpoint the endpoint_arn attribute wouldn't work for me. So that is why I used the regular one.

@github-actions

github-actions bot commented Mar 3, 2023

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2023