ETag mismatch on MinIO external dependency add. #3454

Closed
pared opened this issue Mar 6, 2020 · 29 comments
@pared
Contributor

pared commented Mar 6, 2020

Assuming we have a MinIO instance set up with two buckets (dvc-cache, data) on localhost:9000, trying to add data from the data bucket as an external dependency will result in an ETag mismatch error.

Example:

#!/bin/bash

rm -rf repo
mkdir repo

pushd repo
git init --quiet
dvc init -q

# MinIO default credentials
export AWS_ACCESS_KEY_ID="minioadmin"
export AWS_SECRET_ACCESS_KEY="minioadmin"

# use the dvc-cache bucket on MinIO as the dvc cache
dvc remote add s3cache s3://dvc-cache/cache
dvc config cache.s3 s3cache
dvc remote modify s3cache endpointurl http://localhost:9000
dvc remote modify s3cache use_ssl False

# register the data bucket as a remote and add a file from it as an external dependency
dvc remote add miniodata s3://data
dvc remote modify miniodata endpointurl http://localhost:9000
dvc remote modify miniodata use_ssl False

dvc add remote://miniodata/file

This results in:

ERROR: ETag mismatch detected when copying file to cache! (expected:
 '4e102ec8d6ab714aae04d9e3c7b4c190-1', actual: 'ca9e5ed43f3fbee6edec
bb5ac6fba77e-1')

Related: #2629 , #3441
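
For reference, the two buckets can be seeded before running the script above with a minimal boto3 sketch (an assumption for illustration only: it uses the same endpoint and the minioadmin credentials exported in the script, and the object content is arbitrary):

import boto3

# assumes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are exported as in the script
s3 = boto3.client("s3", endpoint_url="http://localhost:9000")
for bucket in ("dvc-cache", "data"):
    s3.create_bucket(Bucket=bucket)
s3.put_object(Bucket="data", Key="file", Body=b"some content")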

triage-new-issues bot added the triage (Needs to be triaged) label Mar 6, 2020
@harshavardhana

Seems to run fine for me locally for now

~ /tmp/test.sh 
~/repo ~
100% Add|█████████████████████████████████████████████████████████████████████████████████████████████|1/1 [00:00,  1.11file/s]

To track the changes with git, run:

	git add xfs_metadump.dvc

Do you have an example of the dataset with which we can reproduce this issue?

@harshavardhana

@pared please try with minio server --compat

@harshavardhana

See https://github.com/minio/minio#caveats to understand more about why --compat might be needed here.

@benjamintanweihao

benjamintanweihao commented Mar 9, 2020

Here's what I tried:

docker run -v $PWD:/data -p 9000:9000 minio/minio:RELEASE.2020-03-06T22-23-56Z server --compat /data

I still got a similar error:

dvc add remote://miniodata/shapes_128_128_20180914T1515
Adding...                                                                                                                                                                      
ERROR: ETag mismatch detected when copying file to cache! (expected: '7461e0bd8ff44f448d2659cc04f0f86a-1', actual: '098a1335754cabb3ec1ab403e4597be1-1')  

@harshavardhana

Note that you need to purge the existing objects; dvc caches data.

@benjamintanweihao

@harshavardhana @pared

OK, so here's what I did:

  1. I removed .minio.sys and upgraded to the latest MinIO image via docker pull.
  2. I removed the dvc-cache folder and created a new one.
  3. I git cloned a fresh repository.

Now I get a slightly different error, but again related to ETags:

(base) benjamintan@argus:~/workspace/dev/darkrai_2$ dvc add remote://miniodata/shapes_128_128_20180914T1515 -v
2020-03-10 10:53:10,292 DEBUG: PRAGMA user_version;                     
2020-03-10 10:53:10,293 DEBUG: fetched: [(3,)]
2020-03-10 10:53:10,293 DEBUG: CREATE TABLE IF NOT EXISTS state (inode INTEGER PRIMARY KEY, mtime TEXT NOT NULL, size TEXT NOT NULL, md5 TEXT NOT NULL, timestamp TEXT NOT NULL)
2020-03-10 10:53:10,293 DEBUG: CREATE TABLE IF NOT EXISTS state_info (count INTEGER)
2020-03-10 10:53:10,293 DEBUG: CREATE TABLE IF NOT EXISTS link_state (path TEXT PRIMARY KEY, inode INTEGER NOT NULL, mtime TEXT NOT NULL)
2020-03-10 10:53:10,293 DEBUG: INSERT OR IGNORE INTO state_info (count) SELECT 0 WHERE NOT EXISTS (SELECT * FROM state_info)
2020-03-10 10:53:10,293 DEBUG: PRAGMA user_version = 3;
2020-03-10 10:53:10,660 DEBUG: Uploading '../../../../../tmp/tmph3zuly_y' to 's3://dvc-cache/cache/.dSv6zZRrDrF8rrZG7EG7vQ.tmp'                                                
2020-03-10 10:53:10,801 DEBUG: cache 's3://dvc-cache/cache/33/d27895814e19a630b0dcfff3ca1a32.dir' expected '33d27895814e19a630b0dcfff3ca1a32.dir' actual 'None'                
2020-03-10 10:53:10,867 DEBUG: Removing s3://dvc-cache/cache/.dSv6zZRrDrF8rrZG7EG7vQ.tmp
2020-03-10 10:53:10,868 DEBUG: {'remote://miniodata/shapes_128_128_20180914T1515': 'modified'}
2020-03-10 10:53:10,921 DEBUG: Uploading '../../../../../tmp/tmpbqom4nbg' to 's3://dvc-cache/cache/.gFLeM9XfsEAdr9fAUq5CpF.tmp'
2020-03-10 10:53:10,972 DEBUG: cache 's3://dvc-cache/cache/33/d27895814e19a630b0dcfff3ca1a32.dir' expected '33d27895814e19a630b0dcfff3ca1a32.dir' actual '33d27895814e19a630b0dcfff3ca1a32'
2020-03-10 10:53:10,973 DEBUG: Computed stage 'shapes_128_128_20180914T1515.dvc' md5: 'e808412c5ebc930850a133d48bdde83f'
2020-03-10 10:53:11,022 DEBUG: Saving 's3://data/shapes_128_128_20180914T1515' to 's3://dvc-cache/cache/33/d27895814e19a630b0dcfff3ca1a32.dir'.
2020-03-10 10:53:11,035 DEBUG: cache 's3://dvc-cache/cache/00/000000000000000000000000000000-1' expected '00000000000000000000000000000000-1' actual 'None'
Adding...                                                                                                                                                                      
2020-03-10 10:53:11,158 DEBUG: SELECT count from state_info WHERE rowid=?                                                                                                      
2020-03-10 10:53:11,159 DEBUG: fetched: [(0,)]
2020-03-10 10:53:11,159 DEBUG: UPDATE state_info SET count = ? WHERE rowid = ?
2020-03-10 10:53:11,223 ERROR: ETag mismatch detected when copying file to cache! (expected: '00000000000000000000000000000000-1', actual: '098a1335754cabb3ec1ab403e4597be1-1')
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/command/add.py", line 20, in run
    fname=self.args.file,
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/repo/__init__.py", line 28, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/repo/scm_context.py", line 4, in run
    result = method(repo, *args, **kw)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/repo/add.py", line 86, in add
    stage.commit()
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/funcy/decorators.py", line 39, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/stage.py", line 161, in rwlocked
    return call()
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/funcy/decorators.py", line 60, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/stage.py", line 823, in commit
    out.commit()
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/output/base.py", line 241, in commit
    self.cache.save(self.path_info, self.info)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/base.py", line 520, in save
    self._save(path_info, checksum)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/base.py", line 526, in _save
    self._save_dir(path_info, checksum)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/base.py", line 479, in _save_dir
    self._save_file(entry_info, entry_checksum, save_link=False)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/base.py", line 427, in _save_file
    self.move(path_info, cache_info)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/base.py", line 659, in move
    self.copy(from_info, to_info)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/s3.py", line 188, in copy
    self._copy(self.s3, from_info, to_info, self.extra_args)
  File "/home/benjamintan/miniconda3/lib/python3.7/site-packages/dvc/remote/s3.py", line 184, in _copy
    raise ETagMismatchError(etag, cached_etag)
dvc.exceptions.ETagMismatchError: ETag mismatch detected when copying file to cache! (expected: '00000000000000000000000000000000-1', actual: '098a1335754cabb3ec1ab403e4597be1-1')

What's interesting is this particular line:

2020-03-10 10:53:11,035 DEBUG: cache 's3://dvc-cache/cache/00/000000000000000000000000000000-1' expected '00000000000000000000000000000000-1' actual 'None'

On the other hand, it seems that for other files the MD5/ETag was computed correctly (albeit with the --compat flag turned on).

skshetry added the p1-important (Important, aka current backlog of things to do) label and removed the triage (Needs to be triaged) label Mar 10, 2020
@harshavardhana

harshavardhana commented Mar 10, 2020

Yes, because now that you have purged .minio.sys you have lost all metadata, including the ETags for all your objects. Since dvc remembers ETags, it is not possible to start the server without --compat and then restart it with --compat.

The server should be started with --compat from the get-go, so by fresh I meant everything fresh, including the content on MinIO, not just the metadata.

@harshavardhana

harshavardhana commented Mar 10, 2020

Unless, of course, a refresh of sorts can be requested from dvc.

The original issue here is a bug in MinIO CopyObject, which does not preserve the previous ETag; that is not intentional.

This happens only without the --compat flag, so it needs to be fixed in MinIO. Once that is fixed, I expect that dvc will work without the --compat flag too.

Q: Does dvc require the ETag to be a checksum of the content, or does it compute this separately? What is the purpose of the ETag? Is it some opaque reference value for a blob?

@benjamintanweihao

benjamintanweihao commented Mar 10, 2020

@harshavardhana To be clear, I actually purged both the MinIO and DVC metadata at the same time, then restarted the MinIO server. But you seem to be saying that I need to ensure that the content is new? For example, copy it to another folder?

@harshavardhana

Yes, because you already have content; we don't compute new ETags for existing data. Purging .minio.sys simply loses all object metadata. It won't be regenerated magically.

@harshavardhana

Also, let's avoid a lengthy discussion here, as a courtesy towards the dvc maintainers.

You are not doing what I demonstrated anyway. Let's discuss on our Slack instead @benjamintanweihao

efiop added the p3-nice-to-have (It should be done this or next sprint) label and removed the p1-important (Important, aka current backlog of things to do) label Mar 10, 2020
@efiop
Contributor

efiop commented Mar 10, 2020

@harshavardhana Thanks a lot for the explanation! It looks like there is not much we can do on the dvc side, so I'm closing this issue for now.

efiop closed this as completed Mar 10, 2020
@shcheklein
Member

@efiop does it make sense to report it back to minio?

@skshetry
Member

@shcheklein, there's an issue which was closed because this is how it is supposed to work on MinIO. See: minio/minio#8012 (comment)

We could, however, suggest that users use --compat, or find a better approach than ETags.

@pared
Contributor Author

pared commented Mar 10, 2020

The solution provided by @harshavardhana seems to be working, though I encountered another problem.

It seems that running the MinIO server with the --compat option helps:
docker run -p 9000:9000 minio/minio --compat server /data

The next error I encounter is:
Invalid length for parameter Key, value: 0, valid range: 1-inf.
This problem does not happen when using S3 as a remote.

@harshavardhana

@pared that exception looks like a client-side issue; perhaps a new issue should be reported. Feel free to CC me if you think it's MinIO.

@harshavardhana

I think I found the behavior here @skshetry @efiop. I see that dvc is looking for a pattern in the ETag such as -1 and trying to perform a multipart copy. Are you trying to ensure the preservation of the ETag in some form? What is the ETag used for?

@efiop
Contributor

efiop commented Mar 10, 2020

@harshavardhana Yes, we try to ensure that the ETag is the same after we copy an object from one place to another. It is needed to identify that an object is unchanged no matter where it is located.
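
To illustrate the check, here is a minimal boto3 sketch against the MinIO endpoint from the script above (bucket and key names are placeholders, and it only covers small single-part objects; multipart copies go through upload_part_copy as discussed below):

import boto3

s3 = boto3.client("s3", endpoint_url="http://localhost:9000")

src = {"Bucket": "data", "Key": "file"}
dst_bucket, dst_key = "dvc-cache", "cache/copied-file"

# Server-side copy, then compare ETags: if the backend does not preserve
# (or deterministically recompute) the ETag, the two values differ and
# dvc raises ETagMismatchError.
etag_before = s3.head_object(**src)["ETag"].strip('"')
s3.copy_object(Bucket=dst_bucket, Key=dst_key, CopySource=src)
etag_after = s3.head_object(Bucket=dst_bucket, Key=dst_key)["ETag"].strip('"')

assert etag_before == etag_after, f"ETag mismatch: {etag_before} != {etag_after}"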

@harshavardhana

harshavardhana commented Mar 10, 2020

Variadic parts don't seem to be handled; even the number-of-parts calculation based on the ETag cannot know for certain what part size was used. This is tricky. Has there been any thought on fixing this by using something more appropriate for all use cases?

https://github.com/s3git/s3git might give some ideas

@efiop
Contributor

efiop commented Mar 11, 2020

Yes, the assumption is that the part size stays the same. An important note is that this is only used for some very advanced functionality, so maybe we haven't had enough runs in the wild, but so far it has worked fine.

Thanks for sharing the s3git link! Looks like they don't rely on ETags, but instead manually compute a hash for objects on S3, right?

@pared
Contributor Author

pared commented Mar 11, 2020

@harshavardhana thanks for sharing your insight!
I found an issue about this problem, and it seems to point to the AWS client:
aws/aws-cli#2929
I have not had time to dive deeper into that issue yet.

@harshavardhana

Yes, the assumption is that the part size stays the same. An important note is that this is only used for some very advanced functionality, so maybe we haven't had enough runs in the wild, but so far it has worked fine.

Thanks for sharing the s3git link! Looks like they don't rely on ETags, but instead manually compute a hash for objects on S3, right?

Yes, that is correct. I shall cc @fwessels for more clarity.

@efiop
Contributor

efiop commented Mar 11, 2020

@harshavardhana As far as I can tell, BLAKE2 is used, but even though it is ~2x faster than MD5, it still takes time to compute. That is something we avoid by leveraging ETags as a free way to ensure that objects are unchanged. We have thought about computing some hash ourselves instead, but so far there hasn't been enough need for that. We might come to it in the future, at which point we will probably take a look at BLAKE3 or something similar. Thanks for the question! 🙏
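
For a rough sense of the relative cost, a quick and unscientific sketch with Python's hashlib (numbers vary by machine and build; the 256 MiB buffer is arbitrary):

import hashlib
import time

data = b"\0" * (256 * 1024 * 1024)  # 256 MiB of dummy data

for name in ("md5", "blake2b", "sha256"):
    start = time.perf_counter()
    hashlib.new(name, data).hexdigest()
    print(f"{name:8s} {time.perf_counter() - start:.2f}s")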

@fwessels

For consistent hashing you should look into the verified streaming behaviour that BLAKE3 offers. It fixes the size of the individual parts for which hashes are computed at the lowest level, which then ultimately results in the final hash at the top.

The "only" caveat here is that, when working with multipart objects, you need to know where the boundaries are for the parts to hash (and some hashes will need to be computed spanning two consecutive parts).

So it is definitely doable, but not completely trivial to implement (and dropping parts upon finalization of the multipart upload would not be allowed, at least not without recomputing the hashes from the point onward where a part is left out).
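
To illustrate the idea only, here is a simplified sketch: it uses blake2b over fixed-size chunks with a flat combine, whereas real BLAKE3 builds a binary tree, but the relevant property is the same: with fixed chunk boundaries the final hash does not depend on how an upload was split into parts, as long as the part boundaries align with the chunk grid.

import hashlib

CHUNK_SIZE = 1024  # fixed chunk size; BLAKE3 internally uses 1 KiB chunks

def chunk_hashes(data):
    # hash every fixed-size chunk independently
    return [
        hashlib.blake2b(data[i:i + CHUNK_SIZE]).digest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def root_hash(data):
    # flat combine of the chunk hashes (real BLAKE3 uses a binary tree)
    return hashlib.blake2b(b"".join(chunk_hashes(data))).hexdigest()

def root_hash_from_parts(parts):
    # hash a multipart upload part by part on the same fixed chunk grid;
    # valid as long as every part except the last is a multiple of CHUNK_SIZE
    digests = []
    for part in parts:
        digests.extend(chunk_hashes(part))
    return hashlib.blake2b(b"".join(digests)).hexdigest()

data = bytes(range(256)) * 40  # 10 KiB of sample data
# the root hash is the same regardless of the part split
assert root_hash(data) == root_hash_from_parts([data[:4096], data[4096:]])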

@shcheklein
Member

@harshavardhana

Variadic parts don't seem to be handled

it should be handled as far as I remember:

            obj = cls.get_head_object(
                s3, from_info.bucket, from_info.path, PartNumber=i
            )
            part_size = obj["ContentLength"]

So, we try to preserve the same number of parts and the same length for each of them.

It is still a gray area, since as far as I know it's not officially documented how the ETag is calculated for multipart uploads, but I think it is reasonable for our users to rely on that optimization for now. As @efiop mentioned, we can introduce BLAKE2 or something similar if Amazon at some point decides to change the logic behind it.


Btw, I wonder whether s3.upload_part_copy moves bytes to the local machine first before sending them back to S3, or whether S3 has a way to copy remotely? If it pulls the data locally, we could potentially calculate the hash while we copy the data without affecting performance. In that case we would need S3 to support rename, at least.

@harshavardhana

it should be handled as far as I remember:

No, it is not. Parts can be uploaded in this manner:

  • Part.1 - 5MiB
  • Part.2 - 6MiB
  • Part.3 - 1byte

This will result in an ETag of md5hex(md5(5MiB part) + md5(6MiB part) + md5(1 byte part))-3.

Now, if you assume 3 parts and a content length of 11MiB, you have no idea what length was used for the 1st and 2nd parts; if you happen to choose 5MiB for both, you will end up with an incorrect ETag that will mismatch. I can reproduce this right now with dvc using AWS S3. Of course, I assume this is not handled because it is a rare corner case; I am just clarifying this so that you are aware.

It is still a gray area, since as far as I know it's not officially documented how the ETag is calculated for multipart uploads, but I think it is reasonable for our users to rely on that optimization for now. As @efiop mentioned, we can introduce BLAKE2 or something similar if Amazon at some point decides to change the logic behind it.

A multipart ETag is nothing but hexmd5(md5(part1) + md5(part2) + ...)-N; this is not documented in the AWS S3 docs but was confirmed while talking to AWS support.
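
For concreteness, that formula can be sketched in a few lines of Python (a hypothetical helper, not dvc or MinIO code; it assumes the exact part sizes used for the upload are known):

import hashlib

def multipart_etag(path, part_sizes):
    # S3-style multipart ETag: hex MD5 of the concatenated binary MD5
    # digests of each part, followed by "-<number of parts>"
    digests = []
    with open(path, "rb") as f:
        for size in part_sizes:
            digests.append(hashlib.md5(f.read(size)).digest())
    return "{}-{}".format(hashlib.md5(b"".join(digests)).hexdigest(), len(digests))

# for the example above: parts of 5 MiB, 6 MiB and 1 byte
# multipart_etag("bigfile", [5 * 1024**2, 6 * 1024**2, 1])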

Btw, I wonder if s3.upload_part_copy moves bytes first to the local machine before sending then back to S3 or S3 has a way to copy remotely? If it pulls the data locally, we can potentially calculate the hash while we copy data w/o affecting performance. In this case we would need S3 to support rename at least.

The server-side copy of parts is called CopyObjectPart(), which I see that you are using when the ETag has a - suffix at the end.

NOTE: This assumption will also fail for SSE-C encrypted objects, because AWS S3 doesn't return a proper ETag for SSE-C encrypted objects; an SSE-C object will change its ETag automatically upon an overwrite.

https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html

The entity tag is a hash of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted as described below:

- Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-S3 or plaintext, have ETags that are an MD5 digest of their object data.

- Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-C or SSE-KMS, have ETags that are not an MD5 digest of their object data.

- If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.

@shcheklein
Member

shcheklein commented Mar 12, 2020

@harshavardhana

if you happen to choose 5MiB for both, you will end up with an incorrect ETag that will mismatch

I think that's not how DVC does this. Please check this code: https://github.com/iterative/dvc/blob/master/dvc/remote/s3.py#L103-L107

Unless I'm missing something, it takes the exact sizes of all parts.

I can reproduce this right now with dvc using AWS S3

Could you share the script, please? It might be a bug.

A multipart ETag is nothing but hexmd5(md5(part1) + md5(part2) + ...)-N; this is not documented in the AWS S3 docs but was confirmed while talking to AWS support.

Yep, I know. That doesn't make it official, thus the "grey" area. Though, like I mentioned, it's very unlikely they will change it, so we made a decision to utilize this behavior.

NOTE: This assumption will also fail for SSE-C encrypted objects, because AWS S3 doesn't return a proper ETag for SSE-C encrypted objects; an SSE-C object will change its ETag automatically upon an overwrite.

Yes. See #2701 (comment) and iterative/dvc.org#774

@harshavardhana

I think that's not how DVC does this. Please check this code: https://github.com/iterative/dvc/blob/master/dvc/remote/s3.py#L103-L107

Unless I'm missing something, it takes the exact sizes of all parts.

Ah, you are using the PartNumber API, @shcheklein, which was undocumented for a few years and then silently appeared.

        for i in range(1, n_parts + 1):
            obj = cls.get_head_object(
                s3, from_info.bucket, from_info.path, PartNumber=i
            )
            part_size = obj["ContentLength"]
            lastbyte = byte_position + part_size - 1
            if lastbyte > size:
                lastbyte = size - 1

We have had debates about whether we should ever implement this in MinIO, but since you guys use it, it looks like it makes sense to do so.

@shcheklein
Member

@harshavardhana thanks! 🙏 Will there be a ticket for us to follow?
