default.path.data included as a data path when path.data configured as yaml array #23981
Comments
I know what is happening here and I see a fix; I will open a PR soon. This is an incredibly unfortunate consequence of the fact that when a setting is specified as an array, we break it down into new keys (path.data.0, path.data.1, and so on), so the plain path.data key no longer appears to be set and default.path.data gets picked up as well.
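To make that concrete, here is a minimal sketch of the flattening as I understand it (the paths are examples, and the numbered key names are an assumption about how array settings are expanded internally):

# elasticsearch.yml
path.data:
  - /some/datapath1
  - /some/datapath2
# internally this is expanded into the keys path.data.0 and path.data.1,
# so a lookup for the plain key path.data finds nothing and the
# default.path.data fallback is added on top of the configured paths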
This is a horrifically bad bug. It appears to have been introduced in 5.3.0, and it means that shards can end up in the default.path.data directory even though path.data was explicitly configured elsewhere.
It gets worse. Let's say that we fix this bug in 5.3.1 and a user migrates from 5.3.0, not aware of the fiasco that is ensuing here. They start up Elasticsearch on 5.3.1, and now their shards are missing! If they're not careful, they are in data loss territory.
😭
If we just fix the config issue and someone upgrades one node at a time, they shouldn't lose any data so long as they have replicas. The entire shard will have to recover from the primary, but that is a fairly normal thing. They'd have to manually clean up the shards in the accidental data directory, which isn't horrible, but it isn't great at all. We could do more to try to copy the data, or build a tool, or something. Minimally, anyone on 5.3 will want to be able to know whether they are affected by this issue before they upgrade, so they know what they have to do as part of the upgrade.
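A rough way for an operator to check whether a node is affected before upgrading, sketched as a shell snippet (the paths are the customary RPM/DEB defaults and the on-disk layout is an assumption, so adjust for your install):

# print the configured data path(s)
grep -E '^path\.data' /etc/elasticsearch/elasticsearch.yml
# if path.data points elsewhere but the packaged default data directory
# still contains index data, shards have likely been written there
ls /var/lib/elasticsearch/nodes/*/indices 2>/dev/null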
That's the problem: we do not have complete control over this.
Yes, if they are aware of the problem and do not panic, and if they do not do something like practice immutable infrastructure.
Yes, but it's quite delicate.
Indeed, and therein lies the rub.
Yeah, I just wanted to start talking about this. I certainly don't claim to have answers. I guess I'm fairly sure that the person doing the upgrade is going to want to know if they are affected by this. Lots of people won't be, and 5.3.0 will just be a normal version for them. For the affected people, even if we handle the whole thing automatically, they are going to need to know.
Indeed, we need more people thinking about this. I discussed some of the complexities with @dakrone earlier in the evening via another channel, and I've been thinking about it all night. Sorry that I haven't taken the time to write my thoughts down yet. Another complexity that I thought of is that the fix for this has to live for the rest of the 5.x series lifecycle, because someone can upgrade from 5.3.0 to any future release in the 5.x series. And yet another complexity that I thought of is a situation where someone has their default.path.data included as one of the values in path.data (assuming multiple path.data).
I completely agree.
Do we have set limitations yet on rolling upgrades to 6.0? Would 5.3.0 -> 6.0 be allowed? If so, the fix has to carry over there as well.
No, this is absolutely not allowed. You can only do a rolling-restart upgrade from the latest 5.x that is released when 6.0.0 goes GA.
We'll make sure there is a 5.4 so that doesn't have to happen.
So any detection we have will produce a false positive for them, right?
In fact, full cluster shutdown major version upgrades are even MORE dangerous in this instance, because all of the shards disappear at the same time.
I think it has to be smart enough to not produce a false positive for them.
Just because a 5.4 exists doesn't mean people will be on it when they upgrade.
Indeed, that means that we need a fix in 6.x as well; someone could full cluster restart upgrade from 5.3.0.
Right, sorry. They won't be able to do a rolling upgrade, just a full cluster restart upgrade.
I think we should look into removing the default option here. It's too much magic IMO, and we can give people a good error message in that case. We can even advise how to copy over stuff to the actual data directories, OR do it ourselves. We had code for this in 2.x, from when we moved away from rolling our own software RAID, that we can reuse.
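As a very rough illustration of the "copy over stuff" idea (not a vetted recovery procedure; the paths, service name, and on-disk layout are assumptions, and the node must be stopped first):

sudo systemctl stop elasticsearch
# move any index data that landed under the packaged default data dir
# into the configured data path
sudo rsync -a /var/lib/elasticsearch/nodes/0/indices/ /some/datapath1/nodes/0/indices/
# after verifying the copy, remove the stray directory, e.g.
# rm -rf /var/lib/elasticsearch/nodes
sudo systemctl start elasticsearch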
Today this is used to support the packaging. I think that we can remove this if we instead ship the elasticsearch.yml in the distribution packaging with explicit values for path.data, path.logs, and path.conf. Sadly, I think that as a result of the situation here, we cannot do this until 7.0.0.
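For reference, a sketch of what an explicitly configured packaged elasticsearch.yml might contain (the concrete locations are the customary RPM/DEB paths and are an assumption; path.conf is left out here since it determines where this file is found in the first place):

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch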
That is a critical issue. I think we have to find a way to either fix it reliably or drop stuff, and I would not hesitate to do that even in a minor release if the alternative means data loss or anything along those lines.
Adds a packaging test that would have detected #23981. The test configures path.data and validates that nothing ends up in default.path.data.
Elasticsearch version: 5.3.0
OS version: Tried on CentOS 7 and OS X, though I suspect it doesn't matter.
Description of the problem including expected versus actual behavior:
When using multiple data paths, or even specifying just a single data path as an array in yaml, Elasticsearch includes whatever is passed to it as default.path.data, such as when installed via RPM or DEB packages from the systemd unit or the init.d script, since those set this when they execute the Elasticsearch binary:
-Edefault.path.data=${DATA_DIR}
Steps to reproduce:
Set this in the elasticsearch.yml.
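For example, a single-entry array along these lines (an illustrative snippet; the path mirrors the example used later in this report):

path.data:
  - /some/datapath1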
run this:
bin/elasticsearch -Edefault.path.data="/some/datapath2"
And Elasticsearch will start configured with both data paths.
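One way to confirm which data paths the node actually picked up (assuming it is reachable on the default HTTP port) is to ask the nodes info API for the node settings and look at path.data and default.path.data in the response:

curl -s 'localhost:9200/_nodes/_local/settings?pretty'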
This also happens when you use this array format:
path.data: ["/some/datapath1"]