-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Private Fields #9909
Comments
I feel the argument you make in the objections ("In addition, Zig is increasingly making legal but potentially buggy patterns into hard errors, somtimes controversially (e.g. unused variables).") is a very good one. This is something that I feel Zig has evolved in since #2059. Maybe worth moving that argument up to the introduction, to pay it more attention? A possible objection is that this could also be achieved with a strict naming convention, e.g. starting variables with an underscore will mark them as if they're private, and perhaps be enforced with future linter rules or the like. Just a thought. |
To support private fields, from
|
As for documentation value, in status quo I would suggest: pub const ByteList = struct {
// public decls and members here
data: []u8,
private: ByteListPrivate,
};
/// non-public, private decls and members here
const ByteListPrivate = struct {
allocator: std.mem.Allocator,
capacity: usize,
}; Accessing fields in I'll probably have to reread the The missing piece then is "making the compiler complain" when overwriting the |
This might be really useful, it doesn'make sense to me why access is a concern for decls, but not fields. It should be either both or neither. Just an opinion: As for setter/getter boilerplate, half of a problem can possibly be solved with something like 'readonly' keyword, which would effectively act as '*const' outside of a file scope. |
@rohlem Also, and somewhat ironically, the OpaqueHandle trick only works because it leverages the existing ability to declare a private struct. Otherwise there would be very little point in the whole concept of OpaqueHandle, and you might just as well go with a naming convention like |
I have been using naming conventions for my private fields, like |
Personally, I quite like that data in zig is so transparent; it makes no attempt to model anything other than what the computer is really doing. That said, there is obviously a use case for data hiding, but I don't think it should be as easy as just marking a field as private; and as well, I think this shouldn't be as powerful as the privatization of declarations, and moreso act as stronger documentation than comments. Essentially, I think we should continue to encourage transparent data, by making private fields harder to declare in some way. In that regard, I actually think something similar to what was modelled in #9859 would actually be better for this purpose - using a separate "opaque bag of bits" as a section for private data. This would not only slightly discourage private data, but also document even better the separation of private and public fields. I think there is perhaps also an argument to be made that, if private fields, whether in the proposed style or in the #9859 style, maybe some form of controlled access into private fields should be enabled anyway, like a built-in (e.g. Just some food for thought. I'll stop typing now. |
So do I, and in private projects I don't usually bother with access restrictions, even in other languages. But is this sufficient reason to make data hiding inconvenient or impossible in places where it is needed (e.g. public-facing libraries or high-assurance projects)?
I'll have to disagree here. To me, As for adding a |
@zzyxyzz As it pertains to private declarations, I think the most convincing rationale I've heard for it is that it forces change at an API level, as opposed to forcing it at the library-usage level; that is to say, if a library has marked a certain function as private, but the community at large has decided that it would be best marked public, this will either force the library implementers to push the change, or force a fork of the library that implements this change (with the former being the preferred outcome). This as opposed to forcing people wrapping said library to use functions that don't have the same guarantees as those explicitly marked public. I can't quite recall where I read that here, or whether or not that's a good enough reason. But for the purposes of arguing against applying this logic it to private fields, at the very least, I would point to the fact that although a field may be marked private, it cannot be guaranteed to be inaccessible, where its memory is accessible in the public-facing API (as explained), and thus it makes sense to treat field access modifiers more like built-in documentation tools, as opposed to strict API boundary controls. (Please tell me if any of this makes sense, my synthesizing skills are not pristine). |
I've heard this argument too, but I find it unrealistically idealistic. Even assuming that your proposed change will be accepted into the public API (which will not happen most of the time, either because your use-case is too specific or the authors plain disagree with your way of doing things), the usual timeline for the change is measured in months or years. And outright forks rarely happen at all, unless the project is abandoned by the previous author. So, while this position does represent best practice for upstreaming changes in open-source projects, it has little to no relevance for everyday software development. |
I am glad that we generally agree, then. I'm just bringing the argument to the forefront. |
If the |
@sina-devel I would argue that it makes sense for functions to private by default, as it incentivizes explicitly building a robust public-facing API, as opposed to simply allowing everything to spill into global scope. Whereas, here, I think it is more useful to overall make transparent data the default, and incentivize having to explicitly decide if data should really be marked as an "internal" component (which would likely lead to more boilerplate related to reading and writing to the data, which should certainly not be the default, in my opinion, coming from a C++ background). |
adding
right now:
by default private and exported by
adding keyword to readonly at outside and mutable at inside:
|
Get/set function are not exclusively a bad thing, sometimes they are needed to accept data from a user in a different way than it is stored internally, and sometimes you really need to keep different fields in sync. For example, library can provide a chunk of data, that user can store in whatever way they need, where it's still makes sense to keep some non-const, non-user-defined cache: const Texture = struct {
renderer_id: u32, // not appropriate for user to change
width: usize,
height: usize,
pixel_format: enum { R8_G8_B8_A8, Grayscale },
filter: enum { Nearest, Bilinear },
mipmap_levels: usize, // should be in sync with renderer
pub fn load(path: []const u8) @This() { ... }
pub fn unload() void { ... }
pub fn genMipmaps(self: *@This(), mipmaps: usize) void { ... }
}; In this example, you don't ever want to modify some fields without syncing state with the renderer, but may need to pass container around freely, and allow operations on an instance of a struct in member functions. Argument that fields shouldn't even be allowed to be hidden steers all libraries to be enclosed state machines, limited to hiding data by using top-level declarations. // value_container.zig
var value: usize = 0; const value_container = @import("value_container.zig");
test "acces from a different file" {
value_container.value = 3;
} Output of this test:
I guess what i am trying to say, is that acces control concerns are equally valid for functionality and data, and should, in my opinion, be consistent. Either it's "everything is transparent, all the time", or access control is a useful thing for both. |
Maybe a way to alleviate some of boilerplate, is to allow reading from private fields? const Container = struct {
value: usize, // field without any specifiers is private by default
};
var c: Container = .{ .value = 0 };
var read = c.value; // reading from private fields is allowed
c.value = 42; // writing to a private field not allowed
const Container2 = struct {
pub value: usize, // and here is a way to make it writable
}; |
@di-ant, |
How would this interact with comptime reflection? |
@andrewrk, Since this is something that Zig does differently from other languages, it may come up again in the future. So it might be good to have a findable "official position" on this issue. |
Here is my reasoning: There are many properties that can be "leaked" into the public interface of an API. For example: size/alignment of a struct, performance characteristics of a given function implementation, whether a function can be executed at comptime, existence/non-existence of declarations, and more. It is up to the documentation of a given package to specify what exactly the major version is protecting - that is - what exactly counts as "breaking" and what does not. Whether a given field is part of the public API or not is one more item in this list. Of course most of these have general recommendations for defaults:
The idea of private fields and getter/setter methods was popularized by Java, but it is an anti-pattern. Fields are there; they exist. They are the data that underpins any abstraction. My recommendation is to name fields carefully and leave them as part of the public API, carefully documenting what they do. A good example is a field protected by a mutex: const Foo = struct {
/// This field must be accessed only when `mutex` is held.
unprotected_counter: usize,
mutex: Mutex,
}; In this case the field name pub fn getCounter(foo: *Foo) usize {
foo.mutex.lock();
defer foo.mutex.unlock();
return foo.unprotected_counter;
} Then if the API user held the mutex for some other reason, In my subjective experience, public fields generally lead to better abstractions by eliminating the temptation to attempt full encapsulation, when the more effective strategy is to provide composable abstraction layers. The bottom line here is that private fields is an addition to the language, making it more complicated and requiring the language to answer a bunch of questions that otherwise would not exist. Not only does the feature does not provide enough value to merit the language complexity, I contend that it actively leads to worse source code. |
Thank you for the detailed explanation! I can't say I agree with it, but it's certainly good to know where things stand. |
@andrewrk I understand your reasoning and agree with it. On the other hand, it's not clear is it safe (is this intended / allowed) to access struct fields directly, as they can be unprotected. A would propose to have private/public fields, so they can't be accessed accidentally. const obj = struct {
const Self = @This();
// private by default or have # syntax as modern JavaScript proposal
#length: usize,
pub fn length(self: *Self) {
return self.#length;
}
}
// Know what I'm doing!
const a = obj {};
const len = a.!length;
// or like this
const len = a.#length; |
@likern |
As a reminder this issue is closed |
I know this issue is closed, but just in case it is ever reconsidered, I'd like to post this reply to Andrew's comments. This has been eating at me for a while, so here goes:
Yes. But lets not throw the baby out with the bathwater. Just because there will always be things (some of them quite esoteric) that cannot be formally expressed in the interface, that should not necessarily prevent us from picking the low-hanging fruit. Field access control is a simple and well-established technique, not unlike the distinction between
I would not say that getter-setters are per se an anti-pattern. Their excessive use in some languages certainly is, but when there is a legitimate need for an encapsulated interface, they're just the thing. And it's not like Java is the only language with private fields. Almost all popular languages have them. I did not have the impression that Rust, for example, was suffering from a getter-setter epidemic.
This will work well sometimes, and sometimes it won't. I'm pretty sure that many library authors would prefer to hide some parts of the implementation, rather than carefully arranging things such that there is nothing to hide.
I note that the building of additional abstractions is just as likely to be prevented by low-level methods being private. Does this mean we should remove visibility controls etirely and rely on careful naming and documentation for methods and other decls as well? Besides, anticipated disagreements between library authors and users about what should or shouldn't be hidden can also be resolved with an explicit override (e.g.
So the fields should be public in this particular case. It does not follow that struct fields in general should always be public. Plenty of valid examples can be presented on both sides of the argument.
Arguably, adding private fields would simplify the language, at least from the user perspective, since it makes the language more consistent and more in line with patterns familiar from other languages. In my view Andrew makes a good case for why open interfaces should be preferred in general. He does not make a convincing case for why they should be forced to always be open. Projects differ. Requirements differ. Sometimes what you need is an open library, at other times it's a piece of high-assurance code that is audited forwards and backwards. Neither is right or wrong, they're just different, and I see little reason to pick one over the other when we can easily support both. |
I am a beginner at Zig, but I come from a background in many other languages. Perhaps my experience is wholly irrelevant to Zig, but I wanted to share my perspective as it seems a bit different from @andrewrk. I’ve noticed private fields have 3 major benefits for library authors, consumers, and the ecosystem as a whole:
I do not believe we will have Go’s level of stability in the Zig ecosystem without giving library authors a standard way to say, “this is what I promise will remain stable.” Other things are needed as well, of course, but this seems to be an important piece. Zig already has default private behavior with requiring authors to make things pub, so pub fields seem like such a natural inclusion from a beginner Zig-user’s POV. |
I don't think zig is a language that will necessarily do something "just because other languages do it". As someone who used to use languages which have private fields, and who now uses one - not zig - that does not.... It is absolutely amazing. Specifically, it makes debugging very lucid. I don't have to drop to a debugger for 99% of the things I need to do. Fields should absolutely NOT be hidden from access (because otherwise if you want to introspect them, you would need privileged code, i.e. std.debug.print "wouldn't just work"). However, I believe there is a use case for fields to not be assignable values outside of their own scope. It is a common thing in the language that I use that curious programmers dive into what is a "private api" and on a version change things get weird. |
I think it's worth saying, to rebutt the reasoning about docs as a suitable solution for private fields, that docs are always the WORST way of communicating code contracts to other developers, the best being language features which enforce certain behavior (types, mutability modifiers, access modifiers, etc.), the second best being widely recognized conventions. Java has the "private" field language feature, Python has an underscore prefix code convention. Both of these are extremely valuable for communicating the intended use of a struct or library. It's valuable not just to the developer who may or may not (probably not) be fully reading the docs of the library they're using, but also to any code reviewers to show that a library is being used in an unintended way. It's not hard to imagine a new developer modifying the "capacity" field of a struct because intellisense suggested it first over "setCapacity" (which does additional, necessary work), and it getting missed during code reviews since there's no indication at all that this is a misuse of the class unless you comb over the library's documentation. Not introducing ANY concept of "You probably don't want to call/write this function/field" is a huge mistake. If Zig wants to strive for being a maintainable language used by a team of more than a half dozen developers, then it MUST introduce either a language feature or code convention to communicate what is meant to be public API and what is not. It doesn't matter if it's Java-style private fields, or a Python-style naming convention, or something inbetween, any of these are adequate for communicating an unusual use of a library during development and code reviews. Docs, and expecting only developers of a certain rigor will use your language, are simply not an acceptable substitute for this kind of important & easily communicable information. |
I still think a private field is a good idea, at least building some conventions like prefixing the field with an underscore helps a lot. I understand people want to keep Zig concise and small, but it is very hard nowadays to tell what constitute the public API of a type by skimming through the code. |
While it totally makes sense not to over-complicate the language, and just to play devil's advocate here. :)... If we take the assumption that self-documenting code is a good thing, then I think it's safe to assume that if the intent is to be C evolved then it's ok to introduce a couple new concepts. Not need to go all C++/Rust on it and ruin the vibe though when we be jamming. So, if keeping is simple is the way, then perhaps we can make the secondary option "official" where we rely on naming conventions. Like with an underscore decorating a "private" member, as we did with JavaScript to fake it before it got that abstraction. It would be akin to the "official" style guides and solve the self-documenting concept while also not adding new abstractions if someone really does wanna shoot themselves in the foot by messing with it. |
Hi devs, Without official support for private fields and methods, we can adopt a naming convention to indicate that certain fields and methods should be treated as private. By prefixing field and method names with an underscore const obj = struct {
const Self = @This();
_length: usize, // private field
fn length(self: *Self) usize {
return self._calc();
}
fn _calc(self: *Self) usize { // private method
return self._length;
}
};
In this example, both By adopting this naming convention, we can achieve a level of encapsulation and communicate the intended usage of fields and methods, even without official language support for private elements. |
Introduction
Currently, function and variable declarations are private by default and are made externally visible with the
pub
modifier. However, data fields are always public and there is no way to restrict their visibility. Apparently, field access control was never really working in Zig (#569) and was officially removed in #2059. Which is somewhat surprising, since no other modern language I can think of (Java, C++, C#, D, Rust, Go, ...) fails to provide this feature.The lack of private data makes it impossible to do proper encapsulation / implementation hiding, which is widely considered to be a basic software engineering technique. In particular:
.capacity
of anArrayList
is overwritten.Possible objections
No official reason was given in #2059, but two possible objections to private fields come to mind:
Concerning 1, I'd say that protection does not have to be perfect to be useful. Preventing accidental and semi-deliberate messing is valuable by itself, not to mention the documentation value.
2 may be a real concern, or maybe not. The same argument can be made about private decls, but a proposal to override the visibility of methods at the call site (#8779) was recently met with a resounding rejection -- even though there are certainly cases where it is both reasonable and safe to call private methods. In addition, Zig is increasingly making legal but potentially buggy patterns into hard errors, somtimes controversially (e.g. unused variables). The lack of basic implementation hiding is not consistent with this safety-first attitude, IMHO.
Syntax
The simplest option is to adopt the same default as with decls: file-level private by default and externally visible with
pub
. If this clashes with data-oriented style, the opposite default can be chosen, but that wold require the introduction of aprivate
keyword. Yet another possibility is to make field visibility struct-level, e.g. withopaque struct {...}
.I don't really have a strong opinion here.
Open questions
Update 1: As a side benefit, it may be possible to remove the
opaque {}
type from the language, since it is rarely used and can be simulated withstruct { private ptr: usize }
.Update 2: I wouldn't mind adding an escape hatch like
@privateField(object, "name")
for cases where you need to use a particular library, but find the API too locked-down. Some discussion of this is here.The text was updated successfully, but these errors were encountered: