Reorganize base/string.jl, base/utf*, test/strings.jl, test/unicode.jl #11925
The fact that things are in subfolders now is a little bit nicer than #11908, but I still don't think anyone is going to like the number of small files here. Base generally isn't organized as a single type per file. How about combining the various special string types together, into
Given how git works, I really don't see how this is the case. Yes, you get silly little conflicts at the end of a file sometimes; those are the easy ones and really not an issue. Rebasing PRs uses up a bit of CI time, but it also tests things in a closer merge state than when they were originally opened. It seems to me that adding more files would make it more likely that someone modifies the ends of multiple files, causing this to happen more often, not less.
This is better than the previous version. Unrelated to this PR, but honestly I'd rather remove RepString, RevString, and RopeString. I don't think we need utf/types.jl. Since we have utf16.jl and utf32.jl, we might as well put the corresponding type definitions in those files. However, I wasn't paying attention when that file was introduced, so I might be missing something. It also looks like escape.jl and strio.jl could be combined.
@tkelman About #11908, I was in the process of moving things to folders when it was closed on me (I'd already moved the test files, see last commit in #11908). That hasn't been my experience with git. @JeffBezanson "better than the previous version": this is exactly what I was just about to push to the "previous version" when it got closed, and since I'm still fairly new with git and GitHub, recreating it in a new branch ended up wasting a lot of my time and getting me frustrated. Frustration has passed, just want to move on now. You're correct about
@JeffBezanson OT about Rep/Rev/Rope, I'd already been bugging @tkelman about getting rid of Rope (it's not used anywhere, not in Julia, not in any registered package; the same is true of Rep).
This is because many of your proposed changes have been too big and contentious. The likelihood of needing to rebase to fix a conflict in an open PR is proportional to how many lines of code that PR touches (and it's an increasing function of the amount of time the PR is open), and has almost nothing to do with the number of files. This reorganization will not help reduce the number of conflicts in any way. |
@tkelman I've had that problem with rebasing even with small changes; it hasn't just been the children of #11004. The problem usually has been just with the way
So what's the general nature of the conflicts you've needed to resolve then? I'm not opposed to reorganization, but we're going to need to find somewhere between master and a dozen small files. I know this sounds arbitrary, but can we find a way to do this while only adding around 5-6 new files to base? |
To clarify, by 5-6 new files I mean net additional files under base. Reorganizing the tests so the structure matches that in base is also good; I'm mostly focusing on converging to an agreeable structure of base here, and then the test changes can fall out of that.
Scott, there are 55 other collaborators for JuliaLang that more or less have the ability to review and accept pull requests. When a PR is submitted that receives immediate feedback that it's not the right direction, and your response is a lengthy paragraph continuing to argue, it becomes a waste of everyone's time (particularly when dealing with more stylistic decisions where it's hard to be "objectively better"). You made no indication you were considering the suggested feedback, hence why the PR was closed. Note that you can continue to work on a branch, even if a PR is closed and it can be reopened later. |
Just stupid stuff that Perforce would have handled automatically; I'm not sure why git doesn't.
@quinnj One paragraph explaining why I'd structured things as I did, after only 4 comments from contributors: 1 was just a comment about MATLAB, without any suggestions as to what to do, and 1 was positive about the direction (just saying that the number of files was too high, which I was addressing when you closed it). Giving somebody busy at least a day or two to respond would have been courteous.
@ScottPJones much of the continuing friction is due to your responses to feedback. A simple "I have another change coming where I'm going to modify A,B,C, hold on" would have sufficed as opposed to #11908 (comment). Given the feedback you had received up to that point, I don't think you were going to convince anyone to like the level of granularity you had gone to. |
@tkelman I do think I've been very responsive to any constructive suggestions. I push back to make sure that the suggestion really is OK, esp. if it goes directly against my own experience. |
Also, up until my last commit, the PR was clearly marked as WIP, which means I was still responding to the review suggestions. (That last commit lost half of the change, the part where I moved the base files into the strings folder and merged a few, because of a mistake on my part: I'm new to git, and I needed to rebase to pick up fixes in master that had prevented me from running my tests.)
@tkelman In the comment you mentioned, I tried to explain my reasoning behind the way I was organizing things. I even implied that I was going to change things further (but people had complained about doing more than one thing in a PR, so I said that it wasn't in that particular PR). Is there anything else that needs to be done to this? (@ninjin is also working on coverage for different parts of strings, so it would be good to get this in sooner rather than later, to avoid extra work from either of us) |
Yes, this needs to not add a dozen new small files to base. 5 or 6 would probably be okay. |
+1
What change are you looking at? This adds just 6 files in the strings folder: basic, search, parse, strutil, strio, and cstring. Except for cstring, they are all fairly large, hundreds of lines. cstring could probably go into basic, even though logically it didn't seem to belong there.
I think that if you combine the RopeString, RevString and RepString stuff into a single file named stringtypes.jl or something like that (on both the implementation and testing sides), then this would be ok. The logical separation is looking pretty reasonable at this point. |
What do you think about cstring (which is the only small one in the strings folder)? |
Merging RopeString and RepString on the testing side is easy: there is no testing currently!
I'd also prefer |
If we get rid of one of those types, we'll probably get rid of all of them at the same time.
+1
OK, will rename from utf to unicode as well. |
Is there any way to inform git that I'm merging a file into another? I think part of the difficulty of reviewing this refactoring is that it doesn't seem to handle merging or splitting nicely, but maybe I'm not using git correctly. Thanks! |
No, the git model is that history is a series of tree snapshots – any notion of code motion is purely heuristic and only used for presentation or efficient storage. |
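To make the "purely heuristic" point concrete, here is a minimal Python sketch of content-based rename guessing between two snapshots. It is an illustration of the idea only, not git's actual algorithm: git computes its similarity index differently, and the function name `guess_renames`, the toy file contents, and the paths under `base/strings/` are all assumptions made up for this example. The 0.5 threshold loosely mirrors git's default 50% rename similarity.

```python
from difflib import SequenceMatcher


def guess_renames(old_snapshot, new_snapshot, threshold=0.5):
    """Pair each deleted path with the most similar added path, if any.

    Both snapshots are dicts mapping path -> file contents. Nothing in
    them records that a file moved; the pairing is inferred afterwards.
    """
    deleted = {p: c for p, c in old_snapshot.items() if p not in new_snapshot}
    added = {p: c for p, c in new_snapshot.items() if p not in old_snapshot}
    renames = {}
    for old_path, old_text in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames[old_path] = best_path
    return renames


# Toy snapshots (hypothetical contents and paths).
old = {"base/string.jl": "search code\nparse code\nio code\n"}

# Case 1: the file is moved unchanged into a subfolder.
moved = {"base/strings/string.jl": old["base/string.jl"]}

# Case 2: the file is split into several smaller files.
split = {
    "base/strings/search.jl": "search code\n",
    "base/strings/parse.jl": "parse code\n",
    "base/strings/io.jl": "io code\n",
}

moved_guess = guess_renames(old, moved)
split_guess = guess_renames(old, split)
```

In the first case the unchanged file is paired with its new path. In the second case the old file can be paired with at most one of its pieces, so a one-to-many split has no representation in this kind of file-level pairing, which is consistent with the difficulty described above.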
Ugh... I do miss Perforce sometimes!
+1 for removing rope and rep
Really, just kill them? 😀 |
OK, I have things all set to push with RopeString added back (now in stringtypes.jl (base & test)), but I'm waiting to see if the Appveyor and Travis-CI runs that are already in progress finish. |
OK, all tests passed, so I'm pushing the version with RopeString added back (which passed locally [there is only a single line of testing for RopeString]). There are no substantive changes in this PR; it's just reorganization to make things more consistent and make life easier for those of us trying to increase testing coverage for strings.
I agree that the |
It no longer removes anything; this is now strictly a cleaner restructuring of the files to help us split up the string coverage work.
A few comments (this is shaping up nicely):
+1 to everything @quinnj said |
Also will need a squash. |
Into a single commit? I'd been told to wait to do that until a PR was ready to be merged. If you think it's ready now, I'll squash however you want.
It appears from the discussion that this is getting closer, so perhaps it would be a good time to do so. |
OK, will do. |
OK, all tests passed, and I think I've done everything people have asked for. Thanks for reviewing! |
LGTM. |
Looks like this needs a rebase, then LGTM |
The monolithic string.jl has been split up into several files, and the test files in strings.jl and unicode.jl have been made to correspond with the files of the same names in base. This will prevent a lot of manual merging that was previously necessary. Merge Sub/Rev/Rep/RopeStrings into strings/types.jl, for base and test
All rebased, squashed to a single commit, passing all tests, 🎉 Thanks again to all reviewers, it's greatly appreciated! |
pointer{T<:ByteString}(x::SubString{T}, i::Integer) = pointer(x.string.data) + x.offset + (i-1)
pointer(x::Union{UTF16String,UTF32String}, i::Integer) = pointer(x)+(i-1)*sizeof(eltype(x.data))
pointer{T<:Union{UTF16String,UTF32String}}(x::SubString{T}) = pointer(x.string.data) + x.offset*sizeof(eltype(x.data))
pointer{T<:Union{UTF16String,UTF32String}}(x::SubString{T}, i::Integer) = pointer(x.string.data) + (x.offset + (i-1))*sizeof(eltype(x.data))
I think the pointer methods should be in base/pointer.jl. It seems a little obscure to put some of these methods in utf32.jl. Doesn't need to hold up the PR here, but a change we should probably make.
The merge commit here froze during bootstrap: https://ci.appveyor.com/project/StefanKarpinski/julia/build/1.0.6734/job/41e1ujwp78e22yaw. The PR build passed at e3b1828, and there are not that many commits between the two.
On my latest Coveralls build the string coverage is looking really great. Awesome work, @ScottPJones. |
This PR clobbered #11898, possibly other PRs that touched the string files. |
It would've been ideal if there were a script or some other mapping to actually look at the diff piece by piece and really verify that this was strictly organizational at the time of merging.
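One possible shape for such a check, as a rough Python sketch: treat the tree before and after the change as bags of lines and report anything that was lost or gained, regardless of which file it ended up in. This is an assumption about what a check could look like, not something that was run on this PR. The names `line_counts` and `reorganization_diff` and the toy file contents are hypothetical, and the approach ignores line order and blank lines, so it can confirm that code only moved but cannot show that it was reassembled in a sensible order.

```python
from collections import Counter


def line_counts(files):
    """Count every non-blank, whitespace-stripped line across a snapshot.

    `files` is a dict mapping path -> file contents.
    """
    counts = Counter()
    for text in files.values():
        for line in text.splitlines():
            stripped = line.strip()
            if stripped:
                counts[stripped] += 1
    return counts


def reorganization_diff(old_files, new_files):
    """Return (lines lost, lines gained), ignoring order and file location."""
    old_counts = line_counts(old_files)
    new_counts = line_counts(new_files)
    # Counter subtraction keeps only the positive differences.
    return old_counts - new_counts, new_counts - old_counts


# Toy example (hypothetical contents): one file split into two,
# plus a new top-level file that includes them.
before = {"string.jl": "f(x) = 1\n\ng(x) = 2\n"}
after = {
    "strings/basic.jl": "f(x) = 1\n",
    "strings/search.jl": "g(x) = 2\n",
    "strings.jl": 'include("strings/basic.jl")\ninclude("strings/search.jl")\n',
}

lost, gained = reorganization_diff(before, after)
```

For a purely organizational change, `lost` should be empty and `gained` should contain only the new `include` lines; anything else is a line that was edited, dropped, or duplicated along the way and deserves a closer look.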
A Google search leads to this and similar ones, which suggest that Edit: although I just gave it a try, and it seems to work only at the file level (so splitting files won't work).
The monolithic string.jl has been split up into several files,
and the test files in strings.jl and unicode.jl have been made
to correspond with the files of the same names in base.
This will prevent a lot of manual merging that was previously necessary.
This is a preface to trying to get the string coverage to 100%.
The only changes are splitting the files up, placing them into folders, and adding a top file that
includes the files in the folder.