You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This issue was originally created at: 2002-04-07 13:05:02.
This issue was reported by: nyber.
nyber said at 2002-04-07 13:05:02
I would like to request having an option of not taking source code comments into account when computing the md5 checksum.
I have a cvs log comment block in my source files that gets added to every time I do a cvs checkin. So after a checkin scons will rebuild all of those files because the md5 checksum changes. But all that is modified is a comment block at the top of the file.
---The following is Anthony Roach's comments on how this could be done.
That would be a pretty easy change. We'd just have to add some sort of filtering function to either Node.get_contents(), Sig.Calculator.csig(), or Sig.MD5.signature(). This filtering function would be given the contents of the file, and it would return the contents with all the C comments stripped out. Node.get_contents() would probably be the best place to do this, because we'd want the filter function to be somehow configurable based on the file extension or something. In a SConstruct file, maybe you'd do this:
defcomment_filter(contents):
# filter out commentsSignatureFilter(comment_filter, ".c .h .cpp .hpp")
omynous said at 2005-01-31 22:42:02
Duplicates 649436
issues@scons said at 2005-01-31 22:42:02
Converted from SourceForge tracker item 540682
gregnoel said at 2008-05-26 00:11:57
Assigning to MatiGruca.
Welcome aboard. Here's the issue that corresponds to your GSoC project.
matigruca said at 2008-05-31 14:40:56
Re-post from scons-dev (so that it won't get lost) + a little update at the bottom:
Add new module (Comments.py? Stripper.py?) containing (a) all the stripping functions and (b) a control function to decide which function to use in particular case.
Modify changed_content() method (from 'File' class from Node/FS.py) to run the control function from - let's say - Comments.py module to check if we can strip part of the file or run the good old get_csig() to check the MD5 signature for the whole file otherwise.
The changed_content() method could look like this:
defchanged_content(self, target, prev_ni):
ret=SCons.Comments.Stripper(self, target, prev_ni)
ifret==SIGS_DIFFER: # different signaturesreturn1elifret==SIGS_EQUAL: # equal signaturesreturn0# stripper didn't know what to do, so go the usual way...cur_csig=self.get_csig()
try:
returncur_csig!=prev_ni.csigexceptAttributeError:
return1
I have hacked it in this way (only a provisory C-comments/code stripping functions) and it works fine, but there is one thing I haven't thought about before.
I have to store (max) three checksums for every file in the .sconsign.dblite:
checksum for the whole file
checksum for the code
checksum for the comments
I have added two additional elements to the field_list (let's call them 'psig' and 'osig' as for now) list in FileNodeInfo class (Node/FS.py) and it works fine, but, I guess, I should also add get_psig() and get_osig() functions to Node/__init__.py (just like get_csig() in 'Node' class in Node/__init__.py).
Any comments?
I've already created C-code/comments stripping functions + tests (but I'm still working to improve them). Now I will go for D's nested comments.
Cheers,
Mateusz
gregnoel said at 2008-05-31 16:48:41
I have no record of having received a message with this content; it must have been dropped somewhere.
Add new module (Comments.py? Stripper.py?) containing (a) all the stripping functions and (b) a control function to decide which function to use in particular case.
A place to put all the stripping functions is a good idea, but it's the Builder that knows both the source and the target, so it's the Builder that should say which stripping function to use. If it doesn't nominate one, the default would be get_contents() (i.e., the whole file). This is much simpler than a control function that has to know and understand all possible sources and targets (and probably faster, to boot).
There's no need to change anything in the signature calculations other than to select the stripping function in the correct places. (In fact, it would be bad design to do otherwise: the signature subsystem is already complex enough, so let it figure out what it wants to keep.) Check with Steven about where to call the stripping function and whether the cache index needs to be tweaked.
Have you created a branch to contain this work yet? Oh, and you should change the status of this issue to STARTED.
matigruca said at 2008-06-02 11:38:34
I have no record of having received a message with this content; it must have been dropped somewhere.
This is really strange. Right after sending this mail to scons-dev I've checked scons.org mail archives and my mail was there, but now it is gone! Now I know, why there was no response :)
There's no need to change anything in the signature calculations other than to select the stripping function in the correct places.
Do you mean adding new database fields (psig and osig in my previous post) for new MD5 checksums?
If yes, I am in trouble, since I can't see a way to compare the checksums for code/comments if all I got is MD5 sum for whole file. I agree that signature subsystem is too complex and I know, that adding new checksum fields to the database is a pain for current SCons's users, but what else can I do?
Have you created a branch to contain this work yet?
No, can I do this by myself? I thought that I need some special write priviledges.
Thanks for comments!
Mateusz
gregnoel said at 2008-06-02 15:08:48
You don't need to add any fields to the signature subsystem; you don't have to do much beyond providing the portion of the file to be checksummed. You don't need to store the MD5 sum of the whole file; you don't need to store the MD5 sum of the excluded portions. All you need to do is provide the piece that should be checked for changes and let the signature subsystem take care of the rest.
Nag Steven to do a brain dump of the signature subsystem into the Developers' Guide in the wiki (at least the internal API) and then bug him about anything you don't understand. Record all of the questions and answers in the wiki and one of us will clean it up later. Although I can't seem to find it now, he did part of this when he was describing the proposed Decider() functionality on the mailing list, so you can start there. In essence, the idea is that if you ask for a bit of signature, the subsystem will notice what's being used and remember it for future use. You don't have to know or care how it's done.
As for SVN permissions, as far as I know, you should be good to go. Drop us a line if it doesn't work.
matigruca said at 2008-07-07 10:04:48
Hi,
nagging Steven failed (please, don't treat it as objection, I understand that he is busy and moreover it made me to sit more, learn more and think more - which is good), so I am coming up with my new idea about where to place the comments stripping framework.
I added new attribute to BuilderBase class called 'stripper' along with two methods: add_stripper() and get_stripper().
Now, for every initialized builder I can add adequate stripper in tool-specific initialization files.
That's all. I know that it is a lot better than previous propositions and I hope that it'll good enough to be accepted :)
Greg, thanks for comments, when I have read them for the first time, I didn't understand them as good as I do now.
Looking back at my work I think that I should've started with Python Binary Builder[1]. Working on PBB gave me some knowledge I needed to propose the above. Greg, how do you think - should I reschedule my project on my wiki page, or should I just mention it in my mid-term evaluation
report?
(Gary, you should add your comments to the issue, not reply via email. Steven, consider yourself heartily nagged.)
Yes, either the Builder or the Executor is the right place to attach the stripper function; attaching to the Executor is a bit more complicated, but would have the advantage that it could be overridden on a case-by-case basis if desired.
As for the names of the stripper functions, note that some are more widely usable than others. A stripper for #-comments could not only be used for the shell and Python but also dozens of languages; its name should reflect that.
matigruca said at 2008-07-08 14:29:51
Hi Greg,
Yes, either the Builder or the Executor is the right place to attach the stripper function; attaching to the Executor is a bit more complicated, but would have the advantage that it could be overridden on a case-by-case basis if desired.
OK, I will investigate the Executor variant.
As for the names of the stripper functions, note that some are more widely usable than others. A stripper for #-comments could not only be used for the shell and Python but also dozens of languages; its name should reflect that.
I agree. What about StripHashComments and StripHashCode?
Regards,
Mati Gruca
gregnoel said at 2008-08-05 06:24:51
GSoC issues should be assigned to the 'anytime' milestone, since they are outside the normal release cycle.
The text was updated successfully, but these errors were encountered:
This issue was originally created at: 2002-04-07 13:05:02.
This issue was reported by:
nyber
.nyber said at 2002-04-07 13:05:02
omynous said at 2005-01-31 22:42:02
issues@scons said at 2005-01-31 22:42:02
gregnoel said at 2008-05-26 00:11:57
matigruca said at 2008-05-31 14:40:56
gregnoel said at 2008-05-31 16:48:41
matigruca said at 2008-06-02 11:38:34
gregnoel said at 2008-06-02 15:08:48
matigruca said at 2008-07-07 10:04:48
gregnoel said at 2008-07-07 11:09:49
matigruca said at 2008-07-08 14:29:51
gregnoel said at 2008-08-05 06:24:51
The text was updated successfully, but these errors were encountered: