Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commenting changes should not rebuild #193

Open
bdbaddog opened this issue Jan 2, 2018 · 0 comments
Open

Commenting changes should not rebuild #193

bdbaddog opened this issue Jan 2, 2018 · 0 comments
Milestone

Comments

@bdbaddog
Copy link
Contributor

bdbaddog commented Jan 2, 2018

This issue was originally created at: 2002-04-07 13:05:02.
This issue was reported by: nyber.
nyber said at 2002-04-07 13:05:02

I would like to request having an option of not taking source code comments into account when computing the md5 checksum.

I have a cvs log comment block in my source files that gets added to every time I do a cvs checkin. So after a checkin scons will rebuild all of those files because the md5 checksum changes. But all that is modified is a comment block at the top of the file.

---The following is Anthony Roach's comments on how this could be done.

That would be a pretty easy change. We'd just have to add some sort of filtering function to either Node.get_contents(), Sig.Calculator.csig(), or Sig.MD5.signature(). This filtering function would be given the contents of the file, and it would return the contents with all the C comments stripped out. Node.get_contents() would probably be the best place to do this, because we'd want the filter function to be somehow configurable based on the file extension or something. In a SConstruct file, maybe you'd do this:

def comment_filter(contents):
    # filter out comments

SignatureFilter(comment_filter, ".c .h .cpp .hpp")

omynous said at 2005-01-31 22:42:02

Duplicates 649436

issues@scons said at 2005-01-31 22:42:02

Converted from SourceForge tracker item 540682

gregnoel said at 2008-05-26 00:11:57

Assigning to MatiGruca.

Welcome aboard. Here's the issue that corresponds to your GSoC project.

matigruca said at 2008-05-31 14:40:56

Re-post from scons-dev (so that it won't get lost) + a little update at the bottom:

I've been thinking how to implement the comments stripper (http://www.scons.org/wiki/GSoC2008/MatiGruca), and that's what I've come up with.

  1. Add new module (Comments.py? Stripper.py?) containing (a) all the stripping functions and (b) a control function to decide which function to use in particular case.

  2. Modify changed_content() method (from 'File' class from Node/FS.py) to run the control function from - let's say - Comments.py module to check if we can strip part of the file or run the good old get_csig() to check the MD5 signature for the whole file otherwise.

The changed_content() method could look like this:

   def changed_content(self, target, prev_ni):
       ret = SCons.Comments.Stripper(self, target, prev_ni)
       if ret == SIGS_DIFFER: # different signatures
           return 1
       elif ret == SIGS_EQUAL: # equal signatures
           return 0
 
       # stripper didn't know what to do, so go the usual way...
       cur_csig = self.get_csig()
 
       try:
           return cur_csig != prev_ni.csig
       except AttributeError:
           return 1

I have hacked it in this way (only a provisory C-comments/code stripping functions) and it works fine, but there is one thing I haven't thought about before.

I have to store (max) three checksums for every file in the .sconsign.dblite:

  • checksum for the whole file
  • checksum for the code
  • checksum for the comments
    I have added two additional elements to the field_list (let's call them 'psig' and 'osig' as for now) list in FileNodeInfo class (Node/FS.py) and it works fine, but, I guess, I should also add get_psig() and get_osig() functions to Node/__init__.py (just like get_csig() in 'Node' class in Node/__init__.py).

Any comments?

I've already created C-code/comments stripping functions + tests (but I'm still working to improve them). Now I will go for D's nested comments.

Cheers,
Mateusz

gregnoel said at 2008-05-31 16:48:41

I have no record of having received a message with this content; it must have been dropped somewhere.

Add new module (Comments.py? Stripper.py?) containing (a) all the stripping functions and (b) a control function to decide which function to use in particular case.

A place to put all the stripping functions is a good idea, but it's the Builder that knows both the source and the target, so it's the Builder that should say which stripping function to use. If it doesn't nominate one, the default would be get_contents() (i.e., the whole file). This is much simpler than a control function that has to know and understand all possible sources and targets (and probably faster, to boot).

There's no need to change anything in the signature calculations other than to select the stripping function in the correct places. (In fact, it would be bad design to do otherwise: the signature subsystem is already complex enough, so let it figure out what it wants to keep.) Check with Steven about where to call the stripping function and whether the cache index needs to be tweaked.

Have you created a branch to contain this work yet? Oh, and you should change the status of this issue to STARTED.

matigruca said at 2008-06-02 11:38:34

I have no record of having received a message with this content; it must have been dropped somewhere.

This is really strange. Right after sending this mail to scons-dev I've checked scons.org mail archives and my mail was there, but now it is gone! Now I know, why there was no response :)

There's no need to change anything in the signature calculations other than to select the stripping function in the correct places.

Do you mean adding new database fields (psig and osig in my previous post) for new MD5 checksums?

If yes, I am in trouble, since I can't see a way to compare the checksums for code/comments if all I got is MD5 sum for whole file. I agree that signature subsystem is too complex and I know, that adding new checksum fields to the database is a pain for current SCons's users, but what else can I do?

Have you created a branch to contain this work yet?

No, can I do this by myself? I thought that I need some special write priviledges.

Thanks for comments!
Mateusz

gregnoel said at 2008-06-02 15:08:48

You don't need to add any fields to the signature subsystem; you don't have to do much beyond providing the portion of the file to be checksummed. You don't need to store the MD5 sum of the whole file; you don't need to store the MD5 sum of the excluded portions. All you need to do is provide the piece that should be checked for changes and let the signature subsystem take care of the rest.

Nag Steven to do a brain dump of the signature subsystem into the Developers' Guide in the wiki (at least the internal API) and then bug him about anything you don't understand. Record all of the questions and answers in the wiki and one of us will clean it up later. Although I can't seem to find it now, he did part of this when he was describing the proposed Decider() functionality on the mailing list, so you can start there. In essence, the idea is that if you ask for a bit of signature, the subsystem will notice what's being used and remember it for future use. You don't have to know or care how it's done.

As for SVN permissions, as far as I know, you should be good to go. Drop us a line if it doesn't work.

matigruca said at 2008-07-07 10:04:48

Hi,

nagging Steven failed (please, don't treat it as objection, I understand that he is busy and moreover it made me to sit more, learn more and think more - which is good), so I am coming up with my new idea about where to place the comments stripping framework.

I added new attribute to BuilderBase class called 'stripper' along with two methods: add_stripper() and get_stripper().

Now, for every initialized builder I can add adequate stripper in tool-specific initialization files.

For example in Tool/cc.py:

 71     from SCons.Comments import CComments
 72
 73     for suffix in CSuffixes:
 74         static_obj.add_action(suffix, SCons.Defaults.CAction)
 75         shared_obj.add_action(suffix, SCons.Defaults.ShCAction)
 76         static_obj.add_emitter(suffix, SCons.Defaults.StaticObjectEmitter)
 77         shared_obj.add_emitter(suffix, SCons.Defaults.SharedObjectEmitter)
 78         static_obj.add_stripper(suffix, CComments) ### NEW!
 79         shared_obj.add_stripper(suffix, CComments) ### NEW!

(should CComments function be called StripCComments?)

or in Tool/FortranCommon.py:

126     from SCons.Comments import FortranComments
127     for suffix in suffixes:
128         static_obj.add_action(suffix, compaction)
129         shared_obj.add_action(suffix, shcompaction)
130         static_obj.add_emitter(suffix, FortranEmitter)
131         shared_obj.add_emitter(suffix, ShFortranEmitter)
132         static_obj.add_stripper(suffix, FortranComments) ### NEW!
133         shared_obj.add_stripper(suffix, FortranComments) ### NEW!
134
135
136     for suffix in ppsuffixes:
137         static_obj.add_action(suffix, compppaction)
138         shared_obj.add_action(suffix, shcompppaction)
139         static_obj.add_emitter(suffix, FortranEmitter)
140         shared_obj.add_emitter(suffix, ShFortranEmitter)
141         static_obj.add_stripper(suffix, FortranComments) ### NEW!
142         shared_obj.add_stripper(suffix, FortranComments) ### NEW!

(FortranComments or StripFortranComments?)

OK, so now I got the right strippers connected to the right suffixes.

I think that changed_content() method in Node/FS.py is the right place
to check for the checksum, so now I do it like that (simplified version):

2697     def changed_content(self, target, prev_ni):
2698         try: ### NEW!
2699             stripper = target.builder.get_stripper(self.suffix)
2700             code = stripper(str(self))
2701             cur_csig = SCons.Util.MD5signature(code)
2702             ninfo = self.get_ninfo()
2703             ninfo.csig = cur_csig
2704         except: ### OLD
2705             cur_csig = self.get_csig()
2706
2707         try:
2708             return cur_csig != prev_ni.csig
2709         except AttributeError:
2710             return 1

That's all. I know that it is a lot better than previous propositions and I hope that it'll good enough to be accepted :)

Greg, thanks for comments, when I have read them for the first time, I didn't understand them as good as I do now.

Looking back at my work I think that I should've started with Python Binary Builder[1]. Working on PBB gave me some knowledge I needed to propose the above. Greg, how do you think - should I reschedule my project on my wiki page, or should I just mention it in my mid-term evaluation
report?

[1] - http://www.scons.org/wiki/InstallPython

Regards,
Mateusz Gruca

gregnoel said at 2008-07-07 11:09:49

(Gary, you should add your comments to the issue, not reply via email. Steven, consider yourself heartily nagged.)

Yes, either the Builder or the Executor is the right place to attach the stripper function; attaching to the Executor is a bit more complicated, but would have the advantage that it could be overridden on a case-by-case basis if desired.

As for the names of the stripper functions, note that some are more widely usable than others. A stripper for #-comments could not only be used for the shell and Python but also dozens of languages; its name should reflect that.

matigruca said at 2008-07-08 14:29:51

Hi Greg,

Yes, either the Builder or the Executor is the right place to attach the stripper function; attaching to the Executor is a bit more complicated, but would have the advantage that it could be overridden on a case-by-case basis if desired.

OK, I will investigate the Executor variant.

As for the names of the stripper functions, note that some are more widely usable than others. A stripper for #-comments could not only be used for the shell and Python but also dozens of languages; its name should reflect that.

I agree. What about StripHashComments and StripHashCode?

Regards,
Mati Gruca

gregnoel said at 2008-08-05 06:24:51

GSoC issues should be assigned to the 'anytime' milestone, since they are outside the normal release cycle.

@bdbaddog bdbaddog added this to the anytime milestone Jan 2, 2018
@mwichmann mwichmann removed the P3 label Mar 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants