-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to remove "comments" from configs #134 #137
Conversation
Hi @tandreadi thank you for your contribution! |
Hey @sfraint! Is there any update? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r1.
Reviewable status: 1 of 8 files reviewed, 6 unresolved discussions (waiting on @sfraint and @tandreadi)
a discussion (no related file):
Good start! I've left some comments/questions in-line, below.
.travis.yml, line 6 at r1 (raw file):
- 3.7
Changes to CI testing behavior is unrelated to this feature and should not be included in this PR (but happy to take this as a separate PR 🙂 )
netconan/netconan.py, line 73 at r1 (raw file):
parser.add_argument('-k', '--keyword-remover', default=None, help='List of comma seperated keywords to remove.')
I'd suggest calling this something other than keyword
/ keyword-remover
. keyword-remover
is a little vague/misleading and with the description it sounds like it will remove just the word, not the entire line.
Maybe call it remove-lines
or something like that and update the description accordingly (e.g. List of comma separated words which should trigger removing a line entirely.
).
netconan/sensitive_item_removal.py, line 204 at r1 (raw file):
Quoted 9 lines of code…
def _generate_conflicting_reserved_word_list(self, keywords): """Return a set of keywords that may conflict with the specified default reserved words.""" conflicting_words = set() for keyword in keywords: conflicting_words.update(set([w for w in self.reserved_words if keyword in w])) if conflicting_words: logging.warning('Specified keywords overlap with reserved words. ' 'The following reserved words will be preserved: %s', conflicting_words) return conflicting_words
This function looks similar to _generate_conflicting_reserved_word_list
in another anonymizer; it would be good to just create a private helper function available to all anonymizers.
netconan/sensitive_item_removal.py, line 216 at r1 (raw file):
for w in words: if w in self.reserved_words: return line
I think this will ultimately need to be removed.
Right now, the default set of reserved
words is made up of keywords supported in various network config syntaxes (e.g. contains set
, interface
, description
, ip
, acl
, neighbor
...)
This means a user will not be able to remove lines like those posed in the original issue, e.g.
set interfaces <interface-name> description "My sensitive text here"
since this line contains multiple reserved
keywords
netconan/sensitive_item_removal.py, line 221 at r1 (raw file):
self.remove = True return 'remove line'
You can avoid worrying about setting a remove
boolean since you are already returning the processed line.
I would suggest just returning None
or an empty string (""
) in this case, similar to the other anonymizers.
tests/unit/test_sensitive_item_removal.py, line 309 at r1 (raw file):
def test_remove_line():
Make sure to test line removal with some real config syntax lines that we might expect users to want to remove.
e.g. Juniper interface description
set interfaces interface-name description sensitive
and Cisco IOS BGP neighbor description
neighbor 192.168.3.2 description sensitive
and ACL remarks
-Update source code in netconan.py -Update testing code in test_parse_args.py Add LineRemover class -Add __init__() method -Add _generate_conflicting_reserved_word_list() method Add remove_line() and _generate_keyword_regex() method Update LineRemover class -Add boolean instance variable remove -Add get_remove() method -Correct expression in _generate_conflicting_reserved_word_list method -Replace return value None with 'remove line' in remove_line() method Add remove line option -Add if-statement to anonymize_file() method -Create line_remover object in anonymize_files() method Cast to string input_file and output_file variables: Cast to string output_file in assert expression Add keyword-remover optional argument to README Test the _generate_keyword_regex() method
0d4273e
to
445b6fd
Compare
netconan/sensitive_item_removal.py, line 216 at r1 (raw file): Previously, sfraint (Spencer Fraint) wrote…
I was also concerned about this and I had in mind that the current implementation might not meet the issue's requirements. So, if the user types the option "remove-lines", each line that consists of the given words will be removed, whether or not, they belong to the reserved words. Did I understand it correctly? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 8 files reviewed, 6 unresolved discussions (waiting on @tandreadi)
netconan/sensitive_item_removal.py, line 216 at r1 (raw file):
Previously, tandreadi (Andreadi Theodora) wrote…
I was also concerned about this and I had in mind that the current implementation might not meet the issue's requirements. So, if the user types the option "remove-lines", each line that consists of the given words will be removed, whether or not, they belong to the reserved words. Did I understand it correctly?
Correct
netconan/sensitive_item_removal.py, line 204 at r1 (raw file): Previously, sfraint (Spencer Fraint) wrote…
After the discussion about the meaning of the reserved words in this issue, I was thinking that this method in LineRemover class might be unnecessary. However, there is the case where the user selects both options Do you think that we have to consider this case and modify properly this method? Meaning that each line that consists of the given reserved words won't be removed. |
netconan/sensitive_item_removal.py, line 216 at r1 (raw file): Previously, sfraint (Spencer Fraint) wrote…
Also, I was thinking that removing this block might arise a problem in case the user gives a list of reserved words and a list of keywords. A line might contain words from both of the lists. Assuming that in this case, we would like to prevent the removal of this line, the meaning of the instance variable |
Co-authored-by: Elviraant <t8170007@dias.aueb.gr>
22c859e
to
a9b75fa
Compare
Add examples to tests Upadate end_to_end test Co-authored-by: Elviraant <t8170007@dias.aueb.gr>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 6 unresolved discussions (waiting on @sfraint and @tandreadi)
netconan/sensitive_item_removal.py, line 204 at r1 (raw file):
Previously, tandreadi (Andreadi Theodora) wrote…
After the discussion about the meaning of the reserved words in this issue, I was thinking that this method in LineRemover class might be unnecessary.
However, there is the case where the user selects both options
--reserved-words
and--remove-lines
and the given lists might have words in common.Do you think that we have to consider this case and modify properly this method? Meaning that each line that consists of the given reserved words won't be removed.
I think in that case, it is fine to remove the line. A short note can be added to the remove-lines
description, to make it clear that lines are removed even if they contain reserved words.
netconan/sensitive_item_removal.py, line 216 at r1 (raw file):
Previously, tandreadi (Andreadi Theodora) wrote…
Also, I was thinking that removing this block might arise a problem in case the user gives a list of reserved words and a list of keywords. A line might contain words from both of the lists.
Assuming that in this case, we would like to prevent the removal of this line, the meaning of the instance variable
reserved_words
could be re-defined and refer only to the given reserved words. What's your opinion on this assumption?
Is this answered by my response above? or is this question asking something else?
Please reopen if you wish to continue this PR. |
Hello @dhalperi, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 9 files at r2, 1 of 2 files at r3.
Reviewable status: 3 of 8 files reviewed, 9 unresolved discussions (waiting on @sfraint and @tandreadi)
README.rst, line 135 at r3 (raw file):
Quoted 4 lines of code…
--remove-lines REMOVE_LINES List of comma separated words which should trigger removing a line entirely. The line will be removed even if it contains a reserved word
I think this text does not line up with the help text generated by netconan
.
The text here, in the readme should be identical to the output of netconan --help
(you will need to update the parser.add_argument('--remove-lines', ...
line in netconan.py
)
netconan/anonymize_files.py, line 37 at r3 (raw file):
keywords=None
nit: I would rename this to something more informative, e.g. remove_line_keywords=None
netconan/anonymize_files.py, line 66 at r3 (raw file):
logging.warning('The line will be removed even if it contains a reserved word')
I would say, a warning is not necessary here, since it will occur any time the user specifies the remove-lines
param.
netconan/netconan.py, line 73 at r1 (raw file):
Previously, sfraint (Spencer Fraint) wrote…
parser.add_argument('-k', '--keyword-remover', default=None, help='List of comma seperated keywords to remove.')
I'd suggest calling this something other than
keyword
/keyword-remover
.keyword-remover
is a little vague/misleading and with the description it sounds like it will remove just the word, not the entire line.Maybe call it
remove-lines
or something like that and update the description accordingly (e.g.List of comma separated words which should trigger removing a line entirely.
).
Looks better, thanks!
netconan/netconan.py, line 131 at r3 (raw file):
keywords = None
nit: I'd rename this to something more descriptive - either remove_lines
or remove_lines_keywords
netconan/sensitive_item_removal.py, line 216 at r1 (raw file):
Previously, sfraint (Spencer Fraint) wrote…
Is this answered by my response above? or is this question asking something else?
@tandreadi, as discussed above, I think it is okay to remove lines that contain reserved words.
did you still have a question here?
netconan/sensitive_item_removal.py, line 197 at r3 (raw file):
Remove the input line.
Please update the docstring to be a little more informative, e.g.
"""Remove the input line if it contains any of the specified keywords.
Returns either the original line or an empty string."""
tests/unit/test_anonymize_files.py, line 193 at r3 (raw file):
Quoted 4 lines of code…
def test_anonymize_file_with_line_remover(tmpdir): ... def test_anonymize_file_without_line_remover(tmpdir): ...
Are these two tests redundant with the tests above? or are they effectively testing something different than the tests above?
tests/unit/test_sensitive_item_removal.py, line 293 at r3 (raw file):
Quoted 14 lines of code…
def test_remove_line(): """Test if the line is removed.""" line = "set interfaces interface-name description sensitive" keywords = ['interface-name'] remover = LineRemover(keywords) result = remover.remove_line(line) assert(result == '') def test_remove_line_with_no_keywords(): """Test if the line is not removed.""" line = "ip address 192.168.2.1 255.255.255.255" keywords = ['neighbor'] remover = LineRemover(keywords) result = remover.remove_line(line) assert(result == line)
these two tests can be combined into one parameterized test, with @pytest.mark.parametrize
like above
also, there should be a few more conditions tested, like:
- what if keyword is contained in a word in the line? e.g. keyword is
add
, line issomething address foobar
- what if the keyword is at the beginning of the line? e.g. keyword is
add
, line isadd something
- what if the keyword is at the end of the line? e.g. keyword is
add
, line issomething add
Hey folks! Looks like this has gone idle again. If you would like to resume, please push a commit responding to the review, and then comment here. Thanks! |
Implementing this by using a list of keywords given by the user, separated by commas, and removing any line that contains any of these words. The removal is preserved if the line contains any of the words defined in default_reserved_words.py file.
Here is an example:
Running the following command:
netconan -i cisco.cfg -o result.cfg -k neighbor,keyword
this warning will appear:
The cisco.cfg file is:
And the result.cfg file will be:
Could you give me some feedback about my work so far? @dhalperi
This change is