Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP8 Fixes for Summarization. #1017

Closed
wants to merge 10 commits into from
Closed

PEP8 Fixes for Summarization. #1017

wants to merge 10 commits into from

Conversation

souravsingh
Copy link
Contributor

No description provided.

@@ -18,7 +18,8 @@ class BM25(object):

def __init__(self, corpus):
self.corpus_size = len(corpus)
self.avgdl = sum(map(lambda x: float(len(x)), corpus)) / self.corpus_size
self.avgdl = sum(map(lambda x: float(len(x)), corpus)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the reason for this change?

# Arguments on first line forbidden when not using vertical alignment.
foo = long_function_name(var_one, var_two,
    var_three, var_four)```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8 error E501:Line longer than 79 characters.

Does this line need modification?

Copy link
Owner

@piskvorky piskvorky Nov 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We ignore the 79 chars guideline, it is silly.

Put line breaks in places that are conceptually and visually meaningful. Try not to exceed 120 characters per line (unless absolutely necessary).

@piskvorky
Copy link
Owner

piskvorky commented Nov 15, 2016

This PR has the same issues as the previous ones: makes code harder to read and maintain (unwanted line breaks, vertical indent).

Copy link
Contributor

@tmylk tmylk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove length of line changes

@@ -16,5 +16,6 @@ def build_graph(sequence):

def remove_unreachable_nodes(graph):
for node in graph.nodes():
if sum(graph.edge_weight((node, other)) for other in graph.neighbors(node)) == 0:
if sum(graph.edge_weight((node, other))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't check for lenght of line

@@ -161,7 +161,8 @@ def _get_combined_keywords(_keywords, split_text):
if word in _keywords:
combined_word = [word]
if i + 1 == len_text:
result.append(word) # appends last word if keyword and doesn't iterate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@tmylk
Copy link
Contributor

tmylk commented Nov 24, 2016

do you know how to disable the 'line length' rule in a PEP8 check? Many lines are still unnecessarily corrected.

@souravsingh
Copy link
Contributor Author

There is a command line argument called -max-line-length which can set the maximum line length for scripts(like 120 chars or so, default is 79).

@piskvorky
Copy link
Owner

piskvorky commented Nov 28, 2016

There are syntax errors in this PR (\ probability_matrix, indent of diff's last line...), as well as unwanted changes (vertical indent). How was this tested?

I am really uneasy about all these "PEP8 fix" pull requests, it seems we're going in circles.


def get_score(self, document, index, average_idf):
score = 0
for word in document:
if word not in self.f[index]:
continue
idf = self.idf[word] if self.idf[word] >= 0 else EPSILON * average_idf
score += (idf*self.f[index][word]*(PARAM_K1+1)
/ (self.f[index][word] + PARAM_K1*(1 - PARAM_B+PARAM_B*self.corpus_size / self.avgdl)))
score += (idf * self.f[index][word] * (PARAM_K1 + 1) /
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No vertical indent.
Splitting the mega-expression into something saner (subexpressions) will help both readability and line length.

@@ -110,12 +110,17 @@ def _get_sentences_with_word_count(sentences, word_count):
return selected_sentences


def _extract_important_sentences(sentences, corpus, important_docs, word_count):
def _extract_important_sentences(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not.

important_sentences = _get_important_sentences(sentences, corpus, important_docs)

# If no "word_count" option is provided, the number of sentences is
# reduced by the provided ratio. Else, the ratio is ignored.
return important_sentences if word_count is None else _get_sentences_with_word_count(important_sentences, word_count)
return important_sentences if word_count is None else _get_sentences_with_word_count(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

souravsingh and others added 4 commits November 28, 2016 18:37
I have made some changes to the expression for calculation of score to match PEP8 specifications.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants