log-parser improvements #1162

angelikatyborska · 2022-07-30T08:39:27Z

Some improvements to this exercise were proposed in the initial PR but postponed, so that we could first see how the exercise performs and perhaps collect more improvement ideas.

Exercise doesn't teach that different delimiters can be used for regular expressions

Maybe add a step that involves parsing text containing several "/" characters and ask students to use another delimiter for readability. This can be checked by the analyzer because the delimiter is kept in the AST.
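A minimal sketch of what such a step could look like. The module name, function names, and the log-file path are hypothetical and not part of the exercise. Both functions use the same pattern, one with the default "/" delimiter and one with "|".

```elixir
defmodule DelimiterSketch do
  # Default "/" delimiter: every slash in the pattern has to be escaped.
  def with_slashes(path) do
    Regex.run(~r/^\/var\/log\/(\w+)\/(\w+)\.log$/, path)
  end

  # "|" delimiter: the slashes can be written as-is.
  def with_pipes(path) do
    Regex.run(~r|^/var/log/(\w+)/(\w+)\.log$|, path)
  end

  # Reads the sigil delimiter from the AST metadata of a piece of source code.
  # Assumes the code consists of a single ~r sigil.
  def delimiter(code) do
    {:sigil_r, meta, _args} = Code.string_to_quoted!(code)
    Keyword.get(meta, :delimiter)
  end
end
```

Both variants return the same captures for "/var/log/nginx/access.log", and `DelimiterSketch.delimiter("~r|a/b|")` shows how an analyzer might tell them apart.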

Exercise doesn't teach about compiling regex with variable content

I wanted to add some step that would require an interpolated regex with escaping a variable, but I couldn't come up with anything thematic that wouldn't be better done with String.starts_with?.
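For reference, a small sketch of an interpolated regex with an escaped variable. The module and function names are hypothetical. It shows why `Regex.escape/1` matters when the interpolated value may contain regex metacharacters.

```elixir
defmodule InterpolationSketch do
  # Escaped: the word is matched literally, so "." only matches a dot.
  def mentions?(line, word) do
    Regex.match?(~r/\b#{Regex.escape(word)}\b/, line)
  end

  # Unescaped, for comparison: "." in the word matches any character.
  def mentions_unescaped?(line, word) do
    Regex.match?(~r/\b#{word}\b/, line)
  end
end
```

With the word "1.5", the unescaped version also matches a line containing "125", while the escaped version does not.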

How about adding a step to tag_with_user_name/1?
Maybe something like bolding the user names whenever they are mentioned without "User"?

LogParser.tag_with_user_name("[INFO] User Alice created a new project.")
# => "[USER] Alice [INFO] User Alice created a new project."
LogParser.tag_with_user_name("[INFO] User Alice created a new project. Alice has a reputation of 643.")
# => "[USER] Alice [INFO] User Alice created a new project. **Alice** has a reputation of 643."


antoine-duchenet · 2023-09-19T10:42:45Z

How about adding a step to tag_with_user_name/1? Maybe something like bolding the user names whenever they are mentioned without "User"?

LogParser.tag_with_user_name("[INFO] User Alice created a new project.")
# => "[USER] Alice [INFO] User Alice created a new project."
LogParser.tag_with_user_name("[INFO] User Alice created a new project. Alice has a reputation of 643.")
# => "[USER] Alice [INFO] User Alice created a new project. **Alice** has a reputation of 643."

What would be an ideal Regex solution to this? Maybe something like:

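def tag_with_user_name(line) do
  case Regex.run(~r/User\s+(\S+)/, line) do
    [_, name] ->
      # Escape the name so that regex metacharacters in it are matched literally.
      # The lookbehind is fixed-length: it only skips names preceded by
      # "User" and exactly one whitespace character.
      orphans_regex = ~r/(?<!User\s)(#{Regex.escape(name)})/

      # The replacement could use #{name} instead of the capture group
      bold_orphans = String.replace(line, orphans_regex, "**\\1**")

      "[USER] #{name} #{bold_orphans}"

    _ ->
      line
  end
end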

Such a solution uses a negative lookbehind, which (I think) has to be of fixed length: (?<!User\s) works, but (?<!User\s+) does not, and (?<!User)(\s+) or (?<!User)(\s*) won't work as expected. This collides with the earlier part of the exercise, which asks students to handle any number of spaces between the "User" prefix and the user name. I feel like it breaks the elegance of the exercise a bit and may confuse students.
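The limitation can be checked with a short sketch. The module and function names are hypothetical, and the user name "Alice" is hard-coded for illustration.

```elixir
defmodule LookbehindSketch do
  # True when the regex source compiles, false when the engine rejects it.
  def compiles?(source) do
    match?({:ok, _}, Regex.compile(source))
  end

  # True when "Alice" appears without "User" plus exactly one
  # whitespace character directly in front of it.
  def orphan?(line) do
    Regex.match?(~r/(?<!User\s)Alice/, line)
  end
end
```

Here `LookbehindSketch.compiles?("(?<!User\\s+)Alice")` is false, and `LookbehindSketch.orphan?("User  Alice")` (two spaces) is true, which is the collision with the "any number of spaces" requirement.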

Did I miss some elegant Regex way to achieve this?
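One possible approach, sketched under assumptions and not taken from the exercise, avoids the lookbehind entirely. It matches an optional "User" prefix together with the name and lets a replacement function decide whether to bold. The module and function names are hypothetical.

```elixir
defmodule OrphanMentions do
  # Bolds every occurrence of `name` that is not preceded by "User"
  # and any amount of whitespace.
  def bold(line, name) do
    regex = Regex.compile!("(User\\s+)?" <> Regex.escape(name))

    Regex.replace(regex, line, fn
      # The optional prefix did not match: this is an orphan mention.
      whole, "" -> "**" <> whole <> "**"
      # "User" plus whitespace came first: leave the match untouched.
      whole, _prefix -> whole
    end)
  end
end
```

This handles any number of spaces after "User", but it needs a function as the replacement, which may be more than the exercise wants to teach.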

Alternative idea

As an alternative, what about a function that highlights (for example by bolding) some important words during parsing? Since the important words could be changed by configuration, they are passed to the function as an argument (a list of strings?).

It would probably be solved by something like this:

@spec highlight_important(String.t(), [String.t()]) :: String.t()
def highlight_important(line, important_words)
def highlight_important(line, []), do: line

def highlight_important(line, important_words) do
  # Escape each word so that regex metacharacters in it are matched literally.
  alternatives = Enum.map_join(important_words, "|", &Regex.escape/1)
  important_regex = ~r/\b(#{alternatives})\b/
  String.replace(line, important_regex, "**\\1**")
end

angelikatyborska mentioned this issue Jul 30, 2022

Replace date-parser with log-parser (regular expressions) #1148

Merged
