Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace date-parser with log-parser (regular expressions) #1148

Merged
merged 17 commits into from
Jul 30, 2022

Conversation

angelikatyborska
Copy link
Contributor

Closes #941

@github-actions
Copy link
Contributor

Thank you for contributing to exercism/elixir 💜 🎉. This is an automated PR comment 🤖 for the maintainers of this repository that helps with the PR review process. You can safely ignore it and wait for a maintainer to review your changes.

Based on the files changed in this PR, it would be good to pay attention to the following details when reviewing the PR:

  • General steps

    • 🏆 Does this PR need to receive a label with a reputation modifier (x:size/{tiny,small,medium,large,massive})? (A medium reputation amount is awarded by default, see docs)
  • Any exercise changed

    • 👤 Does the author of the PR need to be added as an author or contributor in <exercise>/.meta/config.json (see docs)?
    • 🔬 Do the analyzer and the analyzer comments exist for this exercise? Do they need to be changed?
    • 📜 Does the design file (<exercise>/.meta/design.md) need to be updated to document new implementation decisions?
  • Concept exercise changed

    • 🌲 Do prerequisites and practices in config.json need to be updated?
    • 📖 Does the concept introduction provide all necessary information to solve this exercise?
  • Concept exercise tests changed

    • ⚪️ Are all tests un-skipped?
    • 🔢 Are all tests annotated with @tag task_id?
    • 🐈 Can all tests be understood by reading the test's block only (e.g. no module attributes, no setup functions)?
  • Concept changed

    • 👤 Does the author of the PR need to be added as an author or contributor in <concept>/.meta/config.json (see docs)?
    • 📖 Does the concept introduction provide all necessary information to solve this exercise?

Automated comment created by PR Commenter 🤖.

@angelikatyborska angelikatyborska added the x:size/large Large amount of work label Jun 12, 2022
@angelikatyborska angelikatyborska marked this pull request as ready for review June 12, 2022 16:12
## Out of scope

- Teaching the syntax of regular expressions
- Compiling Regular expressions with variable content
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to add some step that would require an interpolated regex with escaping a variable, but I couldn't come up with anything thematic that wouldn't be better done with String.starts_with? :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a step to tag_with_user_name/1?
Maybe something like bolding the user names whenever they are mentioned without "User"?

LogParser.tag_with_user_name("[INFO] User Alice created a new project.")
# => "[USER] Alice [INFO] User Alice created a new project"
LogParser.tag_with_user_name("[INFO] User Alice created a new project. Alice has a reputation of 643.")
# => "[USER] Alice [INFO] User Alice created a new project. **Alice** has a reputation of 643."

Copy link
Contributor

@jiegillet jiegillet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I think it's definitely an improvement over the date-parser.

I left a couple of small suggestions, but I'm already approving. Ping me on Slack if you need another round of review, I might miss the notification otherwise.

Here is my solution as reference

defmodule LogParser do
  def valid_line?(line) do
    regex = ~r/^\[(DEBUG|INFO|WARNING|ERROR)\]/
    line =~ regex
  end 

  def split_line(line) do
    regex = ~r/<[~\*=-]*>/
    String.split(line, regex)
  end 

  def remove_artifacts(line) do
    regex = ~r/end-of-line\d+/i
    String.replace(line, regex, "") 
  end 

  def tag_with_user_name(line) do
    regex = ~r/User\s+(\S+)/

    case Regex.run(regex, line) do
      [_user_name, name] -> "[USER] #{name} " <> line
      _ -> line
    end 
  end 
end


_Anchors_ are used to tie the regular expression to the beginning or end of the string to be matched:
If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always the whole string, and the following elements are matched groups.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add an example, something like

Regex.run(~r/test number (\d+)/, "This is test number 256")
# => ["test number 256", "256"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: 394686e (#1148)

It also made me realize that the initial description was wrong :) the first element isn't always the whole input string, only the part of it that was a match for the regex.


The `=~/2` operator is useful to perform a regex match on a string to return a `boolean` result.
~~~~exercism/note
This exercise assumes that you already know regular expression syntax, including character classes, quantifiers, groups, and captures.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we offer a link or two for those who don't?
To be honest, I just googled learning regex and found plenty of material, but there are also a bunch of links in the hints we could reuse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Know that some String functions accept regular expressions, e.g. match?, replace, split.
- Know about modifiers (e.g. unicode, case-insensitive)
- Know how to get a value from a captured group
- Know that sigils can be used with different delimiters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We mention this point in the intro, but we don't practice it in an exercise.
Maybe we could have one exercise that involves parsing several / and ask them to use another delimiter for readability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It would also be possible to do an analyzer check for the delimiter because that information is kept in the AST:

iex(2)> quote do
...(2)> ~r()
...(2)> end
{:sigil_r, [delimiter: "(", context: Elixir, import: Kernel],
 [{:<<>>, [], [""]}, []]}

## Out of scope

- Teaching the syntax of regular expressions
- Compiling Regular expressions with variable content
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a step to tag_with_user_name/1?
Maybe something like bolding the user names whenever they are mentioned without "User"?

LogParser.tag_with_user_name("[INFO] User Alice created a new project.")
# => "[USER] Alice [INFO] User Alice created a new project"
LogParser.tag_with_user_name("[INFO] User Alice created a new project. Alice has a reputation of 643.")
# => "[USER] Alice [INFO] User Alice created a new project. **Alice** has a reputation of 643."

@angelikatyborska
Copy link
Contributor Author

I postponed the two improvement ideas (#1162) because I'm low on time and creativity at the moment, but I really want to get this out before the ExHort starts on 08.08.

@angelikatyborska angelikatyborska merged commit 287753c into main Jul 30, 2022
@angelikatyborska angelikatyborska deleted the replace-regex-exercise branch July 30, 2022 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
x:size/large Large amount of work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

date-parser improvements
2 participants