🧐 Data exploring/inspecting #615
Comments
Sorry, what do you mean? Also, I'm not too knowledgeable in data science, and I didn't understand why this data exploration/inspection would need a commit. And if you mean adding another graph and/or another paragraph to a notebook, couldn't we use 📝 or ✨? |
I saw that my PR did not pass all tests, so I was wondering how to make them pass, and I noted that I didn't find anything about it in the documentation.
About why we need a gitmoji for data exploring, let's look at how data exploration is done. These kinds of explorations are made in parts, much like any other piece of code. So you might want to commit things like "Added distribution plots for feature X" and then maybe "Exploration of textual data points" and so on. As I see it, there is no existing gitmoji equivalent, since these are not new "features" but rather the act of exploring existing ones. This also extends to exploring the results of a model's predictions and similar things. To sum up, exploring/inspecting does not introduce new features; it is a code-oriented way to explore existing ones. |
Hey @eliorc 👋 Thanks for opening a PR. Like @vhoyer, I don't know a lot about data science workflows and can't really understand what you mean by 'data exploring'. Is it just about new code that does something? Does it create new files? What exactly happens when you do data exploring in terms of code or file manipulation? |
So about the tests: the thing that is breaking is probably the snapshots. I will assume you are not familiar with that kind of test, and I like the idea of adding this info to the repo, so I will do that later (mostly because this is not the first time this has raised doubts haha). But to resolve your test errors, you should open the repo and run
I see, I get it now, but why wouldn't you consider adding new blocks in notebooks a new feature? It's adding a table or a graph to the document that was not there before, right? Along the same lines, what would constitute a feature in those cases, in your opinion? |
We consider features to be new "functionalities" created in the project: for example, a new model architecture, a new preprocessing pipeline, or a new evaluation scheme. Basically, these are things that can be reused in the future once they are developed. Explorations/inspections are one-off actions, and definitely not reusable, as the insights drawn from them are only relevant to the data they were made upon. You might reuse the features you developed to create those insights, but using those features to generate the insights is a different thing. If you want an equivalent in classic engineering, I would say that data explorations/inspections are like unit tests. A test run is an action that is executed at a certain point in time and has results. Test runs are not features, just like explorations are not (and again, maybe you develop features to later be used in the tests, but that's a different thing). |
@johannchopin what happens is that you write some code in an interactive notebook, you execute it, and you most likely create plots and even write some markdown to convey your insights. You will usually do it in logical blocks. For example, let's say you have developed 100 different models: you will want to inspect and explore their predictions, looking for their weak points to get a clearer understanding of what you should do in your next research iteration. Also, I explained this two comments ago with examples. |
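As a rough illustration of the kind of one-off inspection described above, here is a minimal Python sketch of a single notebook cell. The toy labels, the helper name `error_rate_by_label`, and the idea that this cell would be one exploration commit are all assumptions made up for this example; none of it comes from the PR itself.

```python
from collections import Counter


def error_rate_by_label(y_true, y_pred):
    """Return {label: fraction of examples with that true label that were misclassified}."""
    totals = Counter(y_true)
    # Count errors per true label; labels with no errors default to 0 in a Counter.
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: errors[label] / totals[label] for label in totals}


# Hypothetical toy data standing in for real labels and model predictions.
y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird"]

rates = error_rate_by_label(y_true, y_pred)
# The label the model gets wrong most often is a candidate "weak point".
weakest = max(rates, key=rates.get)

print(rates)
print("weakest label:", weakest)
```

Nothing reusable is added to the project here: the cell only reads existing predictions and reports on them, which is the distinction from a feature that the discussion is drawing.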
Ok, I agree with its inclusion in gitmoji and will take a look at your PR later. What does the rest of the gang think about it? |
Ok, I also agree with the integration of this emoji. But what would the description be exactly? And is there a better emoji for that? IMO 🧐 isn't explicit enough. |
I have more candidates, but the monocle is my favorite. Here are all of them, sorted by my personal preference
|
About that, I think "Data exploration/inspection" covers the use cases of exploring (EDA and the like) and inspection (inspecting model results, comparisons, etc.) |
So let's vote on the emoji 😅
|
Hello @carloscuesta 😎!
:monocle_face:
About testing, I read the contribution guide and haven't seen anything about that. So how should I go about testing?