[Bug] econometrics\load macrodata Cannot format dates properly to set the index. #4713

Closed
Tracked by #4741 ...
deeleeramone opened this issue Apr 7, 2023 · 9 comments
@deeleeramone
Contributor

deeleeramone commented Apr 7, 2023

If you load one of the sample data sets in the Econometrics menu (macrodata in this example), the year column is displayed as an abbreviated number in thousands instead of a plain year.

Screenshot 2023-04-07 at 12 35 09 PM

Running type macrodata.year --format date then converts them all to 0.000.
Screenshot 2023-04-07 at 12 36 22 PM

Arguably, type --format date is not sufficient for converting these values. It should convert the column to a DatetimeIndex and allow the desired date and time format to be specified.

Without a way to correct this, you can't set the index, and it becomes essentially impossible to do any meaningful analysis or calculations. Furthermore, when any data set is loaded, the loader should automatically find the column titled date or year/month and format the DataFrame accordingly.
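As a rough illustration of the behavior requested above, here is a minimal pandas sketch. It is not the terminal's implementation: `set_date_index` is a hypothetical helper, the sample values are made up, and the column names it looks for (`year`, `date`, `period`) are assumptions taken from this thread.

```python
import pandas as pd


def set_date_index(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: index a frame by its first date-like column."""
    out = df.copy()
    for col in out.columns:
        name = str(col).lower()
        if name == "year":
            # Years loaded as floats (1959.0) go through int and str
            # so that the "%Y" format can parse them.
            years = out[col].astype(int).astype(str)
            parsed = pd.to_datetime(years, format="%Y")
        elif name in ("date", "period"):
            parsed = pd.to_datetime(out[col])
        else:
            continue
        out.index = pd.DatetimeIndex(parsed, name=col)
        return out.drop(columns=col)
    # No date-like column found: return the copy unchanged.
    return out


# Made-up sample data, shaped like a float-typed year column.
sample = pd.DataFrame({"year": [1959.0, 1960.0, 1961.0], "value": [1.0, 2.0, 3.0]})
indexed = set_date_index(sample)
print(indexed.index)
```

Note that a quarterly data set repeats each year several times, so an index built from the year alone would not be unique; combining it with a quarter or month column would be needed in that case.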

@deeleeramone deeleeramone added the bug Fix bug label Apr 7, 2023
@the-praxs
Contributor

the-praxs commented Apr 11, 2023

I am trying to fix it by adding the following code in helper_funcs.py:

for col in df_outgoing.columns:
    if col == "":
        df_outgoing = df_outgoing.rename(columns={col: "  "})
    # My code starts here
    if col.lower() == "year":
        df_outgoing[col] = df_outgoing[col].astype(int)
    if col.lower() in ("period", "date"):
        df_outgoing[col] = pd.to_datetime(
            df_outgoing[col], format="%Y-%m-%d"
        )

This works when I use console.print(df_outgoing) to output the resulting DataFrame:
image

However, there is no change in the output window:
image

Any clue about this behavior of the print_rich_table function?

@tehcoderer
Contributor

tehcoderer commented Apr 12, 2023

Change the type of the year column to string.

@the-praxs
Contributor

@tehcoderer changing year to str will show 1960.0 instead of our desired output 1960.

@the-praxs
Contributor

The issue also extends to a datetime-typed index:

image

@deeleeramone
Contributor Author

> @tehcoderer changing year to str will show 1960.0 instead of our desired output 1960.

Convert the date to a string. An integer does not have a decimal, so you must have converted to a float first.

@the-praxs
Contributor

the-praxs commented Apr 12, 2023

> > @tehcoderer changing year to str will show 1960.0 instead of our desired output 1960.
>
> Convert the date to a string. An integer does not have a decimal, so you must have converted to a float first.

The dataset was loaded with float64 dtype by default. I tried this approach but found a more direct and suitable way to fix it. Please feel free to review the commit (#4742). It works as intended on my end.

If you find any more issues, please ping me and I will fix them as soon as I can.

@tehcoderer
Contributor

Fixed in #4848 👍

@tehcoderer
Contributor

> @tehcoderer changing year to str will show 1960.0 instead of our desired output 1960.

yeah had to convert to int then str for it to work when I tested today lol.
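For anyone landing here later, the float-to-string behavior discussed in this thread can be reproduced with plain pandas. This is only a sketch with made-up values, assuming the column arrives as float64 as reported above:

```python
import pandas as pd

# A year column as it would look when loaded with float64 dtype.
year = pd.Series([1959.0, 1960.0], name="year")

# Casting straight to str keeps the trailing ".0".
as_str = year.astype(str)

# Casting to int first drops the decimal before the string conversion.
as_int_str = year.astype(int).astype(str)

print(as_str.tolist())
print(as_int_str.tolist())
```

One caveat: `astype(int)` raises if the column contains missing values, so a column with gaps would need them handled first.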

@deeleeramone
Contributor Author

Fixed in #4848
