Dataframe csv datetime #5834

Merged
merged 6 commits into from
Jun 4, 2021

@pgovind pgovind commented Jun 3, 2021

Clean up the remaining files from #5791

cc @derekdiamond

pgovind commented Jun 3, 2021

Can I get a review/approval for this PR please? It's just a port of #5791 (which I reviewed and was almost ready to go in) with the out artifacts cleaned up

@pgovind pgovind added the Microsoft.Data.Analysis All DataFrame related issues and PRs label Jun 3, 2021

public void CumulativeMax(PrimitiveColumnContainer<DateTime> column)
{
var ret = column.Buffers[0].ReadOnlySpan[0];
Member

What if it is empty?

Author

I had the same thought when I first saw the PR, so I looked at what the other columns are doing. None of them check for empty here. It's not high priority IMO, so I'm thinking we can fix that for all the columns in a separate PR?

Member

Can you log an issue for this? So we remember to do it.
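The empty-column concern in this thread can be sketched as follows. This is a minimal illustration, not the Microsoft.Data.Analysis implementation: `DateTimeCumulativeSketch` and its `CumulativeMax` overload over a plain `Span<DateTime>` are hypothetical names, and the real code operates on `PrimitiveColumnContainer<DateTime>` buffers instead.

```csharp
using System;

public static class DateTimeCumulativeSketch
{
    // Hypothetical helper (not the library's method): overwrites each element
    // with the running maximum of all elements up to and including it.
    public static void CumulativeMax(Span<DateTime> values)
    {
        // Guard for the case raised in review: reading values[0] on an empty
        // span throws IndexOutOfRangeException, so return early instead.
        if (values.IsEmpty)
        {
            return;
        }

        DateTime max = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            if (values[i] > max)
            {
                max = values[i];
            }
            values[i] = max;
        }
    }
}
```

With the guard in place, an empty input is left untouched rather than failing on the first index.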

@@ -511,7 +511,15 @@ public DataFrame Append(IEnumerable<object> row = null, bool inPlace = false)
}
if (value != null)
{
value = Convert.ChangeType(value, column.DataType);
try
Member

How expensive is this try-catch? It is inside a loop, so it may affect perf.
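For context on the perf question: in .NET, entering a `try` block is generally cheap when nothing is thrown, and the significant cost comes from exceptions that are actually raised and caught. One way to sidestep that cost for string-to-`DateTime` conversion is to use `DateTime.TryParse`, sketched below. `AppendConversionSketch` and `TryConvertToDateTime` are hypothetical names for illustration, and the choice of `CultureInfo.InvariantCulture` is an assumption, not what the PR does.

```csharp
using System;
using System.Globalization;

public static class AppendConversionSketch
{
    // Hypothetical helper: converts a cell value to DateTime without using
    // exceptions for control flow. Returns false when conversion is not possible.
    public static bool TryConvertToDateTime(object value, out DateTime result)
    {
        if (value is DateTime alreadyDateTime)
        {
            result = alreadyDateTime;
            return true;
        }

        if (value is string text)
        {
            // TryParse reports failure through its return value instead of throwing.
            // InvariantCulture is an assumption made for this sketch.
            return DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
        }

        result = default;
        return false;
    }
}
```

A loop over many rows could call this helper per cell and decide what to do with failures without paying for a thrown exception on each bad value.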

codecov bot commented Jun 3, 2021

Codecov Report

Merging #5834 (bc636d9) into main (43c49f6) will increase coverage by 0.00%.
The diff coverage is 58.58%.

@@           Coverage Diff            @@
##             main    #5834    +/-   ##
========================================
  Coverage   68.35%   68.36%            
========================================
  Files        1131     1134     +3     
  Lines      241210   241856   +646     
  Branches    25039    25110    +71     
========================================
+ Hits       164887   165333   +446     
- Misses      69819    70006   +187     
- Partials     6504     6517    +13     
| Flag | Coverage Δ |
|---|---|
| Debug | 68.36% <58.58%> (+<0.01%) ⬆️ |
| production | 62.94% <26.42%> (-0.03%) ⬇️ |
| test | 89.28% <94.54%> (+0.04%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| src/Microsoft.Data.Analysis/Strings.Designer.cs | 44.20% <ø> (+1.23%) ⬆️ |
| src/Microsoft.Data.Analysis/DateTimeComputation.cs | 22.60% <22.60%> (ø) |
| ...a.Analysis/PrimitiveDataFrameColumnComputations.cs | 45.69% <66.66%> (-0.01%) ⬇️ |
| src/Microsoft.Data.Analysis/DataFrame.IO.cs | 78.96% <83.33%> (+0.84%) ⬆️ |
| ...st/Microsoft.Data.Analysis.Tests/DataFrameTests.cs | 99.43% <92.71%> (-0.52%) ⬇️ |
| ...Microsoft.Data.Analysis.Tests/DataFrame.IOTests.cs | 98.52% <98.55%> (+0.14%) ⬆️ |
| src/Microsoft.Data.Analysis/DataFrame.cs | 86.68% <100.00%> (+0.28%) ⬆️ |
| ...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs | 79.48% <0.00%> (-20.52%) ⬇️ |
| src/Microsoft.ML.Core/Data/ProgressReporter.cs | 70.95% <0.00%> (-6.99%) ⬇️ |
| src/Microsoft.ML.FastTree/FastTreeRanking.cs | 50.79% <0.00%> (-4.28%) ⬇️ |

... and 36 more

{
value = Convert.ChangeType(value, column.DataType);
}
catch (Exception ex)
Author

@pgovind pgovind Jun 4, 2021

This should be catching FormatException. Or rather, maybe Append's caller should catch this exception

Author

Alright, I'm going to rip this try-catch out of this PR. We can add it later if we want to. I'm more interested in getting the DateTime support in
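If the try-catch is reintroduced later, the narrower form suggested in this thread (catching `FormatException` rather than `Exception`) might look like the sketch below. `ChangeTypeSketch` and `ConvertCell` are hypothetical names; only `Convert.ChangeType` comes from the diff under review, and the wrapped error message is an illustration.

```csharp
using System;

public static class ChangeTypeSketch
{
    // Hypothetical helper: wraps Convert.ChangeType and handles only the
    // failure the thread is concerned with (a value in the wrong format).
    public static object ConvertCell(object value, Type targetType)
    {
        try
        {
            return Convert.ChangeType(value, targetType);
        }
        catch (FormatException ex)
        {
            // Rethrow with context and keep the original as InnerException.
            // Other exception types (for example InvalidCastException)
            // propagate unchanged to the caller.
            throw new FormatException($"Could not convert '{value}' to {targetType.Name}.", ex);
        }
    }
}
```

The alternative raised in the same comment, leaving the exception for `Append`'s caller to handle, would simply omit the `try` block and let `Convert.ChangeType` throw.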


@pgovind pgovind merged commit 8801e40 into dotnet:main Jun 4, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Mar 17, 2022