Merge Main into Feature/4.0 #6747

Merged
JakeRadMSFT merged 43 commits into dotnet:feature/4.0 on Oct 5, 2023

Conversation

JakeRadMSFT
Contributor

@JakeRadMSFT JakeRadMSFT commented Jun 28, 2023

Merging the latest main into the feature/4.0 branch.

* Update build templates

* Update build templates to include all releases/* and feature/*

* Update releases to release

* Update triggers for PR Validation Build

* Add triggers for Code Coverage
@ghost ghost assigned JakeRadMSFT Jun 28, 2023
@JakeRadMSFT JakeRadMSFT changed the base branch from main to feature/4.0 June 28, 2023 19:09

codecov bot commented Jun 28, 2023

Codecov Report

Merging #6747 (e72e985) into feature/4.0 (8858ab6) will increase coverage by 0.75%.
Report is 1 commit behind head on feature/4.0.
The diff coverage is 78.60%.

@@               Coverage Diff               @@
##           feature/4.0    #6747      +/-   ##
===============================================
+ Coverage        68.89%   69.64%   +0.75%     
===============================================
  Files             1216     1237      +21     
  Lines           250915   247617    -3298     
  Branches         26259    25436     -823     
===============================================
- Hits            172857   172444     -413     
+ Misses           71238    68561    -2677     
+ Partials          6820     6612     -208     
Flag Coverage Δ
Debug 69.64% <78.60%> (+0.75%) ⬆️
production 64.19% <78.60%> (+0.79%) ⬆️
test 88.89% <ø> (-0.02%) ⬇️
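
As a quick sanity check, the coverage percentages in the report above can be reproduced from its Hits and Lines rows. This is a minimal sketch, assuming line coverage is simply hits divided by total lines; the helper name `coverage_percent` is hypothetical and not part of Codecov.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage (assumed to be hits / lines)."""
    return 100.0 * hits / lines


# Figures copied from the Hits and Lines rows of the Codecov diff above.
before = coverage_percent(172857, 250915)  # feature/4.0 base (8858ab6)
after = coverage_percent(172444, 247617)   # after merging #6747 (e72e985)

# Print both percentages and the signed change, each to two decimals.
print(f"{before:.2f}% -> {after:.2f}% ({after - before:+.2f}%)")
```

Rounded to two decimals, the values agree with the reported 68.89%, 69.64%, and +0.75%. Coverage rises even though Hits fall, because Lines fall by more.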

Flags with carried forward coverage won't be shown.

Files Coverage Δ
...rosoft.Data.Analysis/ArrowStringDataFrameColumn.cs 63.54% <100.00%> (ø)
src/Microsoft.Data.Analysis/DataFrame.Arrow.cs 98.46% <100.00%> (+0.08%) ⬆️
src/Microsoft.Data.Analysis/DataFrame.Join.cs 94.87% <100.00%> (ø)
src/Microsoft.Data.Analysis/DataFrame.cs 86.73% <100.00%> (+0.27%) ⬆️
src/Microsoft.Data.Analysis/DataFrameRow.cs 69.44% <100.00%> (+2.77%) ⬆️
src/Microsoft.Data.Analysis/DateTimeComputation.cs 39.16% <100.00%> (+2.64%) ⬆️
...lysis/PrimitiveColumnContainer.BinaryOperations.cs 100.00% <100.00%> (+15.78%) ⬆️
...t.Data.Analysis/PrimitiveColumnContainerHelpers.cs 100.00% <100.00%> (ø)
...FrameColumn.BinaryOperationAPIs.ExplodedColumns.cs 7.69% <ø> (+0.09%) ⬆️
...eColumn.BinaryOperationImplementations.Exploded.cs 76.47% <ø> (+28.81%) ⬆️
... and 114 more

... and 17 files with indirect coverage changes

@JakeRadMSFT JakeRadMSFT reopened this Jun 28, 2023
asmirnov82 and others added 23 commits July 6, 2023 12:15
…frame (dotnet#6675)

* Add missing implementation for datetime relevant arrow type

* Return required usage
* Fix the behavior of the column SetName method

* Fix stack overflow exception

* Fix merge issues

---------

Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
…net#6710)

* Fix error with allocating more than MaxCapacity of Byte Memory Buffer

* Remove Unit test as it consumes too much memory

* Fix issue with increasing buffer capacity over the limit when doubling its size
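
The last fix above concerns growing a buffer by doubling without exceeding a maximum capacity. The following is a minimal Python sketch of that general technique, not the ML.NET code; the function name `next_capacity` and the capacities in the usage lines are hypothetical.

```python
def next_capacity(current: int, required: int, max_capacity: int) -> int:
    """Pick a new capacity by doubling, clamped to max_capacity.

    Hypothetical helper for illustration; not the ML.NET implementation.
    """
    if required > max_capacity:
        raise ValueError("required size exceeds the maximum buffer capacity")
    if required <= current:
        return current  # already large enough, no growth needed
    # Double to amortize growth cost, but never go below what is required...
    doubled = max(current * 2, required)
    # ...and never go above the hard limit.
    return min(doubled, max_capacity)


# Usage with made-up numbers: plain doubling would give 16, but the limit is 10.
print(next_capacity(current=8, required=9, max_capacity=10))
```

Clamping after doubling is the key step: without it, a buffer that is already more than half the limit would request a capacity beyond the limit on its next growth.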
…olumn size is more than 2 Gb) (dotnet#6724)

* Fix dataframe arithmetics

* Fix
…thods (dotnet#6723)

* Provide ability to filter by null value

* Add comments

* Fix code review findings
* Step 1

* Step 2

* Fixed code review findings
* NER

* QA almost done, runtime error

* QA finished

* fixes from PR comments

* fixed build

* build fixes

* perf changes

* made disposable

* fixed not disposing model

* added some disposables to TensorFlow for memory

* build testing

* fixing build

* added missing dispose

* build fixes

* build fixes

* testing macos fix
* Change methods signature generation

* Change DataFrameColumn Arithmetics

* Change DataFrameColumn Operations

* Fix unit tests

* Fix spaces

* Fix code review findings
…arameter names (dotnet#6766)

Co-authored-by: John Doe <john@doe>
* Add target Type in convert  type

* Add custom type "DataKind"

* clean

* Add DataKind name space

* clean test
* Fix issue with addIndexColumn in DataFrame.LoadCsv

* Fix tests
….Core`) (dotnet#6788)

Co-authored-by: Lehonti Ramos <john@doe>
Lehonti and others added 4 commits August 30, 2023 22:21
* Fix inconsistent null handling in DataFrame Arithmetics

* Fix Null Count and division by zero issues

* Minor changes to restart build and rerun flaky tests
…rom csv (dotnet#6782)

* Use CultureInfo for parsing values in csv file

* Fix merge issues
@michaelgsharp
Member

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

asmirnov82 and others added 14 commits September 1, 2023 22:06
* Append dataframe rows based on column names

* Update DataFrame.cs

---------

Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
* Fix wrong type conversion on PrimitiveDataFrameColumn

* Added tests for dotnet#6829

* Fix test

* Add file generated from tt template and fix unit tests

---------

Co-authored-by: Aleksei Smirnov <tlalok@inbox.ru>
* update interactive kernel version

* update

* Update Microsoft.Data.Analysis.Interactive.Tests.csproj
…t#6827)

* Add performance tests

* Add extra tests

* Fix

* Fix typo

* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks

* Fix

* Change csproj file

* Update BenchmarkDotNetVersion to 0.13.5

* Fix

* Change to 0.13.1 because that is the latest version in our NuGet feeds.

---------

Co-authored-by: Jake Radzikowski <JakeRad@Microsoft.com>
…otnet#6814)

* Optimize PrimitiveColumnContainer.Clone method

* Avoid unnecessary type conversion during binary operations

* Remove using

* Fix DataFrameBuffer constructor

* Remove incorrectly added using

* Make DataFrameBuffer Length field protected

* Fix typo

* Use RawSpan
* First step of tt refactoring

* Step 2

* Step 3
* Initial structure and started fleshing out some sections

* Some corrections and paragraph on DL usages

* Starting fleshing out DL on ML.NET section

* Addresses dotnet#6533
* Update dependencies

* Add reference to NuGet.Packaging.Core
…erable mapIndices argument (dotnet#6822)

* Split Test for AppendMany into 4 different tests

* Block init of null validity buffer instead of setting individual bits

* Add unit tests for PrimitiveDataFrameColumn.Clone

* Fixes dotnet#6821

* Fix

* Fix bug with AppendMany values to a non-empty column

* Restart unit tests

* Add more unit tests

* Fix failing unit test

* Fix code review findings
* Fix DataFrame incorrectly sets column value for index higher than Buffer.MaxCapacity

* Revert renaming
…ns on nullable values (dotnet#6846)

* Optimize PrimitiveColumnContainer.Clone method

* Avoid unnecessary type conversion during binary operations

* Remove using

* Fix DataFrameBuffer constructor

* Remove incorrectly added using

* Make DataFrameBuffer Length field protected

* Add performance tests

* Split Test for AppendMany into 4 different tests

* Block init of null validity buffer instead of setting individual bits

* Add unit tests for PrimitiveDataFrameColumn.Clone

* Fixes dotnet#6821

* Fix

* Add extra tests

* Fix

* Fix typo

* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks

* Fix

* Avoid using constructor, that copies memory

* First step of tt refactoring

* Step 2

* Step 3

* Move iteration over buffers outside of the PrimitiveDataFrameColumnArithmetic

* Change PrimitiveDataFrameColumnArithmetic

* Fix typo

* Use RawSpan

* Fix bug with AppendMany values to a non-empty column

* Restart unit tests

* Add more unit tests

* Add GetBitCount method

* Fix failing unit test

* Implementation

* Change unit tests

* Update unit tests

* Refactoring BinaryOperation

* Intermediate changes

* Intermediate results

* Implement Binary Scalar Reverse Operations

* Add implementation for BinaryIntOperations

* Implement Comparison Operations

* Implement actual calculations for Comparison operations

* Uncomment performance tests

* Remove unintentional code changes

* Add reference to Apache Arrow project license in THIRD-PARTY-NOTICES

* Fix license issues
dotnet#6851)

* Fix incorrect behavior of DataFrame with VBufferColumn when the number of elements is greater than Int.MaxValue

* Fix calculation of max capacity and amount of required buffers

* Fix unit test

* Run test allocating more than 2 GB of memory on 64-bit environments only

* Fix StringDataFrameColumn same way as VBufferDataFrameColumn

* Fix wrong amount of buffers created in constructor of StringDataFrameColumn

* Fix code review findings
Member

@ericstj ericstj left a comment

Not reviewing this commit-by-commit since it is a merge. I trust that the process here was "normal"; please call out if you actually wanted eyes on a specific part of this. Otherwise, merge away.

@JakeRadMSFT JakeRadMSFT merged commit b48e9b7 into dotnet:feature/4.0 Oct 5, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Nov 5, 2023
9 participants