Commit b48e9b7
Merge Main into Feature/4.0 (#6747)
* Update build templates to handle feature branches (#6744)
* Update build templates
* Update build templates to include all releases/* and feature/*
* Update releases to release
* Update triggers for PR Validation Build
* Add triggers for Code Coverage
* Update version to 4.0 for feature branch (#6743)
* Add missing implementation for datetime relevant arrow type into dataframe (#6675)
* Add missing implementation for datetime relevant arrow type
* Return required usage
* Fix the behavior of column SetName method (#6676)
* Fix the behavior of column SetName method
* Fix stack overflow exception
* Fix merge issues
---------
Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
* Fix DataFrame to allow to store columns with size more than 2 Gb (#6710)
* Fix error with allocating more than MaxCapacity of Byte Memory Buffer
* Remove Unit test as it consumes too much memory
* Fix issue with increasing buffer capacity over the limit when doubling its size
* avoid empty dataset (#6756)
* Fix dataframe arithmetics for columns having several value buffers (column size is more than 2 Gb) (#6724)
* Fix dataframe arithmetics
* Fix
* Run tests that require more than 2 Gb of Memory only on 64-bit env (#6758)
* Reduce coupling of Data.Analysis.Tests project (#6759)
* Provide ability to filter dataframe column by null via ElementWise Methods (#6723)
* Provide ability to filter by null value
* Add comments
* Fix code review findings
* Fix incorrect DataFrame min max computation with NULL (#6734)
* Step 1
* Step 2
* Fixed code review findings
* Clean DataFrame meaningless code (#6761)
* Add NameEntityRecognition and Q&A deep learning tasks. (#6760)
* NER
* QA almost done, runtime error
* QA finished
* fixes from PR comments
* fixed build
* build fixes
* perf changes
* made disposable
* fixed not disposing model
* added some disposables to TensorFlow for memory
* build testing
* fixing build
* added missing dispose
* build fixes
* build fixes
* testing macos fix
* fix issue (#6768)
* fixed mac build and minor torch sharp changes (#6776)
* Improve DataFrame Arithmetics implementation (#6763)
* Change methods signature generation
* Change DataFrameColumn Arithmetics
* Change DataFrameColumn Operations
* Fix unit tests
* Fix spaces
* Fix code review findings
* Add QA sweepable estimator in AutoML (#6781)
* Add QA sweepable
* clean
* Modernized some argument checks that still used string literals for parameter names (#6766)
Co-authored-by: John Doe <john@doe>
* removed deprecated yosemite brew (#6805)
* Add TargetType to Type_convert (#6785)
* Add target Type in convert type
* Add custom type "DataKind"
* clean
* Add DataKind name space
* clean test
* File-scoped namespaces in files under `Environment` (`Microsoft.ML.Core`) (#6791)
Co-authored-by: Lehonti Ramos <john@doe>
* File-scoped namespaces in files under `EntryPoints` (`Microsoft.ML.Core`) (#6790)
Co-authored-by: Lehonti Ramos <john@doe>
* Fix issue with addIndexColumn in DataFrame.LoadCsv (#6769)
* Fix issue with addIndexColumn in DataFrame.LoadCsv
* Fix tests
* Fix DataFrame.LoadCsv can not load CSV with duplicate column names (#6772)
* File-scoped namespaces in files under `ComponentModel` (`Microsoft.ML.Core`) (#6788)
Co-authored-by: Lehonti Ramos <john@doe>
* File-scoped namespaces in files under `Data` (`Microsoft.ML.Core`) (#6789)
Co-authored-by: Lehonti Ramos <john@doe>
* Fix inconsistent null handling in DataFrame Arithmetics (#6770)
* Fix inconsistent null handling in DataFrame Arithmetics
* Fix Null Count and division by zero issues
* Minor changes to restart build and rerun flaky tests
* File-scoped namespaces in files under `Prediction` (`Microsoft.ML.Core`) (#6792)
Co-authored-by: Lehonti Ramos <john@doe>
* Allow to define CultureInfo for parsing values on reading DataFrame from csv (#6782)
* Use CultureInfo for parsing values in csv file
* Fix merge issues
* Append dataframe rows based on column names (#6808)
* Append dataframe rows based on column names
* Update DataFrame.cs
---------
Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
* removed codecov token (#6811)
* Fix wrong type conversion on PrimitiveDataFrameColumn (#6834)
* Fix wrong type conversion on PrimitiveDataFrameColumn
* Added tests for #6829
* Fix test
* Add file generated from tt template and fix unit tests
---------
Co-authored-by: Aleksei Smirnov <tlalok@inbox.ru>
* update interactive kernel version (#6836)
* update interactive kernel version
* update
* Update Microsoft.Data.Analysis.Interactive.Tests.csproj
* Add performance benchmarks for dataframe arithmetic operations (#6827)
* Add performance tests
* Add extra tests
* Fix
* Fix typo
* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks
* Fix
* Change csproj file
* Update BenchmarkDotNetVersion to 0.13.5
* Fix
* Change to 0.13.1 because that is the latest version in our NuGet feeds.
---------
Co-authored-by: Jake Radzikowski <JakeRad@Microsoft.com>
* Improve performance of column cloning inside DataFrame arithmetics (#6814)
* Optimize PrimitiveColumnContainer.Clone method
* Avoid unnecessary type conversion during binary operations
* Remove using
* Fix DataFrameBuffer constructor
* remove incorrectly added using
* Make DataFrameBuffer Length field protected
* Fix typo
* Use RawSpan
* Simplify tt files for PrimitiveDataFrameColumnAritmetics (#6830)
* First step of tt refactoring
* Step 2
* Step 3
* Addresses #6533 (#6838)
* Initial structure and started fleshing out some sections
* Some corrections and paragraph on DL usages
* Starting fleshing out DL on ML.NET section
* Addresses #6533
* Update dependencies (#6837)
* Update dependencies
* Add reference to NuGet.Packaging.Core
* PrimitiveDataFrameColumn.Clone method crashes when used with IEnumerable mapIndices argument (#6822)
* Split Test for AppendMany into 4 different tests
* Block init of null validity buffer instead of setting individual bits
* Add unit tests for PrimitiveDataFrameColumn.Clone
* Fixes #6821
* Fix
* Fix bug with AppendMany values to a non-empty column
* Restart unit tests
* Add more unit tests
* Fix failing unit test
* Fix code review findings
* 6847 incorrectly sets column value (#6849)
* Fix DataFrame incorrectly sets column value for index higher than Buffer.MaxCapacity
* Revert renaming
* Increase performance of arithmetic operations by enhancing calculations on nullable values (#6846)
* Optimize PrimitiveColumnContainer.Clone method
* Avoid unnecessary type conversion during binary operations
* Remove using
* Fix DataFrameBuffer constructor
* remove incorrectly added using
* Make DataFrameBuffer Length field protected
* Add performance tests
* Split Test for AppendMany into 4 different tests
* Block init of null validity buffer instead of setting individual bits
* Add unit tests for PrimitiveDataFrameColumn.Clone
* Fixes #6821
* Fix
* Add extra tests
* Fix
* Fix typo
* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks
* Fix
* Avoid using constructor, that copies memory
* First step of tt refactoring
* Step 2
* Step 3
* Move iteration over buffers outside of the PrimitiveDataFrameColumnArithmetic
* Change PrimitiveDataFrameColumnArithmetic
* Fix typo
* Use RawSpan
* Fix bug with AppendMany values to a non-empty column
* Restart unit tests
* Add more unit tests
* Add GetBitCount method
* Fix failing unit test
* Implementation
* Change unit tests
* Update unit tests
* Refactoring BinaryOperation
* Intermediate changes
* Intermediate results
* Implement Binary Scalar Reverse Operations
* Add implementation for BinaryIntOperations
* Implement Comparison Operations
* Implement actual calculations for Comparison operations
* Uncomment performance tests
* Remove unintentional code changes
* Add reference to Apache Arrow project license in THIRD-PARTY-NOTICES
* Fix license issues
* Fixes incorrect work of DataFrame with VBufferColumn when number of e… (#6851)
* Fixes incorrect work of DataFrame with VBufferColumn when number of elements is greater than Int.MaxValue
* Fix calculation of max capacity and amount of required buffers
* Fix unit test
* Run test allocating more than 2 Gb of memory on 64bit env only
* Fix StringDataFrameColumn same way as VBufferDataFrameColumn
* Fix wrong amount of buffers created in constructor of StringDataFrameColumn
* Fix code review findings
---------
Co-authored-by: Aleksei Smirnov <tlalok@inbox.ru>
Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Co-authored-by: Xiaoyun Zhang <xiaoyuz@microsoft.com>
Co-authored-by: zewditu Hailemariam <36615490+zewditu@users.noreply.github.com>
Co-authored-by: Lehonti Ramos <17771375+Lehonti@users.noreply.github.com>
Co-authored-by: John Doe <john@doe>
Co-authored-by: Raffaello Fraboni <10281615+novelhawk@users.noreply.github.com>
Co-authored-by: R. G. Esteves <rodolfo.g.esteves@intel.com>
Co-authored-by: Eric StJohn <ericstj@microsoft.com>
1 parent e7099bf commit b48e9b7
File tree
191 files changed, +109874 -25339 lines changed
- build
- ci
- eng
- src
- Microsoft.Data.Analysis.Interactive
- Microsoft.Data.Analysis
- Microsoft.ML.AutoML
- AutoMLExperiment
- Runner
- CodeGen
- SweepableEstimator/Estimators
- Microsoft.ML.Core
- ComponentModel
- Data
- EntryPoints
- Environment
- Prediction
- Microsoft.ML.Mkl.Components
- Microsoft.ML.OneDal
- Microsoft.ML.StandardTrainers/Standard/LogisticRegression
- Microsoft.ML.Tokenizers
- Model
- Microsoft.ML.TorchSharp
- AutoFormerV2
- NasBert
- Models
- Roberta
- Models
- Modules
- Utils
- Native/OneDalNative
- test
- Microsoft.Data.Analysis.Interactive.Tests
- Microsoft.Data.Analysis.PerformanceTests
- Microsoft.Data.Analysis.Tests
- Microsoft.ML.AutoML.Tests/ApprovalTests
- Microsoft.ML.Benchmarks.Tests
- Microsoft.ML.CodeAnalyzer.Tests
- Microsoft.ML.CpuMath.PerformanceTests
- Microsoft.ML.Fairlearn.Tests
- Microsoft.ML.IntegrationTests/Datasets
- Microsoft.ML.PerformanceTests/Harness
- Microsoft.ML.TestFrameworkCommon/Attributes
- Microsoft.ML.TestFramework/DataPipe
- Microsoft.ML.Tests
- ScenariosWithDirectInstantiation
- Microsoft.ML.Tokenizers.Tests
- data
- tools-local/Microsoft.ML.AutoML.SourceGenerator
- Template