Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-37290: [MATLAB] Add arrow.array.Time32Array class #37315

Merged
merged 17 commits into from
Aug 23, 2023
Merged

Conversation

sgilmore10
Copy link
Member

@sgilmore10 sgilmore10 commented Aug 22, 2023

Rationale for this change

Now that the arrow.type.Time32Type class has been added to the MATLAB Interface (#37250), we can add the arrow.array.Time32Array class.

What changes are included in this PR?

  1. Added a new class arrow.array.Time32Array. It has the following read-only properties: Type and Length. It has the following methods: fromMATLAB(), toMATLAB(), and duration(). For reference, duration arrays in MATLAB represent lengths of time in fixed-length units. Here's a link to the documentation.
  2. Added a new class called arrow.type.traits.Time32Traits.

Example Usage

>> times = seconds([100 120 NaN 200])

times = 

  1×4 duration array

   100 sec   120 sec   NaN sec   200 sec

>> time32Array = arrow.array.Time32Array.fromMATLAB(times)

time32Array = 

[
  00:01:40,
  00:02:00,
  null,
  00:03:20
]

>> roundtrip = toMATLAB(time32Array)

roundtrip = 

  4×1 duration array

   100 sec
   120 sec
   NaN sec
   200 sec

You can also specify the TimeUnit to use via a name-value pair. TimeUnit can either be "Second" (default) or "Milisecond".

>> times = seconds([100 120 NaN 200]);
>> time32Array = arrow.array.Time32Array.fromMATLAB(times, TimeUnit="Millisecond")

time32Array = 

[
  00:01:40.000,
  00:02:00.000,
  null,
  00:03:20.000
]

Are these changes tested?

Yes. Added two new test classes: tTime32Array.m and tTime32Traits.m.

Are there any user-facing changes?

Yes, users can now create arrow.array.Time32Arrays from MATLAB durations via the static fromMATLAB method on arrow.array.Time32Array.

Future Directions

  1. Add arrow.array.Time64Array
  2. Add arrow.type.Date32Type and arrow.array.Date32Array
  3. Add arrow.type.Date64Type and arrow.array.Date64Array

@sgilmore10 sgilmore10 marked this pull request as ready for review August 22, 2023 18:08
Copy link
Member

@kevingurney kevingurney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall - thank you!

matlab/src/cpp/arrow/matlab/array/proxy/time32_array.cc Outdated Show resolved Hide resolved
matlab/src/cpp/arrow/matlab/array/proxy/time32_array.cc Outdated Show resolved Hide resolved
matlab/src/cpp/arrow/matlab/array/proxy/time32_array.cc Outdated Show resolved Hide resolved
matlab/src/cpp/arrow/matlab/array/proxy/time32_array.cc Outdated Show resolved Hide resolved
matlab/src/cpp/arrow/matlab/array/proxy/time32_array.cc Outdated Show resolved Hide resolved
matlab/src/matlab/+arrow/+array/Time32Array.m Outdated Show resolved Hide resolved
matlab/test/arrow/array/tTime32Array.m Outdated Show resolved Hide resolved
matlab/test/arrow/array/tTime32Array.m Outdated Show resolved Hide resolved
matlab/test/arrow/array/tTime32Array.m Outdated Show resolved Hide resolved
matlab/test/arrow/array/tTime32Array.m Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Aug 22, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Aug 22, 2023
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@kou kou merged commit 0781f13 into apache:main Aug 23, 2023
8 of 9 checks passed
@kou kou removed the awaiting change review Awaiting change review label Aug 23, 2023
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 0781f13.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

kevingurney pushed a commit that referenced this pull request Aug 23, 2023
…dBatch` errors if `index` refers to an `arrow.array.Time32Array` column (#37347)

### Rationale for this change

The `column(index)` method of `arrow.tabular.RecordBatch` errors  if `index` refers to an `arrow.array.Time32Array` column.

```matlab
>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);
```

`column(index)` is throwing this error because the case for `Time32` was not added to `arrow::matlab::array::proxy::wrap(arrowArray)` in #37315. `column(index)` calls this function to create proxy objects around existing arrow arrays. Adding the case for `Time32` will resolve this bug.

### What changes are included in this PR?

1. Updated `arrow::Result<proxy::Array> wrap(const std::shared_ptr<arrow::Array>& array)` to handle wrapping `arrow::Time32Array` instances within `proxy::Time32Array`s.
2. Added a new test utility called `arrow.internal.test.tabular.createAllSupportedArrayTypes`, which returns two cell arrays: `arrowArrays` and `matlabData`. The `arrowArrays` cell array contains one instance of each concrete subclass of `arrow.array.Array`. The `matlabData` cell array contains the MATLAB arrays used to generate each array in `arrowArrays`. This utility can be used to create an `arrow.array.RecordBatch` that contains all the arrow array types that are supported by the MATLAB interface. 

### Are these changes tested?

Yes. Updated the `fromArrays` test cases in `tRecordBatch` to verify the `column(index)` method of `arrow.type.RecordBatch` supports extracting columns of any arrow type (supported by the MATLAB Interface).

### Are there any user-facing changes?

Yes. Fixed a bug in `arrow.tabular.RecordBatch`.

### Future Directions

1. #37345

* Closes: #37340

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
@kevingurney kevingurney deleted the GH-37290 branch August 30, 2023 00:41
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…37315)

### Rationale for this change

Now that the `arrow.type.Time32Type` class has been added to the MATLAB Interface (apache#37250), we can add the `arrow.array.Time32Array` class. 

### What changes are included in this PR?

1. Added a new class `arrow.array.Time32Array`. It has the following read-only properties: `Type` and `Length`. It has the following methods: `fromMATLAB()`, `toMATLAB()`, and `duration()`. For reference, `duration` arrays in MATLAB represent lengths of time in fixed-length units. Here's a [link](https://www.mathworks.com/help/matlab/ref/duration.html) to the documentation. 
2. Added a new class called `arrow.type.traits.Time32Traits`.

**Example Usage**

```matlab
>> times = seconds([100 120 NaN 200])

times = 

  1×4 duration array

   100 sec   120 sec   NaN sec   200 sec

>> time32Array = arrow.array.Time32Array.fromMATLAB(times)

time32Array = 

[
  00:01:40,
  00:02:00,
  null,
  00:03:20
]

>> roundtrip = toMATLAB(time32Array)

roundtrip = 

  4×1 duration array

   100 sec
   120 sec
   NaN sec
   200 sec

```

You can also specify the `TimeUnit` to use via a name-value pair. `TimeUnit` can either be `"Second"`  (default) or `"Milisecond"`.

```matlab
>> times = seconds([100 120 NaN 200]);
>> time32Array = arrow.array.Time32Array.fromMATLAB(times, TimeUnit="Millisecond")

time32Array = 

[
  00:01:40.000,
  00:02:00.000,
  null,
  00:03:20.000
]
```

### Are these changes tested?

Yes. Added two new test classes: `tTime32Array.m` and `tTime32Traits.m`.

### Are there any user-facing changes?

Yes, users can now create `arrow.array.Time32Array`s from MATLAB `duration`s via the static `fromMATLAB` method on `arrow.array.Time32Array`.

### Future Directions

1. Add `arrow.array.Time64Array`
2. Add `arrow.type.Date32Type` and `arrow.array.Date32Array`
4. Add `arrow.type.Date64Type` and `arrow.array.Date64Array`

* Closes: apache#37290

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
….RecordBatch` errors if `index` refers to an `arrow.array.Time32Array` column (apache#37347)

### Rationale for this change

The `column(index)` method of `arrow.tabular.RecordBatch` errors  if `index` refers to an `arrow.array.Time32Array` column.

```matlab
>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);
```

`column(index)` is throwing this error because the case for `Time32` was not added to `arrow::matlab::array::proxy::wrap(arrowArray)` in apache#37315. `column(index)` calls this function to create proxy objects around existing arrow arrays. Adding the case for `Time32` will resolve this bug.

### What changes are included in this PR?

1. Updated `arrow::Result<proxy::Array> wrap(const std::shared_ptr<arrow::Array>& array)` to handle wrapping `arrow::Time32Array` instances within `proxy::Time32Array`s.
2. Added a new test utility called `arrow.internal.test.tabular.createAllSupportedArrayTypes`, which returns two cell arrays: `arrowArrays` and `matlabData`. The `arrowArrays` cell array contains one instance of each concrete subclass of `arrow.array.Array`. The `matlabData` cell array contains the MATLAB arrays used to generate each array in `arrowArrays`. This utility can be used to create an `arrow.array.RecordBatch` that contains all the arrow array types that are supported by the MATLAB interface. 

### Are these changes tested?

Yes. Updated the `fromArrays` test cases in `tRecordBatch` to verify the `column(index)` method of `arrow.type.RecordBatch` supports extracting columns of any arrow type (supported by the MATLAB Interface).

### Are there any user-facing changes?

Yes. Fixed a bug in `arrow.tabular.RecordBatch`.

### Future Directions

1. apache#37345

* Closes: apache#37340

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[MATLAB] Add arrow.array.Time32Array class
3 participants