TheNeuralBit (Member) commented May 29, 2017

Added tslint config and npm run tslint script. Modified current code to pass lint tests.

Currently the config disables the bitwise operation and max classes per file checks. I also ignored the long line test for all of the nullable primitive vectors, since that should be replaced anyway in the near future.
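For illustration, a `tslint.json` that turns off those two checks might look roughly like the fragment below. This is a sketch, not the file from this PR: the `extends` preset is an assumption, and only the two rule names (`no-bitwise`, `max-classes-per-file`) correspond to the checks mentioned above.

```json
{
  "extends": "tslint:recommended",
  "rules": {
    "no-bitwise": false,
    "max-classes-per-file": false
  }
}
```

The long-line exemption for the nullable primitive vectors could be done per line with a `// tslint:disable-next-line:max-line-length` comment rather than in the config, though the PR text does not say which mechanism was used.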

wesm (Member) commented May 31, 2017

Can you rebase on master? Sorry, we force pushed to include the release tag and GitHub doesn't like it

TheNeuralBit (Member, Author) commented

@wesm - no problem, I just rebased

xhochy (Member) left a comment

+1, LGTM

asfgit closed this in 092afb6 on Jun 2, 2017
jeffknupp pushed a commit to jeffknupp/arrow that referenced this pull request Jun 3, 2017
Added tslint config and `npm run tslint` script. Modified current code to pass lint tests.

Currently the config disables the bitwise operation and max classes per file checks. I also ignored the long line test for all of the nullable primitive vectors, since that should be replaced anyway in the near future.

Author: Brian Hulette <brian.hulette@ccri.com>

Closes apache#718 from TheNeuralBit/tslint and squashes the following commits:

727d80f [Brian Hulette] added npm lint script
68b7c8f [Brian Hulette] misc tslint fixes
6f1583e [Brian Hulette] sort object literals
67d82ca [Brian Hulette] variable names, object shorthand
0a1b872 [Brian Hulette] fix public, private, protected ordering issues
08d60e6 [Brian Hulette] quotes, equality checks
c5d85f7 [Brian Hulette] whitespace, semicolons, Errors
2b4ff28 [Brian Hulette] added public, private, protected to all members
81c9868 [Brian Hulette] Replace vars with let/const, one def per line
d591b3c [Brian Hulette] add tslint config
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
## What's Changed

This PR relates to apache#698 and is the second in a series intended to
provide full Avro read / write support in native Java. It adds
round-trip tests for both schemas (Arrow schema -> Avro -> Arrow) and
data (Arrow VSR -> Avro block -> Arrow VSR). It also adds a number of
fixes and improvements to the Avro Consumers so that data arrives back
in its original form after a round trip. The main changes are:

* Added a top level method in AvroToArrow to convert Avro schema
directly to Arrow schema (this may exist elsewhere, but is needed to
provide an API that matches the logic of this implementation)
* Avro unions of [ type, null ] or [ null, type ] now have special
handling: they are interpreted as a single nullable type rather than a
union. Setting legacyMode = false in the AvroToArrowConfig object is
required to enable this behaviour; otherwise unions are interpreted
literally. Unions with more than 2 elements are always interpreted
literally (but, per apache#108, in practice Java's current Union
implementation is probably not usable with Avro at the moment).
* Added support for new logical types (decimal 256, timestamp nano and 3
local timestamp types)
* Existing timestamp-millis and timestamp-micros types are now
interpreted as zone-aware. Previously they were interpreted as local;
now only the local timestamp types are interpreted as local, which I
think is correct per the [Avro
spec](https://avro.apache.org/docs/1.12.0/specification/#timestamps).
Requires setting legacyMode = false.
* Removed namespaces from generated Arrow field names in complex types.
E.g. the Avro field myNamespace.outerRecord.structField.intField should
be called just "intField" inside the Arrow struct. This doesn't affect
the skip field logic, which still works using the qualified names. This
requires setting legacyMode = false.
* Remove unexpected metadata in generated Arrow fields (empty alias
lists and attributes interpreted as part of the field schema). This
requires setting legacyMode = false.
* Use the expected child vector names for Arrow LIST and MAP types when
reading. For LIST, the default child vector is called "$data$" which is
illegal in Avro, so the child field name is also changed to "item" in
the producers. This requires setting legacyMode = false.
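
The union and field-name rules in the list above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code from this PR: the class `AvroRoundTripSketch` and both methods are hypothetical, Avro types are modelled as plain strings, and the `legacyMode` parameter merely mirrors the flag described for AvroToArrowConfig.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the Arrow or Avro APIs. */
public final class AvroRoundTripSketch {

    private static final String NULL_TYPE = "null";

    private AvroRoundTripSketch() {}

    /**
     * Returns the single non-null branch when the union is exactly
     * [type, null] or [null, type] and legacy mode is off. An empty
     * result means the union should be interpreted literally.
     */
    public static Optional<String> nullableBranch(List<String> unionTypes, boolean legacyMode) {
        if (legacyMode || unionTypes.size() != 2) {
            return Optional.empty();
        }
        String first = unionTypes.get(0);
        String second = unionTypes.get(1);
        if (NULL_TYPE.equals(first) && !NULL_TYPE.equals(second)) {
            return Optional.of(second);
        }
        if (NULL_TYPE.equals(second) && !NULL_TYPE.equals(first)) {
            return Optional.of(first);
        }
        return Optional.empty();
    }

    /**
     * Drops the namespace qualifier from a field name when legacy mode is
     * off, e.g. "outerRecord.structField.intField" becomes "intField".
     * In legacy mode the qualified name is returned unchanged.
     */
    public static String arrowFieldName(String qualifiedName, boolean legacyMode) {
        if (legacyMode) {
            return qualifiedName;
        }
        int lastDot = qualifiedName.lastIndexOf('.');
        return lastDot < 0 ? qualifiedName : qualifiedName.substring(lastDot + 1);
    }

    public static void main(String[] args) {
        // A two-element union with null collapses to a nullable type.
        System.out.println(nullableBranch(List.of("null", "int"), false));
        // Three or more branches are always treated as a literal union.
        System.out.println(nullableBranch(List.of("null", "int", "string"), false));
        // Namespaces are stripped only when legacy mode is off.
        System.out.println(arrowFieldName("myNamespace.outerRecord.structField.intField", false));
    }
}
```

Returning an empty `Optional` for the "interpret literally" case keeps the legacy path and the multi-branch path on the same code route, which matches the description that legacy behaviour remains the default.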

Breaking changes have been removed from this PR.

Per discussion below, all breaking changes are now behind a "legacyMode"
flag in the AvroToArrowConfig object, which is enabled by default in all
the original code paths.

Closes apache#698.

This change is meant to allow for round trip of schemas and individual
Avro data blocks (one Avro data block -> one VSR). File-level
capabilities are not included. I have not included anything to recycle
the VSR as part of the read API; this feels like it belongs with the
file-level piece. I also have not done anything specific for enums or
dictionary encoding yet.