V1.10.x and stealth add v2 features #267

Open. Wants to merge 33 commits into base: main.

Commits:
3f32459 towards v1.10.0 (tilo, Dec 16, 2023)
996ac26 change legacy behavior (tilo, Dec 16, 2023)
a831fe4 v1.10.0 (tilo, Dec 16, 2023)
2a421ae update (tilo, Dec 16, 2023)
0bf151e test (tilo, Dec 17, 2023)
ce1c54d Change in header de-duplication; Refactoring (#264) (tilo, Dec 28, 2023)
6a606e0 Merge branch 'main' into v1.10.0_branch (tilo, Dec 28, 2023)
503b5b7 restructure tests (#265) (tilo, Dec 30, 2023)
11a9590 added v2 header_transformations and :v2_mode option (tilo, Dec 31, 2023)
ba0be3a update header_transformations; adding tests (tilo, Dec 31, 2023)
e476101 update (tilo, Dec 31, 2023)
5e2bd6d calling header_validations from smarter_csv.rb (tilo, Dec 31, 2023)
f5d4f44 verbose output moved (tilo, Dec 31, 2023)
6f9ba69 small refactor (tilo, Dec 31, 2023)
42f260b performance improvements (tilo, Dec 31, 2023)
b5d7f0b adding Ruby 3.3 (tilo, Dec 31, 2023)
25a2dab merge main into branch (tilo, Jan 1, 2024)
61a2a18 cleanup (tilo, Jan 2, 2024)
7168f2c improve header_validations (tilo, Jan 2, 2024)
a089f0e improve key_mapping (tilo, Jan 4, 2024)
963f2b6 more tests for :key_mapping and :remove_unmapped_keys (tilo, Jan 5, 2024)
a26b5a8 Merge branch 'main' into v1.10_add_v2_features (tilo, Jan 8, 2024)
64ce70d rubocop (tilo, Jan 8, 2024)
055ca09 tests for v2 header_validations (tilo, Jan 8, 2024)
5566dfa v1 and v2 option handling; updated header and hash transformations (tilo, Jan 10, 2024)
66e6ac9 simplify (tilo, Jan 10, 2024)
7c1b501 update (tilo, Jan 10, 2024)
5aeff57 1.11.0.pre1 (tilo, Jan 13, 2024)
7ec9774 improve count_quote_chars (tilo, Jan 14, 2024)
e04bb16 Merge branch 'main' into v1.10_add_v2_features (tilo, Mar 2, 2024)
0d09992 fix options processing (PR #273) (tilo, Mar 9, 2024)
a3f5003 rspec cleanup (tilo, Apr 7, 2024)
eb30e70 fix error (tilo, Apr 7, 2024)
1 change: 1 addition & 0 deletions .rspec
@@ -0,0 +1 @@
--require spec_helper
12 changes: 12 additions & 0 deletions .rubocop.yml
@@ -88,12 +88,18 @@ Style/IfInsideElse:
Style/IfUnlessModifier:
Enabled: false

Style/InverseMethods:
Enabled: false

Style/NestedTernaryOperator:
Enabled: false

Style/PreferredHashMethods:
Enabled: false

Style/Proc:
Enabled: false

Style/NumericPredicate:
Enabled: false

@@ -129,6 +135,9 @@ Style/SymbolProc: # old Ruby versions can't do this
Style/TrailingCommaInHashLiteral:
Enabled: false

Style/TrailingCommaInArrayLiteral:
Enabled: false

Style/TrailingUnderscoreVariable:
Enabled: false

@@ -138,6 +147,9 @@ Style/TrivialAccessors:
# Style/UnlessModifier:
# Enabled: false

Style/WordArray:
Enabled: false

Style/ZeroLengthPredicate:
Enabled: false

33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,39 @@

# SmarterCSV 1.x Change Log

## T.B.D.

* code refactor

* NEW BEHAVIOR:
- hidden `:v2_mode` options (incomplete!)
- pre-processing for v2 options
- implemented v2 `:header_transformations` (DO NOT USE YET!)
+ -> check if all v1 transformations are correctly done
How are we going to
* disambiguate headers?


* do key_mapping? -> seems to work
- remove_unmapped_keys ?
- silence missing keys ... a missing mapped key should raise an exception, except when silenced
- required_keys needs to be a header-validation


* keep original headers? -> :none
* do strings_as_* ? -> either :keys_as_symbols, :keys_as_strings
* remove quote_chars? -> included in keys_as_*
* strip whitespace? -> included in keys_as_*

TODO:

- add tests for header_validations

- modify options to handle v1 and v2 options
- add v1 defaults in v2 processing
- add tests for all options processing
- 100% backwards compatibility when working in v1 mode

## 1.10.2 (2024-02-11)
* improve error message for missing keys

2 changes: 2 additions & 0 deletions lib/smarter_csv.rb
@@ -1,5 +1,7 @@
# frozen_string_literal: true

require 'set'

require "smarter_csv/version"
require "smarter_csv/file_io"
require "smarter_csv/options_processing"
160 changes: 120 additions & 40 deletions lib/smarter_csv/hash_transformations.rb
@@ -2,7 +2,16 @@

module SmarterCSV
class << self
# this transforms the hash for each data row read from the input file
def hash_transformations(hash, options)
if options[:v2_mode]
hash_transformations_v2(hash, options)
else
hash_transformations_v1(hash, options)
end
end

def hash_transformations_v1(hash, options)
# there may be unmapped keys, or keys purposely mapped to nil or an empty key.
# make sure we delete any key/value pairs from the hash that the user wanted to delete:
remove_empty_values = options[:remove_empty_values] == true
@@ -33,46 +42,117 @@ def hash_transformations(hash, options)
end
end

# def hash_transformations(hash, options)
# # there may be unmapped keys, or keys purposedly mapped to nil or an empty key..
# # make sure we delete any key/value pairs from the hash, which the user wanted to delete:
# hash.delete(nil)
# hash.delete('')
# hash.delete(:"")

# if options[:remove_empty_values] == true
# hash.delete_if{|_k, v| has_rails ? v.blank? : blank?(v)}
# end

# hash.delete_if{|_k, v| !v.nil? && v =~ /^(0+|0+\.0+)$/} if options[:remove_zero_values] # values are Strings
# hash.delete_if{|_k, v| v =~ options[:remove_values_matching]} if options[:remove_values_matching]

# if options[:convert_values_to_numeric]
# hash.each do |k, v|
# # deal with the :only / :except options to :convert_values_to_numeric
# next if limit_execution_for_only_or_except(options, :convert_values_to_numeric, k)

# # convert if it's a numeric value:
# case v
# when /^[+-]?\d+\.\d+$/
# hash[k] = v.to_f
# when /^[+-]?\d+$/
# hash[k] = v.to_i
# end
# end
# end

# if options[:value_converters]
# hash.each do |k, v|
# converter = options[:value_converters][k]
# next unless converter

# hash[k] = converter.convert(v)
# end
# end

# hash
# end
def hash_transformations_v2(hash, options)
return hash if options[:hash_transformations].nil? || options[:hash_transformations].empty?

# do the hash transformations the user requested:
if options[:hash_transformations]
options[:hash_transformations].each do |transformation|
if transformation.respond_to?(:call) # this is used when a user-provided Proc is passed in
hash = transformation.call(hash, options)
else
case transformation
when Symbol # this is used for pre-defined transformations that are defined in the SmarterCSV module
hash = public_send(transformation, hash, options)
when Hash # this is called for hash arguments, e.g. hash_transformations
trans, args = transformation.first # .first treats the hash first element as an array
hash = apply_transformation(trans, hash, args, options)
when Array # this can be used for passing additional arguments in array form (e.g. into a Proc)
trans, *args = transformation
hash = apply_transformation(trans, hash, args, options)
else
raise SmarterCSV::IncorrectOption, "Invalid transformation type: #{transformation.class}"
end
end
end
end

hash
end

#
# To handle v1-backward-compatible behavior, it is faster to roll all behavior into one method
#
def v1_backwards_compatibility(hash, options)
hash.each_with_object({}) do |(k, v), new_hash|
next if k.nil? || k == '' || k == :"" # remove_empty_keys
next if has_rails ? v.blank? : blank?(v) # remove_empty_values

# convert_values_to_numeric:
# deal with the :only / :except options to :convert_values_to_numeric
unless limit_execution_for_only_or_except(options, :convert_values_to_numeric, k)
if v =~ /^[+-]?\d+\.\d+$/
v = v.to_f
elsif v =~ /^[+-]?\d+$/
v = v.to_i
end
end

new_hash[k] = v
end
end

#
# Building Blocks in case you want to build your own flow:
#

def value_converters(hash, _options)
#
# TO BE IMPLEMENTED
#
hash # return the hash unchanged for now, so the transformation chain does not receive nil
end

def strip_spaces(hash, _options)
hash.each_key {|key| hash[key].strip! unless hash[key].nil? } # &. syntax was introduced in Ruby 2.3 - need to stay backwards compatible
end

def remove_blank_values(hash, _options)
hash.each_key {|key| hash.delete(key) if hash[key].nil? || hash[key].is_a?(String) && hash[key] !~ /[^[:space:]]/ }
end

def remove_zero_values(hash, _options)
hash.each_key {|key| hash.delete(key) if hash[key].is_a?(Numeric) && hash[key].zero? }
end

def remove_empty_keys(hash, _options)
# delete_if always returns the hash; reject! returns nil when nothing was removed
hash.delete_if{|key, _v| key.nil? || key.empty?}
end

def convert_values_to_numeric(hash, _options)
hash.each_key do |k|
case hash[k]
when /^[+-]?\d+\.\d+$/
hash[k] = hash[k].to_f
when /^[+-]?\d+$/
hash[k] = hash[k].to_i
end
end
end

def convert_values_to_numeric_unless_leading_zeroes(hash, _options)
hash.each_key do |k|
case hash[k]
when /^[+-]?[1-9]\d*\.\d+$/
hash[k] = hash[k].to_f
when /^[+-]?[1-9]\d*$/
hash[k] = hash[k].to_i
end
end
end

# IMPORTANT NOTE:
# this can lead to cases where a nil or empty value gets converted into 0 or 0.0,
# and can then not be properly removed!
#
# you should first try to use convert_values_to_numeric or convert_values_to_numeric_unless_leading_zeroes
#
def convert_to_integer(hash, _options)
hash.each_key {|key| hash[key] = hash[key].to_i }
end

def convert_to_float(hash, _options)
hash.each_key {|key| hash[key] = hash[key].to_f }
end

protected

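Review note: the dispatch in `hash_transformations_v2` accepts four shapes of transformation (a callable, a Symbol, a single-pair Hash, or an Array). The following is a minimal standalone sketch of that flow, not the SmarterCSV API. `TransformSketch`, `run`, and `only_keys` are hypothetical names introduced only for illustration, and `ArgumentError` stands in for `SmarterCSV::IncorrectOption`. The two building blocks mirror the ones in this diff.

```ruby
# Standalone sketch; TransformSketch and its methods are hypothetical stand-ins,
# not part of SmarterCSV.
module TransformSketch
  class << self
    # Apply each transformation in order and feed the result into the next one.
    def run(hash, transformations, options = {})
      transformations.each do |t|
        hash =
          if t.respond_to?(:call)   # user-provided Proc or lambda
            t.call(hash, options)
          elsif t.is_a?(Symbol)     # name of a pre-defined building block
            public_send(t, hash, options)
          elsif t.is_a?(Hash)       # { name => args }
            name, args = t.first
            public_send(name, hash, args, options)
          elsif t.is_a?(Array)      # [name, arg1, arg2, ...]
            name, *args = t
            public_send(name, hash, args, options)
          else
            raise ArgumentError, "Invalid transformation type: #{t.class}"
          end
      end
      hash
    end

    # Drops nil and whitespace-only String values; delete_if always returns the hash.
    def remove_blank_values(hash, _options)
      hash.delete_if { |_k, v| v.nil? || (v.is_a?(String) && v !~ /[^[:space:]]/) }
    end

    # Converts numeric-looking Strings, but leaves values with a leading zero untouched.
    def convert_values_to_numeric_unless_leading_zeroes(hash, _options)
      hash.each_key do |k|
        case hash[k]
        when /^[+-]?[1-9]\d*\.\d+$/ then hash[k] = hash[k].to_f
        when /^[+-]?[1-9]\d*$/ then hash[k] = hash[k].to_i
        end
      end
    end

    # Hypothetical parameterized block: keep only the listed keys.
    def only_keys(hash, keys, _options)
      hash.select { |k, _v| Array(keys).include?(k) }
    end
  end
end

row = { name: 'Ann', zip: '02134', age: '42', score: '3.50', note: '  ' }
result = TransformSketch.run(
  row,
  [
    :remove_blank_values,
    :convert_values_to_numeric_unless_leading_zeroes,
    { only_keys: [:name, :zip, :age] },
    ->(hash, _options) { hash.merge(source: 'sketch') },
  ]
)
# result == { name: 'Ann', zip: '02134', age: 42, source: 'sketch' }
```

Every step must return the hash, because the return value replaces `hash` for the next step. Note also that the Hash form passes the pair's value as `args`, while the Array form passes the remaining elements as an Array, so a parameterized block may see different argument shapes depending on how it was specified.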
104 changes: 103 additions & 1 deletion lib/smarter_csv/header_transformations.rb
@@ -2,8 +2,18 @@

module SmarterCSV
class << self
# transform the headers that were in the file:
# this is processing the headers from the input file
def header_transformations(header_array, options)
if options[:v2_mode]
header_transformations_v2(header_array, options)
else
header_transformations_v1(header_array, options)
end
end

# ---- V1.x Version: transform the headers that were in the file: ------------------------------------------
#
def header_transformations_v1(header_array, options)
header_array.map!{|x| x.gsub(%r/#{options[:quote_char]}/, '')}
header_array.map!{|x| x.strip} if options[:strip_whitespace]

@@ -57,7 +67,99 @@ def remap_headers(headers, options)
header
end
end

headers
end

# ---- V2.x Version: transform the headers that were in the file: ------------------------------------------
#
def header_transformations_v2(header_array, options)
return header_array if options[:header_transformations].nil? || options[:header_transformations].empty?

# do the header transformations the user requested:
if options[:header_transformations]
options[:header_transformations].each do |transformation|
if transformation.respond_to?(:call) # this is used when a user-provided Proc is passed in
header_array = transformation.call(header_array, options)
else
case transformation
when Symbol # this is used for pre-defined transformations that are defined in the SmarterCSV module
header_array = public_send(transformation, header_array, options)
when Hash # this is called for hash arguments, e.g. header_transformations
trans, args = transformation.first # .first treats the hash first element as an array
header_array = apply_transformation(trans, header_array, args, options)
when Array # this can be used for passing additional arguments in array form (e.g. into a Proc)
trans, *args = transformation
header_array = apply_transformation(trans, header_array, args, options)
else
raise SmarterCSV::IncorrectOption, "Invalid transformation type: #{transformation.class}"
end
end
end
end

header_array
end

def apply_transformation(transformation, header_array, args, options)
if transformation.respond_to?(:call)
# If transformation is a callable object (like a Proc)
transformation.call(header_array, args, options)
else
# If transformation is a symbol (method name)
public_send(transformation, header_array, args, options)
end
end

# pre-defined v2 header transformations:

# these are some pre-defined header transformations which can be used
# all these take the headers array as the input
#
# the computed options can be accessed via @options

def keys_as_symbols(headers, options)
headers.map do |header|
header.strip.downcase.gsub(%r{#{options[:quote_char]}}, '').gsub(/(\s|-)+/, '_').to_sym
end
end

def keys_as_strings(headers, options)
headers.map do |header|
header.strip.gsub(%r{#{options[:quote_char]}}, '').downcase.gsub(/(\s|-)+/, '_')
end
end

def downcase_headers(headers, _options)
headers.map do |header|
header.strip.downcase # downcase! returns nil when the string is already lowercase
end
end

def key_mapping(headers, mapping = {}, options)
raise(SmarterCSV::IncorrectOption, "ERROR: incorrect format for key_mapping! Expecting hash with from -> to mappings") if !mapping.is_a?(Hash) || mapping.empty?

headers_set = headers.to_set
mapping_keys_set = mapping.keys.to_set
# assuming, as in v1, that :silence_missing_keys is either true (silence all) or an Array of keys to silence
silence = options[:silence_missing_keys]
silence_keys_set = silence.is_a?(Array) ? silence.to_set : Set.new

# Check for missing keys
missing_keys = mapping_keys_set - headers_set - silence_keys_set
raise SmarterCSV::KeyMappingError, "ERROR: cannot map headers: #{missing_keys.to_a.join(', ')}" if missing_keys.any? && silence != true

# Apply key mapping, retaining nils for explicitly mapped headers
headers.map do |header|
if mapping.key?(header)
# Maps the key according to the mapping, including nil mapping
mapping[header]
elsif options[:remove_unmapped_keys]
# Remove headers not specified in the mapping
nil
else
# Keep the original header if not specified in the mapping
header
end
end
end
end
end
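Review note: the mapping rules in `key_mapping` (explicit nil mappings are kept, unmapped headers are either kept or replaced by nil, missing mapped keys raise unless silenced) can be checked in isolation. The sketch below is a hypothetical standalone function, not the method from this PR. `map_headers` and its keyword arguments `remove_unmapped:` and `silence:` are illustrative names, `ArgumentError` and `KeyError` stand in for the SmarterCSV error classes, and it assumes `silence` is either `true` (silence all missing keys) or an Array of keys to silence.

```ruby
require 'set'

# Hypothetical standalone version of the mapping step; not the SmarterCSV API.
def map_headers(headers, mapping, remove_unmapped: false, silence: false)
  raise ArgumentError, 'mapping must be a non-empty Hash' unless mapping.is_a?(Hash) && !mapping.empty?

  # Assumption: silence is true (silence everything) or an Array of keys to silence.
  silenced = silence.is_a?(Array) ? silence.to_set : Set.new
  missing = mapping.keys.to_set - headers.to_set - silenced
  raise KeyError, "cannot map headers: #{missing.to_a.join(', ')}" if missing.any? && silence != true

  headers.map do |header|
    if mapping.key?(header)
      mapping[header] # may be nil: the column was explicitly mapped away
    elsif remove_unmapped
      nil             # header not mentioned in the mapping gets dropped
    else
      header          # header not mentioned in the mapping is kept as-is
    end
  end
end

headers = [:first_name, :last_name, :internal_id]

mapped = map_headers(headers, { first_name: :first, internal_id: nil })
# mapped == [:first, :last_name, nil]

strict = map_headers(headers, { first_name: :first }, remove_unmapped: true)
# strict == [:first, nil, nil]
```

The nil entries are placeholders rather than deletions, so the result stays aligned with the column positions. Removing the nil keys is left to a later step, such as the nil/empty key removal in the hash transformations.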