Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use synonyms cvs file to generate extra suggest records #17

Merged
merged 3 commits into from
Dec 16, 2024

Conversation

kad-dirc
Copy link
Collaborator

@kad-dirc kad-dirc commented Dec 12, 2024

Description

use synonyms cvs file to generate extra suggest records

Type of change

  • New feature

Checklist:

  • I've double-checked the code in this PR myself
  • I've left the code better than before (boy scout rule)
  • The code is readable, comments are added that explain hard or non-obvious parts.
  • I've expanded/improved the (unit) tests, when applicable
  • I've run (unit) tests that prove my solution works
  • There's no sensitive information like credentials in my PR

if err != nil {
return nil, err
}
var fieldValuesByNameExtensions = make(map[string][]string)
synonyms, err := readSubstitutionsFile(synonymsFile)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zou hier readCsvFile van maken aangezien het voor zowel subst als synonyms wordt gebruikt

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goed idee. DONE

}
}
}

// Todo: avoid getting duplicates (for performance improvement)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moet dit nog?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Het is meer bedoelt remark. Ik verwacht geen performance issues, maar in theorie kan het efficienter.

@@ -134,3 +163,16 @@ func readSubstitutionsFile(filepath string) (map[string]string, error) {
}
return substitutions, err
}

func reverseMap(m map[string]string) (map[string]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In gomapgpie (en gokoala want daar komt het uit) is al een util.Inverse() function beschikbaar om een map om te draaien. Misschien kan je die gebruiken? Zo niet dan geen big deal, maar zou mooi zijn als het kan.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Die had ik niet gezien, super! DONE

if !tt.wantErr(t, err, fmt.Sprintf("generateAllFieldValues(%v, %v, %v)", tt.args.fieldValuesByName, tt.args.substitutionsFile, tt.args.synonymsFile)) {
return
}
assert.Equalf(t, tt.want, got, "generateAllFieldValues(%v, %v, %v)", tt.args.fieldValuesByName, tt.args.substitutionsFile, tt.args.synonymsFile)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gebruik eigenlijk altijd assert.Equal ipv assert.Equalf, is er een reden om de formatting variant te gebruiken?

valueLower := strings.ToLower(value.(string))

// Get all substitutions
substitutedValues, err := extentValues([]string{valueLower}, substitutions)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extent vs extend (filename is extend). Moet extend zijn volgens mij (snap de extent wel gezien de geo werkzaamheden 😉 )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) deze zat op meer plekken. DONE

func generateFieldValuesSubstitutions(fieldValuesByName map[string]any, substitutionFile string) ([]map[string]any, error) {
substitutions, err := readSubstitutionsFile(substitutionFile)
// Return slice of fieldValuesByName
func generateAllFieldValues(fieldValuesByName map[string]any, substitutionsFile, synonymsFile string) ([]map[string]any, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misschien zou je dit extendFieldValues kunnen noemen? Dan sluit het wat meer aan bij de filename.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goed idee! DONE

}
return generateAllFieldValuesByName(fieldValuesByNameExtensions), err
return generateAllFieldValuesByName(fieldValuesByNameWithAllValues), err
}

// Transform a map[string][]string into a []map[string]string using the cartesian product, i.e.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zit niet in deze change maar een recursive functie kan je bijna altijd ook zonder recursie schrijven. generateCombinations is recursief, heb een poging gedaan deze zonder recursie op te lossen:

func generateCombinations(keys []string, values [][]string) []map[string]any {
	if len(keys) == 0 || len(values) == 0 {
		return nil
	}
	result := []map[string]any{{}} // contains empty map so the first iteration works
	for keyDepth := 0; keyDepth < len(keys); keyDepth++ {
		var newResult []map[string]any
		for _, entry := range result {
			for _, val := range values[keyDepth] {
				newEntry := make(map[string]any)
				for k, v := range entry {
					newEntry[k] = v
				}
				newEntry[keys[keyDepth]] = val
				newResult = append(newResult, newEntry)
			}
		}
		result = newResult
	}
	return result
}

En de test:

func Test_generateCombinations(t *testing.T) {
	type args struct {
		keys   []string
		values [][]string
	}
	tests := []struct {
		name string
		args args
		want []map[string]any
	}{
		{"Single key, single value", args{[]string{"key1"}, [][]string{{"value1"}}}, []map[string]any{{"key1": "value1"}}},
		{"Single key, slice of values", args{[]string{"key1"}, [][]string{{"value1", "value2"}}}, []map[string]any{{"key1": "value1"}, {"key1": "value2"}}},
		{"Two keys, two single values", args{[]string{"key1", "key2"}, [][]string{{"value1"}, {"value2"}}}, []map[string]any{{"key1": "value1", "key2": "value2"}}},
		{"Two keys, slice + single value", args{[]string{"key1", "key2"}, [][]string{{"value1", "value2"}, {"value3"}}}, []map[string]any{{"key1": "value1", "key2": "value3"}, {"key1": "value2", "key2": "value3"}}},
		{"Two keys, two slices values", args{[]string{"key1", "key2"}, [][]string{{"value1", "value2"}, {"value3", "value4"}}}, []map[string]any{{"key1": "value1", "key2": "value3"}, {"key1": "value1", "key2": "value4"}, {"key1": "value2", "key2": "value3"}, {"key1": "value2", "key2": "value4"}}},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			assert.Equalf(t, tt.want, generateCombinations(tt.args.keys, tt.args.values), "generateCombinations(%v, %v)", tt.args.keys, tt.args.values)
		})
	}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mooi en clean! Heb hem overgenomen

@@ -133,3 +161,28 @@ func Test_readSubstitutions(t *testing.T) {
})
}
}

func Test_reverseMap1(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Map1 ipv Map.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

niet meer relevant nu ik de util.Inverse() gebruik

@kad-dirc kad-dirc force-pushed the PDOK-17192/synoniem-subst branch from 849aa9a to 9ef1454 Compare December 16, 2024 09:45
@kad-dirc kad-dirc merged commit 22eb8bd into master Dec 16, 2024
3 checks passed
@rkettelerij rkettelerij deleted the PDOK-17192/synoniem-subst branch December 17, 2024 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants