Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify fields.yml generator #7713

Merged
merged 26 commits into from
Jul 26, 2018

Conversation

kvch
Copy link
Contributor

@kvch kvch commented Jul 24, 2018

This PR simplifies the process of generating a new fields.yml file for a Beat.

  • libbeat/fields.yml is not a dependency anymore
  • only fields.yml is written to disk

Changes in the asset files are due to different number of newlines in generated fields.yml files.

Interface changes of generator tools

Both dev-tools/cmd/asset/asset.go and libbeat/scripts/cmd/global_fields/main.go is changed. The first one has two new flags: -in, and -out. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag -out which specifies the output file. If it's not set, the output is written to stdout.

This makes it possible to generate asset files for Metricbeat without writing libbeat/fields.yml to disk. It will be useful in the future when more Beats have modular assets.

Closes #7671

@kvch kvch added review libbeat :Generator Related to code generators for building custom Beats or modules. labels Jul 24, 2018
@kvch kvch requested a review from ruflin July 24, 2018 19:26
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turn-around.

libbeat/Makefile Outdated
@@ -6,4 +6,4 @@ include scripts/Makefile

# Collects all dependencies and then calls update
.PHONY: collect
collect: libbeat_fields
collect: fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to give update a direct dependency on fields?

@@ -309,13 +309,9 @@ coverage-report:
fields: mage
mage -v fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you set this line to @mage fields. I was debugging when changed it to its current state and didn't undo my changes.

@andrewkroh
Copy link
Member

I notice a missing newline from one of the commands. Maybe you can track it down.

ERROR: Field <beat.name> is duplicated. Please update and try again.exit status 1
                                                                   ^

}

if len(beatFieldsPaths) == 0 {
if len(beatFieldsPaths) == 0 && *esBeatsPath == *beatPath {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What condition is *esBeatsPath == *beatPath checking for?

Is this an assertion that the command is not being run from the elastic/beats dir? If so, comparing the two path strings without normalization isn't advisable; at a minimum I would filepath.Clean them. Using os.SameFile would be the most reliable why to check that condition, but why event check it at all.

}

err = fields.Generate(*esBeatsPath, *beatPath, fieldsFiles)
err := fields.Generate(*esBeatsPath, *beatPath, fieldsFiles)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about internalizing the collection of module files to the fields.Generate(). Make it accept a list of module directories like Generate(elasticBeatsDir, beatDir string, moduleDirs ...string) ([]byte, error).

Note that I added a []byte return which is the actual content. Then the caller becomes responsible for writing the content which makes it a bit more reusable. I think the YAML content is small enough to be rendered to memory without any issues.

commonFields = append(
[]string{
filepath.Join(esBeatsPath, "libbeat", "_meta", "fields.common.yml"),
filepath.Join(beatPath, "_meta", "fields.yml"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think _meta/fields.yml is not used anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a place that has some old references to fields files that needs updated to point to plain old fields.yml.

if fields_doc is None:
fields_doc = self.beat_path + "/_meta/fields.generated.yml"
def extract_fields(doc_list, name):
fields = []
dictfields = []
if doc_list is None:
return fields, dictfields
for field in doc_list:
# Skip fields without name entry
if "name" not in field:
continue
# Chain together names
if name != "":
newName = name + "." + field["name"]
else:
newName = field["name"]
if field.get("type") == "group":
subfields, subdictfields = extract_fields(field["fields"], newName)
fields.extend(subfields)
dictfields.extend(subdictfields)
else:
fields.append(newName)
if field.get("type") in ["object", "geo_point"]:
dictfields.append(newName)
return fields, dictfields
# Not all beats have a fields.generated.yml. Fall back to fields.yml
if not os.path.isfile(fields_doc):
fields_doc = self.beat_path + "/_meta/fields.yml"

@andrewkroh
Copy link
Member

These lines should be removed now:

echo "Building libbeat fields.yml"
cd libbeat
exec { mage fields }
cd ..

@@ -73,7 +73,7 @@ test-module: python-env update metricbeat.test
. ${PYTHON_ENV}/bin/activate && INTEGRATION_TESTS=1 nosetests tests/system/test_${MODULE}.py

.PHONY: assets
assets: libbeat_fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I don't think Metricbeat is free from this dependency yet due to the way that the fields.go files are composed.

One solution would be to include both metricbeat common fields and libbeat fields in metricbeat/include/fields.go. This could be generated by running global_fields without passing in the module dir as an argument.

It's be nice if we could pipe the output of one command into another like:

go run ${ES_BEATS}/libbeat/scripts/cmd/global_fields/main.go -es_beats_path ${ES_BEATS} -beat_path ${BEAT_NAME} | go run ${ES_BEATS}/dev-tools/cmd/asset/asset.go -pkg include -in - -out include/fields.go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's hope this becomes obsolete again in the near future :-)

@ruflin
Copy link
Member

ruflin commented Jul 25, 2018

@elastic/apm-server FYI, this change is happening in master.

@kvch kvch added the needs_backport PR is waiting to be backported to other branches. label Jul 25, 2018
os.Exit(2)
if len(beatFieldsPaths) == 0 && os.SameFile(esBeatsInfo, beatInfo) {
if !*toStdout {
fmt.Println("No field files to collect")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might consider using the stdlib log package in the file and the util packages. Just initialize it with log.SetFlags() to remove the timestamps. All messages will go to stderr all the time.

And all the places where it logs and exits can use a log.Fatal*.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather keep it as it is right now, because this is the "idiomatic" Beats way when printing to console during scripts or commands. I see that you started to use log in mage files, but apart from that fmt package is used all over the codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewkroh Wasn't aware of this feature. We definitively should start using this in the future.

}
esBeatsInfo, err := esBeats.Stat()
if err != nil {
fmt.Fprintf(os.Stderr, "Error getting file info of elastic/beat: %+v\n", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elastic/beats (It's missing the s.)

@@ -29,6 +29,7 @@ import (
func main() {
esBeatsPath := flag.String("es_beats_path", "..", "Path to elastic/beats")
beatPath := flag.String("beat_path", ".", "Path to your Beat")
toStdout := flag.Bool("stdout", false, "Write output to stdout")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd prefer to standardize on using -out to specify the output file in our CLI tools. And make the default be - mean output to os.Stdout like most unix tools. Then you get the same behavior across all tools. The downside being that you have to write -out fields.yml in the common case, but the common case is almost always done via make so it's not actually a problem.

var err error
if !isLibbeat(beatPath) {
commonFields = append(commonFields,
filepath.Join(esBeatsPath, "libbeat", "_meta", "fields.common.yml"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW filepath.Join always calls filepath.Clean on the result. https://golang.org/pkg/path/filepath/#Join

This means you can write filepath.Join(esBeatsPath, "libbeat/_meta/fields.common.yml") which is more clear IMO.

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Can you rebase this on master to get my changes to fix go test on Windows.

fmt.Fprintf(os.Stderr, "Cannot generate global fields.yml for %s: %+v\n", name, err)
os.Exit(2)
if len(beatFieldsPaths) == 0 && os.SameFile(esBeatsInfo, beatInfo) {
if *output != "-" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be *input?


func init() {
pkg = flag.String("pkg", "", "Package name")
input = flag.String("in", "-", "Source of input. \"-\" means reading from stdin")
Copy link
Member

@andrewkroh andrewkroh Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to switch this file over to flag.StringVar so we don't have to dereference string pointers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kvch Want to do this in this PR or follow up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I forgot to push this commit.

@kvch kvch force-pushed the refactor/libbeat/one-fields-yml branch from 9cac467 to 6ff6069 Compare July 26, 2018 07:29
@kvch
Copy link
Contributor Author

kvch commented Jul 26, 2018

Rebased and refactored.

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change LGTM, only a few minor questions.

@@ -71,7 +71,7 @@ func CollectModuleFiles(modulesDir string) ([]*YmlFile, error) {
func CollectFiles(module string, modulesPath string) ([]*YmlFile, error) {

var files []*YmlFile
fieldsYmlPath := filepath.Join(modulesPath, module, "_meta", "fields.yml")
fieldsYmlPath := filepath.Join(modulesPath, module, "_meta/fields.yml")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason you introduced the / here? Could break on Windows. Any issues with the previous option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Andrew pointed out that it is working everywhere in #7713 (comment)

@@ -91,7 +91,7 @@ func CollectFiles(module string, modulesPath string) ([]*YmlFile, error) {
if !s.IsDir() {
continue
}
fieldsYmlPath = filepath.Join(modulesPath, module, s.Name(), "_meta", "fields.yml")
fieldsYmlPath = filepath.Join(modulesPath, module, s.Name(), "_meta/fields.yml")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above.

if err != nil {
fmt.Fprintf(os.Stderr, "Cannot generate global fields.yml for %s: %+v\n", name, err)
os.Exit(2)
if len(beatFieldsPaths) == 0 && os.SameFile(esBeatsInfo, beatInfo) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to follow what this does. Could you quickly elaborate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a community Beat does not have its ownfields.yml file, it still requires the fields coming from libbeat to generate e.g assets. In case of Elastic Beats, it's not a problem because all of them has unique fields.yml files somewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the details. Could you perhaps add a comment to the line to make sure if we refactor this in the future we don't get rid of it?

@@ -73,7 +73,7 @@ test-module: python-env update metricbeat.test
. ${PYTHON_ENV}/bin/activate && INTEGRATION_TESTS=1 nosetests tests/system/test_${MODULE}.py

.PHONY: assets
assets: libbeat_fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's hope this becomes obsolete again in the near future :-)

@@ -1,2083 +0,0 @@
- key: common
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this was a left over file that should already have been deleted before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It is a duplication of packetbeat/_meta/fields.common.yml with fields which coming from protos. Nothing used this file at all.

@ruflin ruflin merged commit 5797005 into elastic:master Jul 26, 2018
@ruflin
Copy link
Member

ruflin commented Jul 26, 2018

@kvch Thanks, could you follow up with the @elastic/apm-server team?

@jalvz
Copy link
Contributor

jalvz commented Jul 26, 2018

already aware, thnks ;)

@kvch kvch added v6.4.0 and removed needs_backport PR is waiting to be backported to other branches. labels Jul 27, 2018
kvch added a commit to kvch/beats that referenced this pull request Jul 27, 2018
This PR simplifies the process of generating a new `fields.yml` file for a Beat.
* `libbeat/fields.yml` is not a dependency anymore
* only `fields.yml` is written to disk

Changes in the asset files are due to different number of newlines in generated `fields.yml` files.

### Interface changes of generator tools
Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout.

This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets.

Closes elastic#7671
(cherry picked from commit 5797005)
@kvch kvch added the v6.5.0 label Jul 27, 2018
ruflin pushed a commit that referenced this pull request Jul 30, 2018
This PR simplifies the process of generating a new `fields.yml` file for a Beat.
* `libbeat/fields.yml` is not a dependency anymore
* only `fields.yml` is written to disk

Changes in the asset files are due to different number of newlines in generated `fields.yml` files.

### Interface changes of generator tools
Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout.

This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets.

Closes #7671
(cherry picked from commit 5797005)
kvch added a commit to kvch/beats that referenced this pull request Sep 25, 2018
This PR simplifies the process of generating a new `fields.yml` file for a Beat.
* `libbeat/fields.yml` is not a dependency anymore
* only `fields.yml` is written to disk

Changes in the asset files are due to different number of newlines in generated `fields.yml` files.

### Interface changes of generator tools
Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout.

This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets.

Closes elastic#7671
(cherry picked from commit 5797005)
kvch added a commit that referenced this pull request Sep 25, 2018
* Simplify fields.yml generator (#7713)

This PR simplifies the process of generating a new `fields.yml` file for a Beat.
* `libbeat/fields.yml` is not a dependency anymore
* only `fields.yml` is written to disk

Changes in the asset files are due to different number of newlines in generated `fields.yml` files.

### Interface changes of generator tools
Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout.

This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets.

Closes #7671
(cherry picked from commit 5797005)

* update fields of Packetbeat
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…ic#7777)

* Simplify fields.yml generator (elastic#7713)

This PR simplifies the process of generating a new `fields.yml` file for a Beat.
* `libbeat/fields.yml` is not a dependency anymore
* only `fields.yml` is written to disk

Changes in the asset files are due to different number of newlines in generated `fields.yml` files.

### Interface changes of generator tools
Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout.

This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets.

Closes elastic#7671
(cherry picked from commit ec2e022)

* update fields of Packetbeat
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Generator Related to code generators for building custom Beats or modules. libbeat review v6.4.0 v6.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants