
[Kibana][Metrics] Dashboard that shows only latest results of each scan for each month #111

Closed
qmontal opened this issue Oct 18, 2018 · 12 comments

@qmontal (Contributor) commented Oct 18, 2018

We are having an issue when reporting vulnerabilities with Kibana dashboards: if you run a scan once a week, the results for that month will obviously appear four times, since every scan is fed in. When trying to get overall time-based metrics, it would be good if only the results of one scan per month were shown.

@cybergoof (Contributor)

Hi, I was just about to start thinking through this problem. Have you started looking at it?

@qmontal (Contributor, Author) commented Dec 4, 2018

Hi @cybergoof,
Not yet; this was something I wanted to do after designing the Vulnerability Standard (#81) that all modules should follow and refactoring the code, as we would then have a single Logstash config file and, ideally, would make it use the ECS standard (#97).

@cybergoof (Contributor)

I am not great with ELK aggregations, but I think that may be the way to go. However, that means really changing the dashboard, almost splitting it into two: one that shows the current state of your environment, and another that shows historical information.

With the aggregation, could we create a search that brings back just the latest scan of every host, and then build the visualizations based on that?
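
Something along those lines should be possible with a terms aggregation plus a top_hits sub-aggregation. A minimal sketch, assuming the index pattern and the asset.keyword/@timestamp fields that appear in the query later in this thread:

```
GET logstash-vulnwhisperer-*/_search
{
  "size": 0,
  "aggs": {
    "per_host": {
      "terms": { "field": "asset.keyword", "size": 1000 },
      "aggs": {
        "latest_scan": {
          "top_hits": {
            "size": 1,
            "sort": [{ "@timestamp": { "order": "desc" } }]
          }
        }
      }
    }
  }
}
```

Each bucket then holds exactly one document: the newest scan result for that host.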

@qmontal (Contributor, Author) commented Dec 5, 2018

We wouldn't need the two dashboards: you could just deploy the environment from scratch, and since all the data is pulled from the scanners, it would be downloaded and restructured into the new format in ELK.
Maintaining two dashboards would indeed be a bit of a pain, but one of the reasons for doing this is that the different scanners already have different variables, or the same variables with different corresponding values, in ELK, because the current formatting is not unified.

I am not great with ELK aggregations either, but I believe it should be able to do that. It is not directly related, but Splunk has a way to get the latest values submitted for a set of logs, so I believe ELK should also have that option. The difficult part, I guess, is getting the latest result for each month; some research is needed on that.

@elvarb commented Dec 6, 2018

It would be possible to do this with the fingerprint filter in Logstash, which can create a unique ID per scanning target: from one field, or a combination of fields, an MD5 or SHA-1 hash is created.

Then, at the end of the pipeline, two different writes to Elasticsearch need to happen (see the sketch after this list):

  1. As normal: Elasticsearch generates the document's unique ID and the event is written to the "historical" index.
  2. New write: use the generated fingerprint as the Elasticsearch document ID and write the event to a "current" index. When a document is written to Elasticsearch with the same ID as an existing document, the existing document gets overwritten.
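
A minimal Logstash sketch of that dual write; the field name, index names, and hosts here are placeholders, not VulnWhisperer's actual config:

```
filter {
  fingerprint {
    # Build a stable ID from the scan target; "asset" is an assumed
    # field name, and a combination of fields could be used instead.
    source => ["asset"]
    method => "SHA1"
    target => "[@metadata][fingerprint]"
  }
}

output {
  # 1. Historical index: Elasticsearch generates the document ID,
  #    so every scan result is kept.
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-vulnwhisperer-%{+YYYY.MM}"
  }
  # 2. Current index: the fingerprint is the document ID, so a new
  #    scan of the same target overwrites the previous document.
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "vulnwhisperer-current"
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```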

With this you can then create a current dashboard that looks at, say, the last 30 days:

  1. Targets scanned multiple times in the last 30 days will show only their most recent results.
  2. Targets scanned only once will show up as normal.
  3. Targets that have not been scanned in the last 30 days will not be shown. For example, after a server is removed from the environment, the time filter automatically phases out its old, stale data.

This also means that you now have the possibility of writing the current state to a different Elasticsearch server. In companies with a small dedicated security team and a much larger ops team, a large number of ops users querying the historical data, when most of the time they only want the most recent scans, could cause performance issues; the current index set will always be much smaller than the full historical dataset. Another benefit of this approach is that you can limit access to the historical data while allowing a larger number of people to access the current data.

@cybergoof (Contributor) commented Dec 6, 2018

Oh, um, I didn't know about that approach. Okay, now I am going to work on doing that.

You can assign me this ticket if you want. I can create the index as an optional config file?

@elvarb commented Dec 6, 2018

On second thought... since each scan contains multiple documents, you would somehow have to make sure that repeat findings are fingerprinted the same. If you only use the hostname, each target would end up as a single document, overwritten again and again.

Figuring out which fields to use is the tricky part.
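
For instance, a hedged sketch that fingerprints the combination of target and finding, so a repeat finding collides only with itself (asset and plugin_id are assumed field names, taken from the query later in this thread):

```
filter {
  fingerprint {
    # One stable ID per (target, finding) pair: a repeat finding
    # overwrites its own previous document instead of collapsing
    # the whole host into a single document.
    source => ["asset", "plugin_id"]
    concatenate_sources => true
    method => "SHA1"
    target => "[@metadata][fingerprint]"
  }
}
```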

qmontal assigned qmontal and cybergoof and unassigned qmontal on Dec 7, 2018
@qmontal (Contributor, Author) commented Dec 7, 2018

Thanks for offering, @cybergoof ^^ I was hoping for a way of filtering that directly in Kibana/Elasticsearch without needing to create more entries, but I am no ELK expert at all, so if this is the best way, we should go for it.

@elvarb how should we define that? Could we make it part of ECS (#97), as we would like to eventually use that, or should we have a custom field?

Mentioning @austin-taylor to keep him in the loop and make sure we are all aligned on it :)

@cybergoof (Contributor)

Looking at the data, timestamp + hostname can identify a unique scan of a host, with multiple documents for each vuln found. How do we group those so that only the latest GROUP is displayed as valid?

@cybergoof (Contributor)

Okay, I think I have a better one: ASSET+"_"+PLUGIN_ID as the document ID? This would make sure that every finding is only recorded once.

There are some problems with this. It will overwrite the time the scan originally detected the finding. It won't show when a vulnerability is gone. It also can't distinguish the case where a vulnerability is detected, then not detected (remediated), then detected again.
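
A minimal sketch of that document ID scheme, set directly in the Logstash Elasticsearch output via sprintf field references (the index name and hosts are placeholders):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "vulnwhisperer-current"
    # One document per finding; a rescan of the same asset/plugin
    # pair overwrites it, which is exactly the caveat noted above:
    # the original detection time is lost.
    document_id => "%{asset}_%{plugin_id}"
  }
}
```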

@qmontal (Contributor, Author) commented Dec 11, 2018

Hey @cybergoof,

Since these changes are part of the structure design, and I am not currently working on the Kibana part myself, I would be more comfortable if you agreed on the changes with @austin-taylor.

On my side, I don't think I will be able to work on the Kibana part of my plans for some time, as I want to redesign the VulnWhisperer standard and make all modules follow it, so these changes will be tracked apart from the master branch until everything is changed and properly tested. Apologies for not being helpful with the Kibana part at the moment.

Thanks for your help!

@cybergoof (Contributor)

First, I am really, really bad at the Elastic query language. It just doesn't make sense to me.

However, I think this aggregation demonstrates what I am talking about. Running this query returns the earliest time that an asset had a plugin fire; changing asc to desc gives the latest instead. Since script aggregations can't be used in Kibana visualizations, I think this would have to be a fingerprint. I can test it out, but I would like someone else to evaluate it.

```
GET logstash-vulnwhisperer-2018*/_search
{
  "size": 0,
  "aggs": {
    "2": {
      "terms": {
        "script": {
          "source": "doc['asset.keyword'].value+'-'+doc['plugin_id'].value",
          "lang": "painless"
        },
        "size": 300,
        "order": {
          "_term": "asc"
        }
      },
      "aggs": {
        "1": {
          "top_hits": {
            "docvalue_fields": [
              "error"
            ],
            "size": 1,
            "sort": [
              {
                "@timestamp": {
                  "order": "asc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
```
