Monitoring CM with ELK

In this post I'll show how to use the ELK components to build an effective, free way to monitor a Content Manager implementation of any size.  By the end of this post we'll have a rudimentary dashboard like the one shown below, which highlights errors reported in the Windows event logs (including those from CM).

Kibana dashboard using Winlogbeat and Filebeat as sources

To make all this magic work I'll need to place beats in various locations.  On all of my servers I'll install both Filebeat and Winlogbeat.  I'll use Winlogbeat to monitor Windows instrumentation, which includes the Content Manager entries written to the system event logs.  I'll use Filebeat to monitor the Content Manager specific log files and the audit logs.

To start, I set up one server with both beats installed and configured.  I'll zip that entire folder structure and copy it to each server in my environment, then register the two beats as Windows services without modifying the configuration.

2017-12-13_21-25-55.png
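
If you're using the ZIP distributions, registering the beats just means running the install-service scripts that ship inside each beat's folder.  A rough sketch from an elevated PowerShell prompt (the unzip paths are placeholders for wherever you copied the folder structure):

    # Run on each server after copying the zipped folder structure across.
    # 'C:\Beats\...' is a placeholder for wherever the beats were unzipped.
    cd 'C:\Beats\winlogbeat'
    PowerShell.exe -ExecutionPolicy UnRestricted -File .\install-service-winlogbeat.ps1

    cd 'C:\Beats\filebeat'
    PowerShell.exe -ExecutionPolicy UnRestricted -File .\install-service-filebeat.ps1

    # Start the services once you're happy with the configuration.
    Start-Service winlogbeat
    Start-Service filebeat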

For now I'm going to focus on Winlogbeat.  I modified the configuration file so that it includes a tag I can use as a filter in saved searches, as well as the addresses of my Kibana and Elasticsearch hosts.

You'd use proper server host addresses
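
As a rough sketch (not a copy of the screenshot), the relevant parts of winlogbeat.yml look something like this; the event log names, the tag, and the host addresses are all placeholders you'd swap for your own, and the exact layout varies a little by Winlogbeat version:

    winlogbeat.event_logs:
      - name: Application
      - name: System
      - name: Security

    # A tag I can filter on later in Kibana saved searches
    tags: ["content-manager"]

    setup.kibana:
      host: "http://kibana.example.local:5601"

    output.elasticsearch:
      hosts: ["http://elastic.example.local:9200"]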

With this saved on three of my servers, I switch over to Kibana and start configuring my dashboards.  A dashboard is composed of several visuals, and each visual is populated with content from Elasticsearch based on the query behind it.  So to start, I create several saved searches.

Here's a saved search that gives me just the errors from any of the servers.  I'll name it "Windows Log Error".

2017-12-13_21-47-34.png

Here's a saved search I'll name "CM Windows Log Errors".  These events come from Winlogbeat on all of the CM servers where I've installed the agent (but not from several other servers that sit outside of CM).

2017-12-13_21-43-25.png
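
The exact query syntax depends on the field names your Winlogbeat version ships, but with the default fields and the tag from my configuration, the two saved searches boil down to something like this in the Kibana search bar:

    Windows Log Error:      level:Error
    CM Windows Log Errors:  level:Error AND tags:"content-manager"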

Next I'll create a visual that tells me how many errors I've got overall (based on the first saved search).

2017-12-14_9-37-03.png

I then pick my saved search...

2017-12-13_21-57-44.png

Then configure it to split out by computer name.

2017-12-13_21-58-35.png

Then I click save and name it "Error Counts".  Next I create a data table for the error sources, as shown below.

2017-12-13_22-01-44.png

Then a pie chart based on all of the audit events....

2017-12-13_22-02-32.png

Next I create a new dashboard and place the three visuals onto it.  Then I add one saved search: the CM-specific Windows error logs.  On a dashboard this renders much like a data table, but I thought I'd show it anyway.

2017-12-13_22-04-12.png

The last step is to save the dashboard.  With the box checked, anyone viewing the dashboard can change its content by adjusting the time range (visible on almost every page of Kibana).

2017-12-13_21-42-01.png

Now I can access this dashboard from the list of dashboards.  Keep in mind that you can tag your development servers differently from production, so those might warrant a dashboard of their own.  You can also share dashboards with others.

2017-12-13_22-05-17.png

That gives me the final dashboard shown below.  Note the top-right corner of the image, where I've changed the time frame to the previous year, and the filter I've added to exclude one server in particular.  It's amazing what can be built.

2017-12-13_22-12-38.png

ELK your CM audit logs

In this post I'll show how I've leveraged Elasticsearch, Kibana, and Filebeat to build a dashboard based on my CM audit logs.  I've created an Ubuntu server where I've installed all of the ELK components (if you don't have access to one, you don't need one to play along... just install everything locally).  This segregates it from Content Manager and lets me work with it independently.

First I installed Filebeat onto the server generating my audit logs.  I configured it to search within the Audit Log output folder and to ignore the first line of each file.  You can see this configuration below.

2017-12-12_17-03-58.png
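
I won't reproduce the screenshot exactly, but a minimal prospector along these lines would do the same job.  The path and the header text are placeholders, and using exclude_lines to match the column-heading row is just one way to drop the first line of each file (key names vary slightly between Filebeat 5.x and 6.x):

    filebeat.prospectors:
      - input_type: log
        paths:
          - 'D:\CM\AuditLogs\*.log'             # placeholder: the audit log output folder
        exclude_lines: ['^Event Description']   # placeholder: skip the header row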

I completed the configuration by pointing the output directly at Elasticsearch.  Then I started Filebeat and let it run.

2017-12-13_11-27-48.png

When I inspect the index I can see that each line of the file has been stuffed into a single message field.

2017-12-12_18-22-22.png

This doesn't help me; I need to break each line into separate fields.  The old audit log viewer shows me the order of the columns and what data each one contains, so I used that to define a regular expression to extract the properties.

2017-12-12_20-55-20.png

I used a grok pipeline processor to implement the regular expression, transform some of the values, and then remove the message field (so that it doesn't confuse things later).  The processor I came up with is as follows:

    {
        "description": "Convert Content Manager Offline Audit Log data to indexed data",
        "processors": [
            {
                "grok": {
                    "field": "message",
                    "patterns": [ "%{DATA:EventDescription}\t%{DATA:EventTime}\t%{DATA:User}\t%{DATA:EventObject}\t%{DATA:EventComputer}\t%{NUMBER:UserUri}\t%{DATA:TimeSource}\t%{DATA:TimeServer}\t%{DATA:Owner}\t%{DATA:RelatedItem}\t%{DATA:Comment}\t%{DATA:ExtraDetails}\t%{NUMBER:EventId}\t%{NUMBER:ObjectId}\t%{NUMBER:ObjectUri}\t%{NUMBER:RelatedId}\t%{NUMBER:RelatedUri}\t%{DATA:ServerIp}\t%{DATA:ClientIp}$" ]
                }
            },
            {
                "convert": {
                    "field": "UserUri",
                    "type": "integer"
                }
            },
            {
                "convert": {
                    "field": "ObjectId",
                    "type": "integer"
                }
            },
            {
                "convert": {
                    "field": "ObjectUri",
                    "type": "integer"
                }
            },
            {
                "convert": {
                    "field": "RelatedId",
                    "type": "integer"
                }
            },
            {
                "convert": {
                    "field": "RelatedUri",
                    "type": "integer"
                }
            },
            {
                "remove": {
                    "field": "message"
                }
            }
        ],
        "on_failure": [
            {
                "set": {
                    "field": "error",
                    "value": "{{ _ingest.on_failure_message }}"
                }
            }
        ]
    }

Now when I test the pipeline I can see that the audit log file data is being parsed correctly.  To test a pipeline you submit the processor definition along with some sample data, and Elasticsearch returns the result of running the processors.
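
If you haven't used it before, testing is done through the _simulate endpoint: you POST the pipeline definition together with one or more sample documents and get back what the processors would produce.  Roughly (the pipeline body is the definition above, and the sample message is just an illustrative tab-delimited audit line):

    POST /_ingest/pipeline/_simulate
    {
      "pipeline": { ...the pipeline definition shown above... },
      "docs": [
        { "_source": { "message": "Electronic Document Viewed\t12/12/2017 10:15:00 AM\tjdoe\t...remaining tab-delimited fields..." } }
      ]
    }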

Result of posting to the pipeline simulator

As shown below, I used Postman to submit the validated pipeline to Elasticsearch and name it "cm-audit"...

2017-12-12_23-07-04.png
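
The request itself is nothing special; from Postman, curl, or the Kibana Dev Tools console it's just a PUT of the validated definition to the _ingest endpoint (the host here is a placeholder):

    PUT http://elastic.example.local:9200/_ingest/pipeline/cm-audit
    Content-Type: application/json

    { ...the validated pipeline definition from above... }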

Now I need to go back to my Content Manager server and update the Filebeat configuration.  Here I'll need to direct the output into the newly created pipeline, which I do by adding a pipeline setting to the Elasticsearch output options.

2017-12-12_21-58-51.png
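
In filebeat.yml that amounts to one extra line under the Elasticsearch output (the host is a placeholder):

    output.elasticsearch:
      hosts: ["http://elastic.example.local:9200"]
      pipeline: "cm-audit"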

Next I stop Filebeat and delete the local registry file in ProgramData (this lets me re-process the audit log files).

2017-12-13_11-30-22.png
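
Roughly, from an elevated PowerShell prompt, and assuming Filebeat was installed as a service with its default data path:

    Stop-Service filebeat
    # The registry file records how far Filebeat has read into each file;
    # removing it makes Filebeat ship the audit logs again from the beginning.
    Remove-Item 'C:\ProgramData\filebeat\registry'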

Before starting it back up though, I need to delete the existing index.

Delete action in the Elasticsearch Head
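
If you don't have the Head plugin handy, the equivalent API call is a DELETE against the index pattern (assuming the default filebeat-* index names; the host is a placeholder):

    DELETE http://elastic.example.local:9200/filebeat-*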

Now I can start it back up and let it populate Elasticsearch.  If I check the index via Kibana I can see that my custom fields for the audit logs are there.

2017-12-12_22-30-48.png

The last step is to create some visualizations and place them onto a dashboard.  In my instance I've also set up Winlogbeat (which I'll cover in another post some other day), so I have lots of information I can use in my dashboards.  For a quick example I'll show the breakdown of event types and users.

2017-12-12_22-28-29.png