Stress Testing GCP CM via ServiceAPI and JMeter

Let's see how my GCP hosted CM infrastructure holds up under ridiculous load.  

JMeter can be downloaded here.  Since it's a pure Java application you'll need Java installed (I'm using JRE 9, but most should use JDK 8 here).  Note that JDK 10 is not yet officially supported.

After extracting JMeter, you launch it by executing jmeter.bat from the bin directory...

2018-05-12_6-39-04.png

Every test plan should have a name, description, and a few core variables...

There are many, many possible ways to build out a test plan.  Each will have at least one thread group though, so I'll start by creating just one.  

2018-05-12_6-49-13.png

As shown below, I created a thread group devoted to creating folders.  I also set the action to be taken after a sampler error to Stop Test.  Later I'll come back and increase the thread properties, but for now I only want one thread so that I can finish the configuration.

Next I'll add an HTTP Request Sampler, as shown below...

2018-05-12_7-37-48.png

The sampler is configured to submit a JSON record definition, with values for Title and Record Type.  This is posted to the ServiceAPI endpoint for records.  It's as basic as you can get!

I'll also need an HTTP Header Manager though, so that the ServiceAPI understands I intend to work with JSON data.

2018-05-12_7-34-29.png
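Putting the sampler and header manager together, the request can be sketched outside JMeter too. The property names follow what the post uses ("RecordTypedTitle"), but the record type value, hostname, and endpoint path below are illustrative assumptions:

```python
import json

# JSON body the sampler posts; "RecordTypedTitle" carries the title and
# "RecordRecordType" names the record type. "Document Folder" is an
# illustrative assumption, not necessarily a type in your dataset.
body = {
    "RecordTypedTitle": "Stress Test Folder 001",
    "RecordRecordType": "Document Folder",
}

# Headers the HTTP Header Manager adds so the ServiceAPI reads and
# returns JSON rather than HTML.
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

payload = json.dumps(body)
print(payload)

# The equivalent POST outside JMeter (hostname and path are assumptions):
#   import requests
#   requests.post("http://cm-server/ServiceAPI/Record",
#                 data=payload, headers=headers)
```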

Lastly, I'll add a View Results Tree listener, like so...

2018-05-12_7-40-52.png

Now I can run the test plan and review the results...

2018-05-12_7-46-35.png

After fixing the issue with my JSON (I used "RecordRecordTitle" instead of "RecordTypedTitle"), I can run it again and see success.

2018-05-12_7-48-46.png

Now I'll disable the View Results Tree listener, add a few report listeners, and ramp the number of threads up to 5.  Each thread will submit 5 new folders.  With that I can see the average throughput.

2018-05-12_7-51-36.png

Next I'll increase the number of folders, tweak the ramp-up period, and delay thread creation until needed.  Then I run it and review the larger sample set.

2018-05-12_7-54-24.png
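The summary numbers a report listener shows can also be computed offline from a saved results (.jtl) file. A sketch against a few illustrative rows, assuming the default CSV columns (timeStamp and elapsed are both in milliseconds):

```python
import csv
import io

# A tiny stand-in for a JMeter .jtl results file saved as CSV.
sample_jtl = """timeStamp,elapsed,label,responseCode,success
1526100000000,120,Create Folder,201,true
1526100000500,95,Create Folder,201,true
1526100001000,110,Create Folder,201,true
1526100002000,105,Create Folder,201,true
"""

rows = list(csv.DictReader(io.StringIO(sample_jtl)))

# Wall-clock window: first request start to last response end.
start = min(int(r["timeStamp"]) for r in rows)
end = max(int(r["timeStamp"]) + int(r["elapsed"]) for r in rows)
elapsed_s = (end - start) / 1000.0

throughput = len(rows) / elapsed_s  # requests per second
avg_ms = sum(int(r["elapsed"]) for r in rows) / len(rows)
print(f"{throughput:.2f} req/s, {avg_ms:.1f} ms average")
```

For a real run, point `csv.DictReader` at the .jtl file on disk instead of the in-memory sample.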

This is anecdotal though.  I also need to monitor all of the resources within the stack.  For instance, as shown below, I can see the impact of that minor load on the CPU of the workgroup server.  I'd also want to monitor the workgroup server's RAM, disk I/O, network I/O, and event logs.  The same goes for the database server and any other components of the stack.

2018-05-12_7-58-44.png

From here the stress test plan could include logic to also create documents within the folders, run complex searches, or perform actions.  There are many other considerations before running the test plan, such as: running the plan via command line instead of GUI, scheduling the plan across multiple workstations, calculating appropriate thread count/ramp-up period based on infrastructure, and chaining multiple HTTP requests to more aptly reflect end-user actions.

For fun though I'm going to create 25,000 folders and see what happens!

2018-05-12_8-32-43.png

Using GCP to alert CM administrators when things break

It's crazy how simple it is to set this up within GCP.  You can even do this with your internally hosted CM servers (assuming you don't mind log shipping to your super secure private cloud)!  In a previous post I installed Stackdriver on my VM and showed how the logs appear within the UI.

I dare say that most of the time, when CM breaks, an error entry is generated in the application log.  Stackdriver is shipping those entries to me.  As shown below, here's an entry from CM stemming from me having (purposefully) moved a document store.

2018-04-25_22-09-37.png

Similar entries will be generated when the database goes down, when document stores are full, and when IIS resets on me.  More importantly, it's super easy to push TRIMWorkgroup log file entries into the event logs.  Also, many integrations can have their log4net logging mechanisms redirected to the event logs!
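For the log4net case, the redirect is just configuration. A hypothetical appender block is sketched below; the application name and layout pattern are placeholders for whatever the integration uses:

```xml
<!-- Hypothetical log4net configuration: routes the integration's
     ERROR-level logging to the Windows Application event log, where
     Stackdriver will pick it up. "MyCMIntegration" is a placeholder
     event source name. -->
<log4net>
  <appender name="EventLogAppender" type="log4net.Appender.EventLogAppender">
    <applicationName value="MyCMIntegration" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date %-5level %logger - %message%newline" />
    </layout>
  </appender>
  <root>
    <level value="ERROR" />
    <appender-ref ref="EventLogAppender" />
  </root>
</log4net>
```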

From the logging interface I can select Logs-based Metrics on the left and then click Create Metric...

Next I gave it an easy-to-understand name and a set of filters that drill down to just the errors from the workgroup service.  More advanced filters can be defined so that more (or less) is included, but this serves my purposes for now.
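For reference, a filter along these lines narrows a metric to error entries from the event log. The project ID and the source-name match below are placeholders, and the exact payload field names can vary by agent version, so treat this as a sketch:

```
resource.type="gce_instance"
logName="projects/my-project/logs/winevt.raw"
severity>=ERROR
jsonPayload.SourceName:"TRIM"
```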


After hitting save, I can select Create Alert from Metric...

2018-04-25_22-29-04.png

Here I just need to provide an interval that will trigger the alert...

2018-04-25_22-32-34.png

Once that's saved I can configure one or more notifications.  Here I'm sending an email, but options exist for SMS, a Slack channel post, or smoke signal.  Pretty slick.

After it was all saved I went and triggered an error by trying to open one of those missing documents.  Look what I got via email about a minute later!

2018-04-25_22-40-59.png

Time to start cranking out preset filters for customers!

Monitoring CM Cloud Instance Resources with Stackdriver

It's saving me a tremendous amount of time having Content Manager in my secure private cloud!  I'd like to monitor the environment though, and for that I'll use Stackdriver.  The free tier gives me everything I need for my current usage.  As I ramp up my implementation, though, I'll need to expand its usage, so understanding the pricing model is a must.

First things first... I need to install it on my VM using these commands:

invoke-webrequest "https://dl.google.com/cloudagents/windows/StackdriverLogging-v1-8.exe" -OutFile "StackdriverLogging-v1-8.exe";
.\"StackdriverLogging-v1-8.exe"

Then I ran through the installer as with any other application:

2018-04-24_19-51-34.png
2018-04-24_19-55-13.png

Next I flip over to my Stackdriver homepage and BAM.... everything's already done for me:

2018-04-24_19-51-56.png

Now a natural question would be "what about Content Manager logs and events?"  This can easily be done!  The logging agent supports Ruby regular expressions, as shown below.

2018-04-24_20-04-41.png
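Since the agent is fluentd under the hood, pulling a CM log file in directly amounts to adding a source block to its configuration. Everything below -- the log path, position file, tag, and the regular expression -- is a placeholder to adapt to your actual TRIMWorkgroup log location and line format:

```
<source>
  @type tail
  path "C:/HPE Content Manager/ServiceLog/TRIMWorkgroup*.log"
  pos_file "C:/GoogleStackdriverLoggingAgent/trim_workgroup.pos"
  tag trim_workgroup
  format /^(?<time>\S+ \S+)\s+(?<severity>\w+)\s+(?<message>.*)$/
</source>
```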

I can re-use the custom grok filters I created within Elasticsearch to parse CM logging sources!  A topic for another day!  For now I'll create an alerting policy to keep me in-the-know.

2018-04-24_20-08-41.png

What I find most useful here is that you can see information about the conditions you're trying to set.  It's so helpful to see some historical data for the metric I'm configuring an alert on.

2018-04-24_20-10-51.png

The rest is pretty self-explanatory!  :)