Scheduling a Routine Json Data Import via DataPort

I need to schedule my Json webservice daily import!  

To start I right-clicked on my start button and selected Computer Management.  I then right-clicked on the top node, selected Connect to another Computer, and entered the fully qualified name of my Content Manager workgroup server.  I'm putting the routine onto the server since they typically have the appropriate power saving, backup/recovery, and management settings I need for monitoring.

2017-09-22_15-45-49.png

Once connected I clicked into the system tools, into the task schedule, and then created a new basic task.  I should have first reviewed the existing set of tasks (so as to make sure I didn't already have one defined for this task), but I'll skip that for now.  

2017-09-22_15-45-49.png

First I enter a name (using whatever naming pattern the environment requires, remembering this is on the server and not local) and then provide a reasonable description.  

Task Scheduling Wizard - Name and Description

Task Scheduling Wizard - Name and Description

A setting of daily is sufficient for my frequency...

Task Scheduling Wizard - Trigger Settings

Task Scheduling Wizard - Trigger Settings

I adjusted the time of the day and then clicked next...

2017-09-22_15-45-49.png

Since I couldn't change the action type, I clicked Next right past the "action type" selection wizard page (not shown here).  I then clicked browse and navigated to the TRIMDataPort.exe file located in the Content Manager installation directory.  This works for me because I've got DataPort installed in the exact same location on both my workstation and server.  

Task Scheduling Wizard - Action Program Settings

Task Scheduling Wizard - Action Program Settings

For the of the program I used the following value:

"\\wg1.cmramble.local\dataPort\samhsa\florida facilities.dpproj"

This is a share I created off my workgroup server.  It holds my data port project, the project's Json configuration source, and a logging folder.  Once everything's entered, I clicked Next and double-checked everything on the final wizard step. I need to make sure this executes with the appropriate identity and permissions, so I also checked the box towards the bottom before clicking Finish.

Task Scheduling Wizard - Finish

Task Scheduling Wizard - Finish

This brings up the advanced properties of the task just created by the wizard.  The wizard sets the security options to "Run only when user is logged on", which I will change to "Run whether a user is logged on or not" and I also check "Run with highest privileges".  After clicking OK I have to enter my user name/password again.

Task Properties - Security Options and Credentials

Task Properties - Security Options and Credentials

My Json data source will now be imported every day at 8pm.  Next I need to get my web interface running so I can see these boxes, add the map, and determine how best to scour for electronic records.

Sourcing from a Webservice

This post continues my Mental Health Awareness Week project. Refer to the original post for background information about the goal of this project.

After deciding to build a box for each facility, I realized I never actually had a unique ID column in my source spreadsheet.  That means I can't maintain a routine import/update from that spreadsheet.  So I need to find an alternative source of meta-data.

Pressing F12 in Chrome lets me inspect what's happening when I use the SAMHSA site.  Reloading the page and running the search for Florida again lets me see all of the end-points (web URLs) this User Interface touched.  Hovering over the entries gives me insight into the nature of the network traffic..

2017-09-20_8-52-59.png

I can see with this one entry that Chrome received back a listing of facilities in Json form.  It was made over a secure connection and took 534 ms seconds to retrieve.  The response tab provides me with the full content of the request's response, which I copied to my clipboard.

2017-09-20_8-58-51.png

I then opened the Online JSON Viewer website in a new tab, pasted the response, and clicked the viewer tab.  It gave me a good, quick visualization of this data.

2017-09-20_9-04-43.png

I've found my unique ID in the "frid" property of each "row" of the results.  I also can see that there are 946 results spanning across 32 pages of results.  I can also see that there are properties on the rows which are misleading, such as: _irow, _rclass, rnum.  I should really take note of that or ensure my solution allows me to exclude the visibility of those properties. 

I copied the URL for the original network event and removed the paging feature.  That results in a single web request for all the Florida facilities being returned in 4.3 seconds.  This will become my source of facility boxes (for the foreseeable future).

After flipping over to DataPort I realize I don't have any way to routinely import from Json.  No worries!  I took my last DataPort ImportDataFormatter and whipped up a new Json importer.

2017-09-21_21-08-07.png

I've now got some much needed functionality:

  1. I can select Json as a potential import source
  2. I can specify a webservice end-point
  3. I can specify a path to the records in the Json data
  4. I can build up a transformation map between the Json data and fields in Content Manager
  5. I can prevent the exposure of properties from the Json data
  6. I can format data to match CM requirements

When I run it via DataPort it behaves as expected....

2017-09-21_21-08-06.png

The boxes created during the initial import will now be updated or added to, without me having to do anything.  Furthermore, I can now import data from any Spreadsheet or Webservice I come across.  And all this cool DataPort work has me thinking bigger.

I'm now ready to start finding records to place within these boxes.  Before I start visiting websites of regulators, facility operators, and governmental bodies, I worry how I'm going to capture all the things I come across.  I might need some new tools to help me import in bold new ways......

 

Importing Spreadsheets (regardless of format)

This post continues my Mental Health Awareness Week project. Refer to the original post for background information about the goal of this project.

So now I have a very rudimentary dataset designed and ready to be populated with data.  I've also got a spreadsheet from the SAMHSA website, which contains all of the facility's meta-data.  There's just one small hitch for me personally: I don't have Excel, just Google Sheets.  I can upload it to Google Sheets, export it to a Tab Delimited file, and then import that file via DataPort.  

Later on down the track I'll receiving additional spreadsheets from different sources, so this work process actually kind of sucks.  I need a way to import a spreadsheet into Content Manager, but without needing to convert files before importation.  Also, who knows what format the files might be in.... Excel 2007, Excel 2013, OFD, CSV, etc.  

In my first post on this blog I created a custom import data formatter that automatically added a Byte Order Mark to Tab delimited files.  It was crazy easy to create that thing.  The same approach I took there can be taken here: build something within the DataPort framework that solves my problem.  

So I took 30 minutes and created a new spreadsheet data import formatter by using the OpenXml and Spreadsheet Light libraries.  This gives me a new option when using DataPort, as shown in the image below.

2017-09-14_11-44-55.png

After I've selected the newly available File Format of Spreadsheet and picked a file from my computer, DataPort shows me all of the available columns of data (this does require that the file have a header row with column names).  I can then match my columns just like I would for the out-of-the-box Tab Delimited Data Formatter.  Again, this file can be in any spreadsheet format!

2017-09-14_11-44-55.png

Once executed, DataPort will even show me exactly how many rows and columns were in the spreadsheet.  It then processes it just like it would for a tab delimited file.  The image below shows what the output looks like in DataPort.

2017-09-14_11-44-55.png

Yay!  I've not got some facility boxes in my CM instance.  :)

2017-09-14_11-44-55.png