Designing a CM Dataset

This post continues my Mental Health Awareness Week project. Refer to the original post for background information about the goal of this project.

It's time to configure my empty dataset!  This means using the CM features to structure my data in a way that makes it easy to find records for a mental health facility.  To me that means I must decide: do I create boxes or not?  To decide this I must analyze my data, the records, and CM features/behavior.

My sources of data include:

  1. A list of provider facilities (in excel)
  2. Florida Department of Children and Family regulatory records
  3. Images, forms, and documents from facility websites
  4. News articles culled from Google News

Since the ultimate goal is to relate the records to the facilities, we need to store the facility meta-data somewhere.  The meta-data includes: ID number, name, address, owner, and website.  Where do I store that?  The design of CM gives me three immediate, obvious options: a location, a classification, or a record type.  I've listed them in that order because that's the progressive order of building blocks within CM.  

  • A location defines a historical person, place, or thing.  There are pre-defined properties on a location for a name, address, ID number, and website.  When a record is created the location can be attached.  Users can then find all records by searching for the usage of the location.
  • A classification is akin to the dewey decimal system: a hierarchical taxonomy used to organize information.  There are pre-defined properties for the owning location, retention schedule, and security; all of which are copied to any new record associated with a record.  If each facility is created as a classification, then users can find all records by searching within the classification.
  • A record type is the definition of a type of record (go figure!).  They can be configured to use specific locations, classifications, meta-data fields, numbering patterns, and security.  They also have various behavioral settings/options, which dictate how the system reacts in certain situations.  If each facility had it's own record type, users can find all records by searching for the specific record type.

Thinking about the features as described above makes a few things obvious to me:

  1. Locations should be used to store the name of the facility owners, the governmental bodies & agencies (federal, state, and local), and nominated users of this system.
  2. Classifications should be used to organize the types of information I'll be gathering (regulatory, facility, publications, and reference material records).
  3. Record Types should be used to structure relationships between the given records.

As the title of this post indicates, I need to focus in on that last thought: structure.  I could easily store the facility ID on each individual record; and, as a result, have a flat structure with no nesting or containment.  The record's meta-data registration form could require the entry of the facility ID, as shown below.

2017-09-14_11-44-55.png

This approach limits my ability to relate the document to a second facility though.  So this approach just wouldn't work for this situation.  Instead, I'd swap out the Facility Id field for the Container field and then add the facility ID onto the container, as shown below. I can use the alternative containment relationship (not shown here) to relate a document to multiple facilities.  

2017-09-14_11-44-55.png

I've established that each facility should have at least one facility container.  Is that enough though?  How do I distinguish between regulatory records and facility records? Classifications and record types are my two options here.  If I create a few classifications (known as Categories in the US Commons lexicon), then I can require each document attach a predefined classification.  An example of this approach is shown below.

2017-09-14_11-44-55.png

With this approach I, as the user, must pick this for each document.  That's a lot of extra work at the document level.  And as the above screenshot shows, there are going to be many different types of regulatory records.  If I don't want to have to pick it per document then I need to move that information (the classification) somewhere else.  If I move the field up in the structure that would mean adding the classification to the folder.  

2017-09-14_11-44-55.png

So now, during document import, I would find the facility container matching the type (classification) of the record.  Using the above screenshot I can see that I'd need at least three containers for each facility: regulatory, facility, and news records.  The common thing between them would be the facility Id.  Clicking on a point on the map should result in a search for all the matching containers, each containing 0+ documents.  

If I stop here and don't progress further with the design I would have a viable solution.  However, it requires me to always search in order to find all records for a given facility.  No Bueno!  I really would like to be able to find a facility and browse within it.  That means doing to the container the same thing I did to the document: move the facility Id up one level.

2017-09-14_11-44-55.png

Voila!  Each facility will be created as a box, which will contain folders.  Those folders will then contain documents.  This is the most common design approach within CM.  As the solution progresses it may be necessary to tweak the design... but for now I can move on!

National Mental Health Awareness week is coming!

In honor of the upcoming National Mental Health Awareness week, I'll be focusing my blog post on the build-out of a new SAMHSA records portal powered by Content Manager.  There are no requirements and I'm not doing this for monetary gain, so I have free reign to do whatever I want!  I want to build something that both demonstrates the flexibility of Content Manager whilst also providing functionality not yet available to the public.

My home state of Florida has what we colloquially call the Sunshine Law.  This law requires all public records to be made available to citizens.  It, unfortunately, isn't so easy to actually find these records though.  Let's see if we can make it easier.

My goal is to re-create this:

2017-09-14_11-44-55.png

This is a site run by the federal government, so it naturally doesn't have any records from my home state.  What I'd like to do though is to gather as many public records as I can for each facility.  Hopefully to include: applications, financials, complaints, and news articles.  I envision clicking on a facility and seeing all of the records I can find out in the wild.  If I can code it to have a public records request submitted for the facility, even better.

Upcoming topics include:

  • Designing a dataset -- what should I create to support the end-result? 
  • Importing from Spreadsheets -- how do I take this list of facilities and create a structure?
  • Culling google news -- what's involved with dynamically searching google news for a given facility and automatically saving articles into CM?
  • Exposing a map in Webdrawer -- how can I achieve the above image within Webdrawer?
  • Finding Complaints -- these aren't available without a public records request, so I need to build something to automate the submission of a request per facility... and then how to import those into my instance

There's so much more that can be done, but I feel this is a good topic for learning and development.  Stay tuned!