Top Secret: A Guide to Implementing Data Classification in K-12 Environments

When I was 12 years old, at a family dinner at my best friend’s house, everyone went around the table talking about their day and what they did, except for my friend’s father. Curious why he hadn’t said anything, I asked him what he did for work and how it was going. After a short laugh from everyone else at the table and a brief pause, he simply said, “It’s classified.” What I didn’t know, and wouldn’t know for another 13 years, is that my best friend’s dad had retired from a career as a mechanical engineer at GE Aviation. Specifically, his focus was the development and testing of new aircraft jet engines, mostly for the military. This was my first exposure to data classification and honestly, at the time, it didn’t make any sense. After all, I was a 12-year-old kid at the dinner table; what was I going to do with that information? Now, working in cybersecurity, I understand the use and importance of data classification. In this article, I will cover what data classification is and how we can best use it in our K-12 environments.

In its simplest form, data classification is the process of breaking down and organizing data into different categories so it can be efficiently and effectively protected. The process makes data searchable and trackable, helps remove duplicated data, and supports the CIA triad. Other than sounding like a gang the Teenage Mutant Ninja Turtles would encounter, the CIA triad stands for Confidentiality, Integrity, and Availability. Confidentiality refers to keeping data private or secret. Integrity is about ensuring that data has not been tampered with and, therefore, can be trusted. Availability ensures that authorized users have timely, reliable access to resources when they are needed. The CIA triad is a very important concept in cybersecurity as a whole, and especially in understanding and using data classification.

To classify our data we can use four different categories to separate that data based on the CIA triad. It is important to note that data classification categories can vary from industry to industry and even company to company, but they can usually be broken up into the following four general categories: Public, Internal, Confidential, and Restricted. An article from Spirion, a leading data management company in the information security space, does an excellent job of spelling out what each category means and provides real-world examples of each. Public refers to information that is freely available and accessible to the public without any restrictions or adverse consequences, such as marketing material, contact information, customer service contracts, and price lists. Internal refers to data with low security requirements that is nevertheless not meant for public disclosure, such as client communications, sales playbooks, and organizational charts. Unauthorized disclosure of such information can lead to short-term embarrassment and loss of competitive advantage. Confidential refers to sensitive data that, if compromised, could negatively impact operations, including harming the company, its customers, partners, or employees. Examples include vendor contracts, employee reviews and salaries, and customer information. Restricted refers to highly sensitive corporate data that, if compromised, could put the organization at financial, legal, regulatory, and reputational risk. Examples include customers’ PII, PHI, and credit card information.
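For readers who like to see an idea in code, here is a minimal Python sketch of the four categories as an ordered scale. The names `Classification`, `EXAMPLES`, and `more_sensitive` are made up for illustration, and the numeric values only express relative sensitivity; they do not come from Spirion or any standard.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four general categories, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Example labels taken from the category descriptions above.
EXAMPLES = {
    "price list": Classification.PUBLIC,
    "organizational chart": Classification.INTERNAL,
    "employee salaries": Classification.CONFIDENTIAL,
    "credit card information": Classification.RESTRICTED,
}


def more_sensitive(a: str, b: str) -> str:
    """Return whichever of two labeled items carries the higher classification."""
    return a if EXAMPLES[a] >= EXAMPLES[b] else b
```

Using an ordered type means a question like "is this at least Confidential?" becomes a simple comparison rather than a string match.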

Each one of these categories also has its own set of rules and guidelines based on data sensitivity and potential impact if compromised. One very important guideline is who can access each category of data. This is typically governed by Access Control Lists, or ACLs, which specify who can access what data and when they have permission to access it. For schools, these ACLs are often broken down by position and tied to each person’s job title and description. For example, a teacher may only be able to view internal data, while a school principal may be able to view confidential data. However, it is important to remember that holding a higher position or title does not automatically give a person the right to access more data than anyone else. This idea is called the principle of least privilege: a user is given the minimum level of access or permissions needed to perform their job functions. So although a Superintendent may have access to some restricted data, that does not mean they should have permission to access all restricted data, such as payroll information.
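As a rough sketch of how an ACL and least privilege fit together, the Python below allows access only when two things are true: the person’s role appears on the dataset’s ACL, and that person has been explicitly granted the dataset for their job. The dataset names, roles, and the `can_access` helper are all hypothetical, and anything not listed is denied by default.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    role: str
    # Datasets this person has been explicitly granted for their job duties.
    grants: set = field(default_factory=set)


# Hypothetical ACL: dataset name -> roles that may be granted access at all.
ACL = {
    "staff_directory": {"teacher", "principal", "superintendent"},
    "employee_reviews": {"principal", "superintendent"},
    "payroll_records": {"payroll_clerk"},
}


def can_access(user: User, dataset: str) -> bool:
    """Allow access only if the role is on the ACL AND the user holds a grant."""
    role_allowed = user.role in ACL.get(dataset, set())  # unknown dataset -> deny
    return role_allowed and dataset in user.grants


teacher = User("Example Teacher", "teacher", {"staff_directory"})
superintendent = User("Example Superintendent", "superintendent",
                      {"staff_directory", "employee_reviews"})

print(can_access(teacher, "staff_directory"))         # True
print(can_access(teacher, "employee_reviews"))        # False
print(can_access(superintendent, "payroll_records"))  # False
```

Note that the Superintendent is denied payroll records despite the senior title, which is the least-privilege point made above.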

Before we start stamping everything with a big red TOP SECRET stamp we have to start asking ourselves some basic questions about data in our K-12 environment. The data classification policy should consider the following questions: Which person, organization, or program created and/or owns the information? Which organizational unit has the most information about the content and context of the information? Who is responsible for the integrity and accuracy of the data? Where is the information stored? Is the information subject to any regulations or compliance standards, and what are the penalties associated with non-compliance? All of these questions are critical to answer before any data classification can formally begin. Once these questions are answered, they will guide us as we start the process of truly classifying our data. 
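One way to keep track of those answers is a simple inventory record per dataset. The Python sketch below is only an illustration: the `InventoryRecord` fields are my own shorthand for the questions above, and `None` is assumed to mean "not answered yet."

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class InventoryRecord:
    """One row of a hypothetical data inventory; None means 'not yet answered'."""
    dataset: str
    owner: Optional[str] = None                    # who created and/or owns it
    most_knowledgeable_unit: Optional[str] = None  # who knows its content and context
    integrity_steward: Optional[str] = None        # who answers for integrity and accuracy
    storage_location: Optional[str] = None         # where it is stored
    regulations: Optional[list] = None             # applicable rules; [] means none apply


def unanswered(record: InventoryRecord) -> list:
    """List the questions still open for this dataset."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


record = InventoryRecord("bus routes", owner="Transportation department")
print(unanswered(record))
```

A dataset would only move on to formal classification once `unanswered` comes back empty.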

To help us place the data into the appropriate categories we can use a chart created by Carnegie Mellon University. It is based on the CIA triad and the potential impact if one or more elements of the triad are compromised, with each impact rated as limited, serious, or severe. The chart reads as follows:

Confidentiality: The unauthorized disclosure of information could be expected to have a limited, serious, or severe adverse effect on organizational operations, organizational assets, or individuals.

Integrity: The unauthorized modification or destruction of information could be expected to have a limited, serious, or severe adverse effect on organizational operations, organizational assets, or individuals.

Availability: The disruption of access to or use of information or an information system could be expected to have a limited, serious, or severe adverse effect on organizational operations, organizational assets, or individuals.

So all this information is great in theory, but how do we apply it to our districts? Let’s look at some examples to better understand how this classification works so that we can apply it to all data across the organization. For our first example, we will examine detailed disciplinary policies and procedures. Generally, most policies and procedures are open for public viewing and are often even given to students and/or staff in some written form. We would tend to classify these as Public, but employee handbooks usually also contain more detailed procedures for administrators that spell out additional guidelines on how to handle behaviors and incidents, or even when to escalate an event to another administrator or a third party such as law enforcement. If we examine these policies and procedures with our impact chart, we can determine that the unauthorized disclosure of this data could have a limited impact on the district: it may cause short-term embarrassment, but having copies of it in circulation does not affect day-to-day proceedings. The unauthorized modification of this data could have a serious impact on the district, as altered procedures could cause confusion during an incident or lead to information being improperly disclosed to another party. Finally, the disruption of access to this data could have a serious impact on the district if an incident does happen and the proper steps are not taken. Evaluating these impact ratings against the proposed scenarios, we would more than likely categorize this information as Confidential and take the necessary steps to protect the data.

Another example to look at is employee payment and banking information. Just taking the data at face value, without diving into any specific scenarios, we can see that the unauthorized disclosure of this data could have a severe impact on the individual, the unauthorized modification of this data could also have a severe impact on the individual, and the disruption of access to this data could have a severe impact on the individual. From these impact ratings, we can say that this data falls under the Restricted category. We would then take the necessary steps to protect the data and restrict access to only those who deal with it directly, such as the Payroll department.
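The two examples above can be sketched in a few lines of Python. This is one possible rule, not one taken from the Carnegie Mellon chart: it assumes the category follows the worst of the three impact ratings, and it adds a hypothetical "none" rating for data whose compromise would cause no harm. Your district’s policy may weigh the ratings differently.

```python
# Impact ratings from lowest to highest; "none" is an added assumption.
IMPACT_ORDER = ["none", "limited", "serious", "severe"]

# Assumed mapping from the worst impact rating to a category.
CATEGORY_FOR_WORST_IMPACT = {
    "none": "Public",
    "limited": "Internal",
    "serious": "Confidential",
    "severe": "Restricted",
}


def classify(confidentiality: str, integrity: str, availability: str) -> str:
    """Pick the category that matches the worst of the three impact ratings."""
    worst = max((confidentiality, integrity, availability), key=IMPACT_ORDER.index)
    return CATEGORY_FOR_WORST_IMPACT[worst]


# Administrator disciplinary procedures: limited, serious, serious
print(classify("limited", "serious", "serious"))  # Confidential

# Employee payment and banking information: severe across the board
print(classify("severe", "severe", "severe"))     # Restricted
```

Taking the worst rating is a deliberately cautious choice: a dataset that is harmless to disclose but damaging to lose still gets the stronger protection.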

So now it’s time to evaluate your district! Using the techniques we have talked about in this article, you can start the process of classifying the data in your district with a good understanding of the basics. This can be a very big undertaking, and it is highly advised, if not critical, that you work with district administration and leaders to build not only your data classification model but also the policies and procedures around how existing data is classified, what to do with new, unclassified data, and the ACLs that exist inside your district. Building data classification into your district’s operations will not only help with day-to-day operations but will also strengthen your district’s cybersecurity posture now and into the future.
