K12 Enterprise: Business Continuity and Disaster Recovery Planning
By Steve Langford, CIO, Beaverton School District
Restoration of systems and data took weeks of work by staff in IT, HR and the Business Office. One key learning from our disaster was that while IT had an existing Disaster Recovery (DR) plan, it was outdated and no longer a viable resource documenting the steps to recover data and systems from a catastrophic data center event. Additionally, other departments did not have plans to address their continued operations in the event of a disaster, pandemic, or loss of critical technology systems.
After passage of a bond including investments for technology modernization, we had the financial resources needed to address and improve our Disaster Recovery plan. Rather than focus only on IT systems and services, we decided to expand the scope beyond a DR plan within IT to include Business Continuity Planning (BCP), encompassing all departments and schools.
A Business Continuity/Disaster Recovery plan looks across the organization into all learning and business units to create plans for operation in the event of a disaster. While the Disaster Recovery portion of the plan focuses on how IT will continue to provide services, Business Continuity planning examines the practices and functions of schools and departments to ensure continued operation during a disaster with potential impact to work facilities, staff availability, or access to data and systems.
Business Continuity and Disaster Recovery planning is complex and not an undertaking we were willing to begin without outside expertise. We engaged a firm specializing in the creation of Business Continuity and Disaster Recovery plans. As this work would involve many staff from across the organization, an early step was to work with the senior leadership team to build understanding of, and support for, the need and value-add of the process.
Our first steps involved meetings with all departments and schools. The purpose of this Business Impact Analysis (BIA) phase was to identify all existing processes for accomplishing tasks, identify the systems utilized, and document the work done by staff.
The goals were to document all processes and begin to understand the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) for the systems staff rely upon to do their work.
The RTO is the amount of time agreed upon to restore a service or function. It is the amount of time a service can be unavailable without significant impact to the organization. The RPO refers to the amount of acceptable data loss to be incurred due to a disaster, usually expressed as a span of time before the event. Both the RPO and RTO vary with each service provided by IT. For example, our organization determined that the RTO for core services such as finance and human resources staffing information was very short, meaning the organization could not be without these services for more than a few hours. At the other end of the spectrum, some reporting needs, including specific data validation and facilities reports, could be inaccessible for many weeks before significantly impacting the organization.
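For readers who track these objectives in a spreadsheet or script, the idea can be sketched in a few lines of Python. This is a minimal illustration, not the tool we used: the class name, the service names, and every number below are hypothetical stand-ins chosen only to mirror the "few hours" versus "many weeks" contrast described above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ServiceObjective:
    name: str
    rto: timedelta  # longest outage the organization agreed it can tolerate
    rpo: timedelta  # longest window of data the organization can afford to lose


# Hypothetical values for illustration; real targets come out of the BIA.
objectives = [
    ServiceObjective("facilities reports", rto=timedelta(weeks=4), rpo=timedelta(days=7)),
    ServiceObjective("finance", rto=timedelta(hours=4), rpo=timedelta(hours=1)),
    ServiceObjective("hr staffing", rto=timedelta(hours=4), rpo=timedelta(hours=1)),
]


def recovery_order(items):
    """Return service names ordered so the shortest RTO is restored first."""
    return [s.name for s in sorted(items, key=lambda s: (s.rto, s.name))]


print(recovery_order(objectives))
```

Sorting by RTO gives a first draft of the order in which services would be brought back online; ties are broken alphabetically here only to keep the output stable.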
This level of detail informed the next step in our planning process, which was to conduct a gap assessment to evaluate current service levels against the desired states uncovered in the Business Impact Analysis phase. IT staff needed to evaluate actual recovery times against the agreed RTOs, and backup strategies and backup frequency against the agreed RPOs, to ensure both objectives could be met.
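The comparison at the heart of a gap assessment is simple arithmetic, and a short Python sketch may make it concrete. It rests on one stated assumption: that worst-case data loss is roughly the interval between backups. The function name and all figures are hypothetical and do not describe our actual systems.

```python
from datetime import timedelta


def assess_gap(target_rto, measured_recovery, target_rpo, backup_interval):
    """Return how far current capability falls short of each objective.

    A gap of zero means the objective is already met.
    Assumption: worst-case data loss is about one backup interval.
    """
    zero = timedelta(0)
    return {
        "rto_gap": max(measured_recovery - target_rto, zero),
        "rpo_gap": max(backup_interval - target_rpo, zero),
    }


# Hypothetical service: needs to be back within 4 hours with at most 1 hour
# of lost data, but today takes a day to restore from nightly backups.
result = assess_gap(
    target_rto=timedelta(hours=4),
    measured_recovery=timedelta(hours=24),
    target_rpo=timedelta(hours=1),
    backup_interval=timedelta(hours=24),
)
print(result["rto_gap"], result["rpo_gap"])
```

Any nonzero gap becomes an item for the recovery plan, for example faster restore procedures to close an RTO gap or more frequent backups to close an RPO gap.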
While the gap analysis is primarily an IT-specific exercise, departments and schools were actively engaged, simultaneously developing Business Continuity Plans. These plans contain forms, workflows, tasks, vendor and other contact information needed by a school or department in the event of a disaster. Within the IT department, the task was to create a recovery plan addressing any gaps between existing capabilities and the RTO/RPO targets.
The senior leadership team was again engaged to review the plans and their responsibilities in the event of a disaster. In addition to specific logistical duties around disaster declaration and facility needs, they are also charged with overall management and upkeep of the plans.
There were a number of lessons learned in our exercise of leading a K12 enterprise through the Business Continuity/Disaster Recovery process. The first is that while it can be challenging to convince department leaders to allocate staff time to this process, the knowledge gained from documenting processes and expectations of systems availability is extremely valuable. Agreement upon needs and priorities helps staff understand how long systems could be offline, the order in which services will be brought back online, and what alternate workflow they will use to accomplish their work. Our move from just an IT-centric Disaster Recovery plan to a comprehensive Business Continuity Plan increased awareness and understanding throughout the organization.
Another important lesson from the process was that as IT staff worked through the Business Continuity Planning with staff from other departments, they gained a deeper understanding of the work and needs across the organization. This increased understanding of our business will help IT staff assist in a more strategic manner as they work with staff to continuously improve workflow and processes in schools and departments.
Business Continuity/Disaster Recovery planning is a complex and intensive process for an organization. Many K12 staff in schools and departments are overwhelmed with work, and asking them to engage in this exercise is no small undertaking. Yet for all of the challenges with working through the BCP/DR process, the value-add of having documented processes and continuity plans for schools and departments offers considerable protection to the organization and drastically increases the likelihood of the continued education of our students should a disaster occur.