Building a Data-Centric Ecosystem
Michael Thieme, Senior Advisor to the Deputy Director for IT and Operations, U.S. Census Bureau
The U.S. Census Bureau has historically helped answer both simple questions like “What’s the population of Utah?” and more complex questions like “How are declining business start-up rates related to living standards?” We have done this by conducting censuses and surveys and publishing the results. But in these challenging times, U.S. residents hesitate to respond to surveys. Many are reluctant to open the front door or answer a phone call or text from an unfamiliar number. These are bread-and-butter issues for a survey-taking agency. Our changing culture and rapid changes in data and technology tell us that censuses and surveys alone, while still critical, can no longer answer society’s questions completely or quickly enough to satisfy the modern appetite for information. In this article, I describe modernization efforts underway at the Census Bureau that combine data science with traditional survey methods to diversify our data products and place data at the center of our approach.
Building a Data-centric Ecosystem that Handles the Change
To shift in the direction of a data and data product focus, the Census Bureau has created four integrated enterprise-wide initiatives – the four pillars of the new ecosystem:
1. The Enterprise Data Lake (EDL) – The central hub of our modernization efforts from a cloud computing and data processing perspective.
2. Frames – Provides a means for maintaining and updating the inventory of addresses, jobs, businesses, and other linked data.
3. Data Ingest and Collection for the Enterprise (DICE) – Provides a modern platform for both field data collection and data ingest.
4. Census Enterprise Dissemination Services and Consumer Innovation (CEDSCI) – Provides the primary platform to serve up statistical data to the public.
The Census Operations and Data Ecosystem
We are integrating the four pillars described above into a unified enterprise approach to doing business that we call the Census Operations and Data Ecosystem (CODE). CODE makes it possible to provide easily discoverable and linkable data to answer more questions faster and more accurately than ever before. It also represents a key element in the Census Bureau’s strategy to anticipate and prepare for a world driven by and dependent on accurate, timely, and relevant data.
Building a modern ecosystem via the four pillars presents many challenges, considering the Census Bureau’s historical approach to doing business. In our planning for this transformational effort, we have identified seven key challenges as well as our plans (in the subsequent section) to tackle them.
1. Siloes of Excellence
2. Management Vision and Commitment
3. Proprietary IT Systems
4. Technology Skills Deficit
5. Rising Costs
6. Cybersecurity
7. Perceptions of History
Tackling our Challenges
Owning our challenges is the first step to dealing with them. But as the societal need for more timely, high-quality products intensifies, this transformation effort also provides the Census Bureau with a distinct opportunity to break down the barriers and streamline processes across the entire organization. In this section we briefly describe how we’re tackling each of the challenges shown above.
Response to Challenge 1 (Siloes of Excellence):
To break down our siloes, the Census Bureau Director, Deputy Director, and Associate Directors are providing the focused and unified leadership necessary to guide the significant internal change described in this plan to completion.
Response to Challenge 2 (Management Vision and Commitment):
This response has five components:
1. Establish a Portfolio Executive
2. Establish a Bureau Leadership Team (BLT)
3. Implement “True North” Guidelines
True North helps align the paths that lead to new and innovative methods and products based on a “One Census Bureau” approach.
4. Streamline Governance
5. Implement Rigorous Phase Gate Reviews
Response to Challenge 3 (Proprietary IT Systems): Migrate to Open-Source Software
The Census Bureau has developed many of its current systems using closed-source, proprietary products. Open-source technologies will give us a powerful and scalable platform upon which to build our future, particularly as we move into a data-centric, data science-based paradigm.
Response to Challenge 4 (Technology Skills Deficit): Increase Skill Sets Internally/Contract When Required
To provide the expertise necessary for this integration effort, the Census Bureau CIO is employing several new strategies to bring in staff with the appropriate expertise:
• Identifying and reassigning existing federal and contractor staff with relevant skills and abilities to work directly as part of a new Secure Cloud Team (SCT)
• Opening additional training opportunities for federal staff
• Working closely with existing contract Project Managers to ensure new contract staff possess the necessary skillsets and training
Using these strategies, staff and contractors are assigned to the new enterprise Secure Cloud Team. The SCT transforms the former model of siloed support and reduces the organizational, process, and management bottlenecks that led to slow implementation of process improvements and customer hesitation to adopt cloud services.
Response to Challenge 5 (Rising Costs): Implement New Cost Strategies
Over the long term, consolidation of enterprise capabilities and the retirement of siloed legacy systems will reduce both the Census Bureau’s IT footprint and its IT costs. However, short- to mid-term IT costs in support of the transition will increase.
Response to Challenge 6 (Cybersecurity): Implement New Security Strategies
Securing our systems and data is critical to the success of the Census Bureau and to integrating the four initiatives. The public’s trust is in our hands, and it underpins our ability to produce high-quality federal statistics, so security planning is spread throughout the ecosystem from the beginning to form an enhanced set of safeguards for our systems.
Response to Challenge 7 (Perceptions of History): Don’t Lose Sight of Difficult Lessons Learned and Recognize Key Differences from Recent History
When considering recent history, there are fundamental differences for this effort, including timing and approach. The Decennial Census always looms large in determining the fate of any enterprise-wide effort. After all, the Decennial Census is the biggest thing we do, despite the more than 100 other surveys and censuses that take place throughout the decade. So timing is important.
So What Does This All Look Like?
In our shift toward a data-centric model, our architecture must also shift. As depicted in the high-level diagram below, the EDL is at the center of the architecture. DICE forms the “inputs” to the lake – data collected from respondents and data ingested from third-party sources. Meanwhile, ingested raw data are made available within EDL, allowing users to perform research and to create products with the latest data available from our providers. In our end-state architecture, the Frames program is wholly contained within EDL, including the foundational frames. This allows the frames to be equally accessible to other data and to be combined with collected data in new ways to form new products.
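For readers who think in code, the data flow described above can be pictured with a small toy model. This is only an illustrative sketch, not a description of any actual Census Bureau system: the class `EnterpriseDataLake`, the functions `dice_load` and `link_to_frame`, the dataset names, and the `address_id` key are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EnterpriseDataLake:
    """Hypothetical stand-in for the EDL: a central store of named datasets."""
    datasets: dict = field(default_factory=dict)

    def put(self, name, records):
        # Append records to a named dataset, creating it if needed.
        self.datasets.setdefault(name, []).extend(records)

    def get(self, name):
        # Return a copy so callers cannot mutate the stored list.
        return list(self.datasets.get(name, []))


def dice_load(lake, collected, ingested):
    """DICE role (assumed): feed collected and third-party data into the lake."""
    lake.put("collected", collected)
    lake.put("ingested_raw", ingested)


def link_to_frame(lake, frame_name, data_name, key):
    """Combine a frame held in the lake with another dataset on a shared key."""
    frame = {row[key]: row for row in lake.get(frame_name)}
    return [
        {**frame[rec[key]], **rec}
        for rec in lake.get(data_name)
        if rec[key] in frame
    ]


# Illustrative usage with made-up records.
lake = EnterpriseDataLake()
lake.put("address_frame", [
    {"address_id": 1, "state": "UT"},
    {"address_id": 2, "state": "MD"},
])
dice_load(
    lake,
    collected=[{"address_id": 1, "household_size": 3}],
    ingested=[{"address_id": 2, "source": "third_party"}],
)
linked = link_to_frame(lake, "address_frame", "collected", "address_id")
```

The point of the sketch is the shape of the design: because the frame and the incoming data sit in the same store, linking them is a lookup on a shared key rather than a transfer between separate systems.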
Transition and System decommissioning - A key part of the challenge of transitioning to modern computing is the decommissioning of legacy systems.
Finalizing data standards - Data standards are critical to implementing the data-centric, innovative approach this plan envisions.
The 2030 Census - As 2030 planning ramps up through the decade, ecosystem planning will follow suit with increasing Decennial milestones.