Case Study Details

Embracing the Future: Tarleton State University’s Data-Driven Transformation

Implementing a data-driven growth-oriented strategy requires that information should be centralized and readily available. Tarleton University's data was siloed, with no centralized reporting using various data sources. CAG helped build a centralized data warehouse, enabling the successful implementation of Tarleton's Forward Strategy.

Tarleton State University's new leadership wanted to implement a long-term, data-driven growth-oriented strategy called Tarleton Forward. The Tarleton Forward 2030 is a strategy requiring data-supported and informed decisions. Columbia Advisory Group (CAG) determined that Tarleton had the disparate raw datasets necessary to support the objectives; however, the organization required a new approach to be successful. The primary focus thus became implementing a modern solution in the form of Microsoft Azure-based data warehousing and reporting.

The University's Data Challenge

Vision and Focus: Tarleton has a capable IT team for operations. There is an excellent Institutional Analytics team concentrated on meeting regulatory requirements. As is common in organizations early in their data analytics journey, the focus, direction, and ability to make Tarleton Forward 2030 happen were not yet established.

No Centralized Data Infrastructure: The university consisted of a combination of on-premises databases, a growth in cloud-based solutions, and multiple reporting software products. Thus increasing the number of data sources and diversity of data. The outcome is that data across systems were distributed across various systems, and standard data definitions made reporting inconsistent and challenging at best. Extracting data from sources and then manually consolidating it into spreadsheets is common.

When two systems capture the same data, determining the source of truth is problematic. Data extraction for aggregation from source systems could vary by each method. The other challenge with premise-based systems is the need to procure hardware and software to meet the peak demand. This can mean provisioning 80% or more than required for day-to-day operations creating unnecessary costs.

Data Governance: The data silos across the university lacked structured governance. This arrangement hindered data aggregation, determination of causal factors, and extracting of valuable, workable insights from the data. The lack of standard definitions of terms, common data definitions, and inconsistent data types or fields introduced discrepancies in aggregation. Discrepancies in data definitions can impact data integrity. In addition, the data teams could not access modern reporting tools without spending undue capital.

Hindered data analysis: The combined issues of disparate data sources without governance matters created for research. Finding the correct data is highly challenging when a more detailed analysis is required. The absence of standardized, aggregated data put the issue back on the department or individual to figure it out independently. The impact is a decrease in efficiency, having to solve the problems at the individual level.

PROPOSED SOLUTION BY CAG

Columbia Advisory Group partnered to propose a workable, cost-effective solution to Tarleton's assimilating and analyzing data issues. The solutions involved implementing a modern data infrastructure and included the following measures:

Leveraging the benefits of the cloud
CAG proposed that Tarleton leverage the speed and agility of cloud platforms for storing and analyzing their data. Tarleton benefitted from a future-proof cloud platform that could meet current and future information-related requirements. CAG worked with Tarleton's IT team with focus and direction on their existing cloud skills to shorten the implementation timeframe and leverage existing talent. The benefits of a cloud-based solution include the following:

a) dynamic scalability for performance and storage
b) scale systems up or down as needed for cost avoidance and better expense management,
c) a central structure for managing privacy and security

Building a data warehouse
CAG suggested a cloud-based data warehouse for storage and analysis, providing accurate, consistent, and relevant data. This approach would enable a common source of truth for all users and mitigate integrity or data source issues. Data access is then enabled through standard reporting software. Since the university had prior experience working on the Microsoft Cloud platform, they selected Microsoft's Azure Cloud Platform for implementing the proposed data warehousing solution.

Leveraging Existing Technology Skills
Tarleton University's IT team had prior experience with Microsoft Azure Cloud, thus saving on the expenses incurred and time for training and setting up a new cloud platform. The IT team was well-versed in the Azure cybersecurity protocols, Microsoft Azure Sentinel, and Microsoft 365.

Overcoming Data Security Challenges
The primary benefit of working on a known cloud platform was that the IT team was familiar with Microsoft Azure Sentinel and Office 365 for staff, students, and faculty. Such deep pre-existing knowledge removed hurdles concerning security architecture. The university's database team also had considerable experience working with Microsoft-based technologies such as T-SQL, Azure, and PL/SQL.

Leveraging Core Azure Tools
The architecture established by CAG was simple and leveraged native solutions where possible:

  • The Azure Data Factory extract-transform-load (ETL/ELT) software is used for ingesting data from on-premises and vendor-hosted cloud systems.
  • Azure Storage was used as a Data Lake for semi-structured, unstructured, and transient data, while Azure SQL server was used for storing structured data.
  • Azure Purview Data Catalog was formed as the foundation supporting data governance. Purview created an up-to-date data landscape map with automated data discovery, sensitive data classification, and end-to-end data lineage.
  • Azure Application Programming Interface (API) Management Services provided a secure, standard interface for exposing selected data to external partners and applications. This facilitates a common data access mechanism reducing reliance on multiple solutions to access data across the source systems.
What is Data Warehousing?

A data warehouse is a large, centralized data repository specifically designed to support business decision-making activities. It collects data from various sources, such as transactional systems, databases, and other sources, and then stores and organizes that data to make it easier to access and analyze. The data in a data warehouse is typically structured, organized, and optimized for query and analysis, allowing decision-makers to gain insights and make informed decisions based on the data. It is optimized for querying and reporting rather than transaction processing, which is the primary purpose of operational databases.

Putting data in a data warehouse is typically divided into four main stages: extraction, transformation, loading, and verification. Here is an overview of each stage:

  • Extraction: The first stage is to extract data from various sources such as transactional systems, operational databases, cloud-based platforms, and other sources. This data is extracted based on predefined rules and criteria.
  • Transformation: The extracted data is then transformed into a format that can be loaded into the data warehouse. This process involves cleaning, aggregating, and integrating the data to ensure its accuracy and consistency.
  • Loading: The transformed data is loaded into the data warehouse, which involves moving it from the staging area to the target database. This can be done using different loading techniques, such as full or incremental loads.
  • Verification: Finally, the loaded data is verified to ensure that it has been loaded correctly and meets the organization's required quality standards. This involves validating data completeness, accuracy, and consistency and resolving any issues that may arise during the verification process.

Data warehouse metadata describes a data warehouse's structure, contents, and usage. Metadata is used to manage and control the data warehouse and to support data integration, quality, and governance.

  • Data dictionary: A data dictionary is a document that describes the data elements used in the data warehouse, including their names, definitions, and relationships.
  • Data lineage: Data lineage is a record of the history of data as it moves through the data warehouse, showing where the data came from, how it was transformed, and where it is stored.
  • Data models: Data models are diagrams that show the structure of the data warehouse, including the tables, fields, and relationships between them.
  • Business rules: Business rules are the policies and procedures that govern the use of data in the data warehouse, including data security, privacy, and access.
  • Data profiling: Data profiling analyzes the data in the warehouse to understand its quality, completeness, and consistency.
  • Data integration: Data integration metadata is used to manage the process of extracting, transforming, and loading data into the data warehouse
Adopting a Multi-Staged Approach

Any change in work methodology requires time for stakeholders to get used to the new functions. To reduce users' challenges, Tarleton University focused on implementing a multi-stated implementation approach to creating a centralized data warehouse. The staged approach enabled its staff to learn operations without becoming overwhelmed. It also empowered the use of the new solution immediately without devoting stakeholders' time to learning new tools.

The First Stage - Creating an Azure Infrastructure
The Azure infrastructure was established and quickly configured ingestion of the required core datasets. The initial data included Ellucian Banner and Degree Works data from the Oracle-based, on-premises servers. In addition, ingestion jobs were created for cloud-based data sources from Salesforce, Target-X, and Canvas LMS. The Azure Data Factory produces data pipelines to securely access each system with scheduled ingestion jobs from a single management tool.

The Second Stage - Reproducing the Reported Datasets
Initially, the solution reproduced the reporting data sets on-premises overnight by copying the Ellucian Banner database. However, this procedure takes time and effort. The second phase improved the process by updating data sets in Azure and then reproduced the specific reporting (analytics) data sets used by Institutional Analytics. This process saved time and created access to the data via the cloud and standard reporting tools.

The Final Stage - Making Data Available for Analytics and Reporting
The end-user can access the data and metadata, providing information about the structure and content of the data in the data warehouse. This reduces the reliance on a technical person or DBA to locate data and relationships. Additionally, the metadata can provide insight and prioritization of additional data sets to add to the ingestion process to aggregate information for analysis.

Summary

Thus, CAG working with Tarleton University helped design information systems around protecting the confidentiality, integrity, and availability of their valuable data assets. University data is now securely available for analysis and making informed data-driven decisions. The university can now let its data inform decision-makers about what is happening and what they can do to help students, retain students, and improve their experiences. Tarleton Forward also empowers students to stay students for their educational careers and helps the university understand what happens after departure and how to stay connected to alums.

Tarleton's new data warehouse solution is realizing the following benefits:

Cost reduction: The cloud-based, centralized data warehouse solution required a fraction of this cost versus an on-premises solution. The university's team could manage the system without additional staffing requirements.

Quick Implementation: The initial implementation took five weeks to complete.

Improved Access to Data: Tarleton's institutional analytics team realized quick access to correlated data. Further, Tarleton can store recorded data over time. This can be useful for trend analysis, forecasting, and identifying patterns and anomalies over time.

Enhanced and Meaningful Reporting: The reporting capabilities expanded the types of reports available for analysis. Providing a concrete platform for future AI/ML work is another benefit.

Improved Mobile App: Azure's API management services provide secure access to selected data for partners and internal use. One key benefit is the ability of the Tarleton mobile app to provide users with a better personal and customized experience.

Centralization of Data: The centrally accessed and secured cloud-based data services provide access across all Tarleton campuses.

Leveraging Automation Capabilities: Manual tasks requiring datasets for programs that took 6-8 weeks for students could now be automated. This reduced human effort and time commitment.

Deep Analysis: The solution enables the deep analysis and correlating of trends and unstructured/semi-structured data to develop prediction capabilities based on the enhanced data sets.

 

ABOUT CAG: CAG is a highly experienced IT consulting firm. With 100+ years of combined technology experience and business acumen, CAG’s team has assessed and helped improve the performance of more than 300 technology organizations and IT departments. By focusing on simple, meaningful, and practical solutions combined with straight-for ward analysis and recommendations, CAG’s team has experience in many regulatory and economic environments with companies and organizations of all sizes. CAG not only offers a deep understanding of IT, but its solutions are software and hardware agnostic. Whether a client is high growth or economically challenged, CAG can adapt to the complexities and nuances of that business. Based in Dallas, Texas; Columbia Advisory Group works extensively with clients throughout the United States. For more information, visit columbiaadvisory.com