April 10, 2023, by Paola Reyes
As you know, the company has several big data projects in progress that should follow policies, standards, and procedures to ensure the formal management of the organization's data.
This report identifies the components of the data governance plan, including the best practices and methods that will form the foundation for protecting our data in current and future projects.
Figure 1. Data Governance Components
Figure 1 shows the components of the governance plan. Each one is defined below, along with how it supports the big data projects.
Data strategy:
A data strategy is a long-term plan that outlines the technology, procedures, staff, and guidelines necessary to manage a company's information assets. For each data source, it is important to define the following:
· What data is needed?
· What sources are going to be used?
· What criteria will be used to select the best data source?
· What are the goals of the data and project?
· What are the objectives for each project related to the data?
Data policy:
Data policies are the laws and formal regulations governing the proper gathering, use, sharing, and management of data inside an organization. The following should be defined:
· Data access
· Data use
· Data storage
· Limits on data use
Data standards:
A data standard is a technical definition that specifies how data should be collected, transferred, and kept to ensure consistency across systems, sources, and consumers (Resources Data Gov, Data Standards). To ensure the quality of data, the following attributes should be specified for the data:
· Data Type
· Vocabulary
· Versioning
· Format
· Storage
Data dictionary:
Given the large volume of data and the number of projects, it is important to maintain a description of the information collected, known as metadata. This should include the following:
· Data names
· Data types
· Definitions
· Origins
· Relationships
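As an illustration only, a data dictionary entry with the five attributes listed above could be sketched as follows. The class name, the helper function, and the sample entries (`customer_id`, `order_total`, and their origins) are hypothetical and not taken from any company system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DictionaryEntry:
    """One data dictionary record; the fields mirror the list above."""
    name: str
    data_type: str
    definition: str
    origin: str
    relationships: List[str] = field(default_factory=list)


# Hypothetical entries, for illustration only.
data_dictionary = {
    entry.name: entry
    for entry in [
        DictionaryEntry(
            name="customer_id",
            data_type="integer",
            definition="Unique identifier assigned to each customer",
            origin="CRM system",
        ),
        DictionaryEntry(
            name="order_total",
            data_type="decimal",
            definition="Total value of a single order",
            origin="Sales database",
            relationships=["customer_id"],
        ),
    ]
}


def undefined_relationships(dictionary):
    """Return (entry, related_name) pairs that point at undocumented names."""
    return [
        (entry.name, related)
        for entry in dictionary.values()
        for related in entry.relationships
        if related not in dictionary
    ]
```

A check such as `undefined_relationships(data_dictionary)` can flag entries whose relationships refer to data that has not yet been documented.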
Data privacy:
This element deserves careful consideration. Policies and limits are required to make sure that only authorized users have access to sensitive data. It is important to ensure the following:
· Effective policies, communicated to everyone who handles the data
· Data encryption
· Restricted data access
· Limited data sharing and release
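Restricting access to sensitive data could, for example, be enforced with a simple role check. This is a minimal sketch under stated assumptions: the field names, the role names, and the function `visible_record` are hypothetical, and a real deployment would rely on the access controls of the underlying data platform.

```python
# Hypothetical sensitive fields and authorized roles, for illustration only.
SENSITIVE_FIELDS = {"ssn", "salary"}
ROLES_WITH_SENSITIVE_ACCESS = {"data_owner", "data_steward"}


def visible_record(record, role):
    """Return a copy of the record, hiding sensitive fields from unauthorized roles."""
    if role in ROLES_WITH_SENSITIVE_ACCESS:
        return dict(record)
    return {key: value for key, value in record.items()
            if key not in SENSITIVE_FIELDS}
```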
Data security:
Data security is the practice of securing corporate data and preventing data loss as a result of unauthorized access. This requires safeguarding your data from attacks that might encrypt or wipe out data, such as viruses, as well as attacks that could change or damage your data (Imperva, Data security). To ensure data security, the following procedures should be followed:
· Backing up the data
· Physical security
· Encryption
Data integrity:
Data integrity makes sure that the data is the same wherever it is stored or accessed. It maintains completeness, consistency, and accuracy. This component ensures that:
· There is no missing data
· Data lineage and access tracking are available
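One common way to confirm that copies of a dataset held at different locations are identical is to compare checksums. The sketch below uses SHA-256 from the Python standard library; the function names are hypothetical and the approach is only one of several possible integrity checks.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def copies_match(primary: bytes, replica: bytes) -> bool:
    """Return True when both copies have the same checksum."""
    return checksum(primary) == checksum(replica)
```

Storing the checksum alongside each copy lets a later audit detect silent changes without comparing the full contents.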
Data quality:
Important decisions depend on high-quality data. Quality is measured by completeness, consistency, accuracy, and other attributes. To meet a high standard of quality, the data must be:
· Cleaned through effective methods
· Checked for usability, format, and other attributes through a data audit
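As a small illustration of one audit step, completeness can be measured as the share of records that have a value for each required field. The function name, the field names, and the sample records below are hypothetical.

```python
def completeness(records, required_fields):
    """Return, per required field, the fraction of records with a non-empty value."""
    total = len(records)
    report = {}
    for field_name in required_fields:
        filled = sum(
            1 for record in records
            if record.get(field_name) not in (None, "")
        )
        report[field_name] = filled / total if total else 0.0
    return report


# Hypothetical sample: two of the four records lack an email value.
sample = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": ""},
    {"customer_id": 3},
    {"customer_id": 4, "email": "d@example.com"},
]
report = completeness(sample, ["customer_id", "email"])
# report == {"customer_id": 1.0, "email": 0.5}
```

Fields that score below an agreed threshold would then be candidates for cleaning.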
Data availability:
This component ensures that data is accessible whenever it is needed. The following are some of the methods that can be used to ensure data availability:
· Redundant CPUs
· Redundant connections
· Backup/archival systems
Data governance technologies:
Technology-based data governance solutions offer tools for developing and controlling business and data vocabularies, monitoring data lineage, and maintaining rules and policies. Commercial software options include:
· Collibra
· Informatica Axon
· IBM InfoSphere
Data stakeholders:
All the components above are key for data governance, but people must also be put in charge of the process. The company needs to assign staff to specific data governance roles such as:
· Data owners
· Data stewards
· Data custodians
The action plan should give each of them a specific list of actions and processes, along with the schedule and the person responsible for executing each task. The data governance plan described above is essential for the company and its success.
References
AWS. What is data strategy? https://aws.amazon.com/what-is/data-strategy/#:~:text=your%20data%20strategy%3F,What%20is%20a%20data%20strategy%3F,amounts%20of%20raw%20data%20today.
Resources Data Gov. Data Standards. https://resources.data.gov/standards/concepts/
Imperva. Data security. https://www.imperva.com/learn/data-security/data-security/#:~:text=Data%20security%20is%20the%20process,modify%20or%20corrupt%20your%20data.
Hadi Rezazad. Big Data Governance. AIT-622 Big Data Needs Analysis. PowerPoint presentation.