The Enterprise Data Warehouse (EDW) is used for reporting and analysis. It is a central repository of current and historical data from GitLab's Enterprise Applications. We use an ELT method to Extract, Load, and Transform data in the EDW. We use Snowflake as our EDW and dbt to transform data in it. The Data Catalog contains Analytics Hubs, Data Guides, Data Dictionaries, and Analysis for the data models built in the EDW. The Production Database in the EDW is used for reporting and analysis by Data Consumers at GitLab. It is composed of four major schemas: COMMON_, SPECIFIC_, LEGACY_, and WORKSPACE_. Below are descriptions of each schema:

COMMON_: This schema is where our Enterprise Dimensional Model (EDM) lives. The EDM is GitLab's centralized data model, designed to enable and support the highest levels of accuracy and quality for reporting and analytics. The data model follows the Kimball technique, including a Bus Matrix and Entity Relationship Diagrams. Dimensional Modeling follows our Trusted Data Development process and enables us to repeatedly produce high-quality data solutions. Our Enterprise Applications that integrate with each other are great candidates for the Kimball Dimensional Modeling Methodology. Foreign keys in a data table that point to a different source system are a strong indication that a COMMON_ solution is required. If some data tables from a source system qualify for the COMMON_ solution, then all data from that source system should be modeled in the EDM. This prevents the confusing user experience of having data from one source system modeled with different methods and patterns.

SPECIFIC_: This schema is where data from applications that do not integrate with other applications lives. The EDM in the COMMON_ schema excels at modeling data from Integrated Enterprise Applications; however, not all application data is integrated or requires the rigor of Dimensional Modeling. The data models in the SPECIFIC_ schema follow the Trusted Data Development process, except that a Dimensional Modeling methodology is not required. A key acceptance criterion for models entering this schema is that they are built from application data that is not integrated with other application data. The SPECIFIC_ schema is not a waypoint for models that should be modeled in the COMMON_ schema; however, models in this schema may mature and come to require an EDM solution. Not treating SPECIFIC_ as a waypoint helps us prevent technical debt from building up in the schema.

WORKSPACE_: This schema follows the Ad-Hoc Data Development process. It is where data modelers can experiment and prototype data solutions. The WORKSPACE_ schema can also be used as a waypoint for models that need an EDM solution in the COMMON_ schema. The WORKSPACE_ schema, not the SPECIFIC_ schema, should be selected for this purpose.
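
The schema selection rules described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `ModelProfile`, `choose_schema`, `choose_schemas`, and the example model and source system names are hypothetical and not part of GitLab's tooling or dbt project. LEGACY_ is left out because the text gives no selection rule for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical description of a candidate data model (illustration only)."""
    name: str
    source_system: str
    # Source systems referenced by foreign keys in the model's tables.
    foreign_key_sources: frozenset = frozenset()
    is_prototype: bool = False        # still experimental or ad hoc
    needs_edm_solution: bool = False  # expected to end up in COMMON_


def choose_schema(model: ModelProfile) -> str:
    """Suggest a schema prefix for one model, following the rules in the text."""
    # Foreign keys from a different source system strongly indicate COMMON_.
    integrated = any(
        src != model.source_system for src in model.foreign_key_sources
    )
    if model.is_prototype:
        # Prototypes, including waypoints toward COMMON_, belong in WORKSPACE_,
        # not in SPECIFIC_.
        return "WORKSPACE_"
    if integrated or model.needs_edm_solution:
        return "COMMON_"
    return "SPECIFIC_"


def choose_schemas(models: list) -> dict:
    """Apply the per-model rule, then keep each source system consistent.

    If any model from a source system qualifies for COMMON_, the other
    non-prototype models from that source system are placed there too.
    """
    first_pass = {m.name: choose_schema(m) for m in models}
    common_sources = {
        m.source_system for m in models if first_pass[m.name] == "COMMON_"
    }
    result = {}
    for m in models:
        schema = first_pass[m.name]
        if schema == "SPECIFIC_" and m.source_system in common_sources:
            schema = "COMMON_"
        result[m.name] = schema
    return result


# Hypothetical example models.
models = [
    ModelProfile("crm_opportunity", "crm", frozenset({"billing"})),
    ModelProfile("crm_notes", "crm"),
    ModelProfile("survey_responses", "survey_tool"),
    ModelProfile("pricing_experiment", "billing",
                 is_prototype=True, needs_edm_solution=True),
]
schemas = choose_schemas(models)
```

In this sketch, `crm_notes` has no cross-system foreign keys of its own but is still placed in COMMON_, because another model from the same source system qualifies. The prototype stays in WORKSPACE_ even though it is expected to need an EDM solution later.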