May 18, 2023, by Paola Reyes
“A data architecture describes how data is managed from collection through to transformation, distribution, and consumption. It sets the blueprint for data and the way it flows through data storage systems. It is foundational to data processing operations and artificial intelligence (AI) applications.” (IBM, 2022).
Netflix:
Netflix is a subscription-based streaming service that lets members watch movies and TV shows over an internet connection. Its data architecture starts with the data sources: subscribers' activity in the service and their profile information, such as the genre of the movies watched, the time spent in each session, movie runtime, location, age, and others (NetflixData, 2017).
Figure 1. Netflix Data Architecture (NetflixData, 2017).
This data is written to a Kafka ingestion pipeline and lands in data storage on Amazon Web Services, which serves as the data warehouse. For data processing, Netflix uses several technologies, such as Spark, Pig, and Jupyter (NetflixData, 2017). A subset of this data is then moved to fast storage, where it is available for data visualization, analytics, and reports. The visualization and interactive query tools include Tableau, Presto, Spark, R, and Python.
The reports and visualizations are then available to managers and other decision makers, who can act on the resulting analysis (Perez, 2017).
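The flow described above (activity events, ingestion, warehouse, processing, fast storage) can be illustrated with a small sketch. This is not Netflix's code: it uses only the Python standard library, an in-memory queue stands in for the ingestion pipeline, a list stands in for the warehouse, and every name in it (ViewingEvent, ingest, land_in_warehouse, process, publish_to_fast_storage) and every sample value is hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewingEvent:
    """Hypothetical subscriber activity record (field names are assumptions)."""
    member_id: str
    genre: str
    minutes_watched: int


def ingest(events):
    """Stand-in for the ingestion pipeline: buffer events in arrival order."""
    return deque(events)


def land_in_warehouse(queue):
    """Drain the buffered events into the 'warehouse' (here, a plain list)."""
    warehouse = []
    while queue:
        warehouse.append(queue.popleft())
    return warehouse


def process(warehouse):
    """Processing step: total minutes watched per genre."""
    totals = defaultdict(int)
    for event in warehouse:
        totals[event.genre] += event.minutes_watched
    return dict(totals)


def publish_to_fast_storage(totals, top_n):
    """Keep only a subset (the top_n genres) for quick reporting queries."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:top_n])


# Made-up sample data, for illustration only.
events = [
    ViewingEvent("m1", "drama", 50),
    ViewingEvent("m2", "comedy", 30),
    ViewingEvent("m1", "drama", 45),
    ViewingEvent("m3", "documentary", 20),
]

fast_storage = publish_to_fast_storage(
    process(land_in_warehouse(ingest(events))), top_n=2
)
print(fast_storage)  # {'drama': 95, 'comedy': 30}
```

The point of the sketch is the separation of stages: the full history stays in the warehouse, while reporting tools read only the smaller, pre-aggregated subset.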
Starbucks:
Starbucks is an American multinational chain of coffeehouses and roastery reserves. In 2020, the company had 32,660 locations worldwide (FinancesOnline, 2021), which may reflect both good service and the use of data to analyze and understand its customers and to make decisions that maximize profits. The following graph shows the Starbucks customer data architecture. It was compiled from research; the references are listed below.
Starbucks' data sources are internal and external. Internally, the company uses its app to collect customer information such as age, location, menu preferences, the days on which customers visit a store most often, reviews, and others. Externally, Starbucks uses social media platforms to capture posts about its stores, where each post was uploaded, the ages of those users, their reviews on platforms such as Google Maps, and more (Stephane, 2021). This data is stored in services such as Amazon Web Services and Azure Cosmos DB (AWS Events, 2021). The next step is data processing, where data cleaning, data transformation, exploratory analysis, and modeling are done with Java, Python, and other tools (LinkedIn, 2022). The final step is data analysis, where reports and visualizations are produced and delivered to the marketing department and the CEO's office. One of the tools used by the company is Tableau (Tableau, 2021).
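As a rough illustration of the processing step described above, the sketch below combines records from an internal source (the app) and an external source (social media), cleans them, and produces a small summary of the kind a report might show. It is not Starbucks' code: the record fields, the sample values, and the function names (clean, average_rating_by_city) are all assumptions made for the example.

```python
from collections import Counter

# Made-up records; the field names are hypothetical.
app_records = [
    {"source": "app", "city": " Seattle ", "item": "latte", "rating": 5},
    {"source": "app", "city": "seattle", "item": "Latte", "rating": 4},
    {"source": "app", "city": "Austin", "item": "mocha", "rating": None},
]
social_records = [
    {"source": "social", "city": "AUSTIN", "item": "mocha", "rating": 3},
    {"source": "social", "city": "", "item": "latte", "rating": 5},
]


def clean(record):
    """Data cleaning: normalize text fields; drop rows missing city or rating."""
    city = record["city"].strip().title()
    if not city or record["rating"] is None:
        return None
    return {**record, "city": city, "item": record["item"].strip().lower()}


def average_rating_by_city(records):
    """Data transformation: average rating per city across both sources."""
    sums, counts = Counter(), Counter()
    for record in records:
        sums[record["city"]] += record["rating"]
        counts[record["city"]] += 1
    return {city: sums[city] / counts[city] for city in sums}


cleaned = [
    row for row in map(clean, app_records + social_records) if row is not None
]
report = average_rating_by_city(cleaned)
print(report)  # {'Seattle': 4.5, 'Austin': 3.0}
```

Normalizing before aggregating matters here: without it, " Seattle " and "seattle" would be counted as different cities, and the incomplete rows would distort the averages.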
References
International Business Machines Corporation. (2022). What is data architecture? IBM. https://www.ibm.com/topics/data-architecture#:~:text=A%20data%20architecture%20describes%20how,artificial%20intelligence%20(AI)%20applications
Majumdar, A., & Li, Z. (2018, June 14). Metacat: Making Big Data Discoverable and Meaningful at Netflix. Netflix TechBlog. https://netflixtechblog.com/metacat-making-big-data-discoverable-and-meaningful-at-netflix-56fb36a53520
NetflixData. (2017, October 16). Delivering High Quality Analytics at Netflix. [Video]. https://www.youtube.com/watch?v=nMyuCdqzpZc
Perez, R. (2017, February). How Netflix built its analytics in the cloud with Tableau and AWS. Tableau. https://www.tableau.com/blog/tableau-cloud-netflix-original-64442
FinancesOnline. (2021). Number of Starbucks Worldwide 2022/2023: Facts, Statistics, and Trends. https://financesonline.com/number-of-starbucks-worldwide/
Stephane, V. (2021, March 22). Starbucks: From Coffee Machines to Machine Learning. Harvard | Digital innovation and transformation. https://d3.harvard.edu/platform-digit/submission/starbucks-from-coffee-machines-to-machine-learning/
LinkedIn. (2022, December 30). Job at Starbucks | senior software engineer - Starbucks Technology(Seattle OR Remote). senior software engineer - Starbucks Technology(Seattle OR Remote) | Starbucks | LinkedIn
Tableau. (2021, January). The Future of Analytics: Starbucks and Tableau—Adapting to Drive Innovation and Connection. https://www.tableau.com/community/events/nrf-2021
TableauWebinar. (2021, January). The Future of Analytics: Starbucks and Tableau—Adapting to Drive Innovation and Connection. https://www.tableau.com/learn/webinars/future-analytics-starbucks-and-tableau-adapting-drive-innovation-and-connection
AWS Events. (2021, December 17). AWS re:Invent 2021 - How data is driving the future growth of travel and hospitality. [Video]. https://www.youtube.com/watch?v=otmTG1zbb6Q&t=327s