Originally introduced to describe an approach to the problem of information management, the term "data warehouse" has become one of the most frequently used in information technology. But ask vendors and practitioners "what is a data warehouse, and how should it be organized?" and the ambiguity of the term quickly becomes apparent.

For many people, a data warehouse is a set of data combined from various sources, structured and optimized for access by OLAP (online analytical processing) query tools. This view was originally promoted by OLAP vendors. For others, a data warehouse is simply any database that contains data from more than one source, collected for the purposes of management information. This definition is neither useful nor distinctive, since such databases were used for decision support long before the term "data warehouse" emerged.
The concept of a "warehouse" arose in the mid-1980s, if not earlier. It was in fact intended to describe an architectural model for the flow of data from operational systems to decision support tools. The model was meant to address the various problems associated with this flow and the high costs that came with them. Without such an architecture, the delivery of management information typically involves a great deal of redundancy. In large corporations, multiple decision support projects are usually carried out independently, each serving different users but often drawing on the same data. The process of collecting, cleaning and integrating data from different, often legacy, sources is usually duplicated for each project. Moreover, the existing systems were revisited with each new request, which often differed from the previous one only in the data required.
By analogy with real-world warehouses, data warehouses are large areas for collecting, storing and staging data from existing systems, from which the data can be distributed to "retail stores", or data marts, designed specifically for access by decision-making users. While the data warehouse is designed to manage data arriving in bulk from its suppliers (such as operational systems) and to organize and store that data, the "retail stores", or data marts, can focus on packaging and supplying sets of data to end users, often to meet specific needs.
At some point this analogy and architectural vision was lost, largely under the influence of vendors of decision support software. The leading data warehousing experts who emerged in the late 1980s were, as a rule, directly connected with such companies. The architectural vision was often replaced by studies of how to design a database for decision support. Suddenly the data warehouse became a panacea for the headaches of decision support, and vendors began jockeying for position in the thriving data warehousing market.
Although the term "warehouse" has lately become increasingly associated with OLAP and multidimensional database technology, and some people believe that a data warehouse must be built on a star-schema database structure, it would be prudent to restrict the use of such schemas to data marts. Using a star schema or a multidimensional/OLAP structure for the data warehouse itself can, in practice, seriously compromise its role, for several reasons:
- Such a structure assumes that all queries against the warehouse will be quantitative, i.e. queries over numerical data. This ignores the fact that a warehouse may well hold text or qualitative data, for example a complete picture of customers built up from profile information collected from a wide range of sources;
- Such a structure requires the data in the warehouse to be aggregated in advance. This aggregation, and the exclusion of much business data that goes with it, can lose a great deal of information. If information requirements change and other combinations of business data are needed, the star or multidimensional structure will soon become useless. A normalized structure holding business-level (detailed) data, on the other hand, can supply any alternative combination of the data. Although it may not be possible to hold some data in such a structure because of space and/or performance constraints, you should not give up keeping low-level business data in the data warehouse, as this is often the only way to meet information needs that arise in the future;
- Optimized models such as the star schema are usually less flexible than normalized structures. A normalized model is easily restructured when regulations or business requirements change.
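The information loss described in the list above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. All table and column names here (customer, product, sale, sales_by_region) are hypothetical and chosen only for illustration. A summary table aggregated in advance by region no longer carries product detail, so a later question about product categories can only be answered from the normalized, detail-level tables.

```python
import sqlite3

# Hypothetical schema for illustration only; none of these names come
# from a particular product or from the text above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized, detail-level warehouse tables
    CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE product  (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE sale (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        product_id  INTEGER REFERENCES product(id),
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'North'), (2, 'South');
    INSERT INTO product  VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO sale VALUES (1, 1, 1, 10.0), (2, 1, 2, 20.0), (3, 2, 1, 5.0);

    -- Structure aggregated in advance: product detail is dropped here
    CREATE TABLE sales_by_region AS
        SELECT c.region AS region, SUM(s.amount) AS total
        FROM sale s JOIN customer c ON c.id = s.customer_id
        GROUP BY c.region;
""")

# The pre-aggregated table has no column from which a product
# category could be recovered.
summary_columns = [row[1] for row in conn.execute("PRAGMA table_info(sales_by_region)")]

# The new requirement (revenue per product category) can still be met
# from the detail-level data.
by_category = dict(conn.execute("""
    SELECT p.category, SUM(s.amount)
    FROM sale s JOIN product p ON p.id = s.product_id
    GROUP BY p.category
"""))

print(summary_columns)
print(by_category)
```

The sketch only shows the principle: once detail has been aggregated away, no query against the summary can bring it back, whereas the normalized tables can be recombined as requirements change.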
The data mart offers an ideal solution to perhaps the most significant conflict in data warehouse design: performance versus flexibility. Generally, the more normalized and flexible the data warehouse model, the lower its query performance. This is because a query against a normalized structure usually requires many more table joins than a query against an optimized structure. By directing all user queries to the data marts while retaining a flexible model for the data warehouse, designers can achieve flexibility and long-term stability in the warehouse structure together with optimal performance for user queries.
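As a rough sketch of this division of labor, again using Python's sqlite3 module with hypothetical table names, the warehouse below stays normalized while a data mart table is rebuilt from it on demand. The same user question needs two joins against the warehouse but none against the mart. The refresh_mart helper is an assumption made for this example, not a standard function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Warehouse: normalized, detail-level tables (hypothetical names)
    CREATE TABLE region   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY,
                           region_id INTEGER REFERENCES region(id));
    CREATE TABLE sale     (id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(id),
                           amount REAL);
    INSERT INTO region   VALUES (1, 'North'), (2, 'South');
    INSERT INTO customer VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO sale     VALUES (1, 1, 10.0), (2, 2, 30.0), (3, 3, 5.0);
""")

# Answering the question directly from the warehouse takes two joins.
WAREHOUSE_QUERY = """
    SELECT r.name, SUM(s.amount)
    FROM sale s
    JOIN customer c ON c.id = s.customer_id
    JOIN region   r ON r.id = c.region_id
    GROUP BY r.name
    ORDER BY r.name
"""

def refresh_mart(connection):
    """Rebuild the (hypothetical) data mart table from the warehouse."""
    connection.execute("DROP TABLE IF EXISTS mart_sales_by_region")
    connection.execute("""
        CREATE TABLE mart_sales_by_region AS
        SELECT r.name AS region, SUM(s.amount) AS total
        FROM sale s
        JOIN customer c ON c.id = s.customer_id
        JOIN region   r ON r.id = c.region_id
        GROUP BY r.name
    """)

refresh_mart(conn)

# User queries go to the mart and need no joins at all.
MART_QUERY = "SELECT region, total FROM mart_sales_by_region ORDER BY region"

from_warehouse = conn.execute(WAREHOUSE_QUERY).fetchall()
from_mart = conn.execute(MART_QUERY).fetchall()
print(from_warehouse)
print(from_mart)
```

Because the mart is derived from the warehouse rather than the other way around, it can be dropped and rebuilt in a different shape when user needs change, without touching the warehouse model.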





