WHAT IS DATA WAREHOUSING AND WHY DO WE NEED IT? ...

 

The use of Executive Information Systems (EIS), Decision Support Systems (DSS), Data Mining systems and standard query systems, collectively referred to herein as Business Intelligence Systems, has exploded in recent years with the advent of powerful desktop and mainframe computer hardware and software.

Corporations of all sizes are embracing these technologies to assist in their decision making processes and to remain competitive in a far tougher business world.

What is now clearly understood and accepted is that the successful implementation of a Business Intelligence System requires a different strategy and technology than that utilized for operational systems. This strategy and technology have become known generically as “data warehousing”.

A data warehouse is not dissimilar, in a philosophical sense, to a product storage or distribution warehouse. It is simply a place where data can be gathered, organized in an orderly fashion and made available for easy access when required. It is the storage area that facilitates easy locating of the required goods when an order needs to be picked for delivery to the customer; that is, when data is required by an end-user.

However, as with product storage warehouses, there are well designed data warehouses and there are poorly designed, inflexible ones. An appreciation of the need for, and the requirements of, a data warehouse is therefore critical for a successful installation.

There are two requirements of computer data in any business. The first is an operational requirement to facilitate the processing of business transactions. The second, and arguably the more important use of data, is the need to analyze the results these business transactions deliver, or could deliver, if they were better understood and utilized. In other words, there is an operational use and an informational use of data.

Unless a flexible and active data warehouse is installed first, the task of implementing and maintaining a Business Intelligence System can be difficult, costly and time consuming, and in the end will probably not deliver the results required.

This executive overview of RODIN Data Warehousing will hopefully de-mystify some of this jargon-riddled technology and alert you to some of the problems and considerations you will need to address to successfully implement a Business Intelligence System using a data warehouse.
 

DATA CAN BE COMPLEX, CONFUSING AND DIFFICULT TO ACCESS ...

 

Computers store operational data such as sales invoices, payroll records, and manufacturing processes in a way that is efficient for computer programs to process business transactions. This is usually in a structure called a "normalized relational database".

While this is considered the most efficient way to process this data in an operational sense, it is usually very difficult for non-technical people to understand and use data stored this way for informational purposes.

Furthermore, the data required to facilitate a business operation is also quite different to the data required for informational purposes. For example, a sales order transaction often contains data associated with how goods are to be delivered to the customer. This type of data is usually of little or no interest 12 months later when you wish to analyze sales performance or trends.

Data to be used for Business Intelligence Systems needs to be organized and stored in an entirely different way. While the data required for operational systems is usually organized and stored to suit a business process such as taking and filling a sales order, in Business Intelligence Systems data is generally required to be organized by subject such as customer or product.

Often this subject information comes from different sources and even different computer systems, both internal to your organization as well as external. For example, the product information required may need to come from your sales system, your manufacturing system, your inventory system and an external market share system.

For all these reasons, using operational data for Business Intelligence Systems is usually complex, confusing and difficult to access. It usually means either you can't get the information you require easily, or enormous computer and human resources are being wasted in designing, coding, testing and maintaining these systems. At the very least, using operational data means a high level of technical expertise is required to create and manage these systems. Such technical expertise is usually in short supply and in high demand to keep the operational systems going, often leading to delays in getting information into the hands of the people who need it.
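The subject-oriented organization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four source systems follow the product example in the text, while the field names, product codes, values and the `integrate_by_subject` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical extracts. The system names follow the example in the text;
# the field names and values are invented for illustration.
SOURCES = {
    "sales": [{"product": "P100", "units_sold": 40},
              {"product": "P200", "units_sold": 15}],
    "manufacturing": [{"product": "P100", "unit_cost": 3.2}],
    "inventory": [{"product": "P100", "on_hand": 120},
                  {"product": "P200", "on_hand": 8}],
    "market_share": [{"product": "P200", "share_pct": 12.5}],
}


def integrate_by_subject(sources, key):
    """Merge rows from every source system into one record per subject key."""
    subject = defaultdict(dict)
    for system, rows in sources.items():
        for row in rows:
            for field, value in row.items():
                if field != key:
                    # Prefix each field with its source system so the
                    # origin of every value stays visible.
                    subject[row[key]][f"{system}.{field}"] = value
    return dict(subject)


product_view = integrate_by_subject(SOURCES, key="product")
```

Each entry in `product_view` now holds everything known about one product, regardless of which system supplied it. A real warehouse load would also have to reconcile differing product codes and units between systems, which this sketch ignores.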
 

INFORMATION NEEDS TO BE PRESENTED IN DIFFERENT WAYS ...

 

Different forms of output are required to present information effectively. Sometimes a one page spreadsheet or graph of summarized data is necessary and sometimes a detailed printout of comprehensive historical data showing periodic and comparative values is required. In recent times we have been making greater use of more sophisticated desktop tools to view and manipulate data in more meaningful and intuitive ways. “Data mining” technology is the latest breakthrough in visualizing data, and is changing the way we can work with and exploit data, arguably the most valuable asset a business has. Each of these different data visualization technologies requires different forms of data.

Transforming operational data into a data warehouse does not, on its own, provide all the functionality required by these different types of data visualization tools. “Data marts”, or subsets of the data warehouse, are being used more and more to help facilitate this manipulation and visualization of the data on the desktop. In the distribution warehouse analogy, the data marts are the vehicles that deliver the goods from the warehouse to the customer.

Data marts can be constructed straight from the business applications or from a data warehouse. However, unless they are constructed from a data warehouse they become independent “islands” of disparate data. Contrary to what was originally thought, the independent data mart approach has proven to be time consuming, extremely difficult to implement and costly to maintain. It often also delivers inconsistent results and ultimately leads to failure in all but the smallest of organizations.

It is not, as some would have you believe, a choice between a data warehouse and data marts. To be successful in the medium to long term, both are essential components of a durable Business Intelligence System. Unless different forms of data visualization can be delivered, via purpose built data marts constructed from a data warehouse, many of the benefits of Business Intelligence Systems are lost.

The main reason for this can be illustrated in the following example. When data is loaded into 5 different data marts for different departments or different visualization tools directly from the business applications, it requires the source system to be defined 5 times even though it will be the same in each instance. Furthermore, the loading rules and business rules need to be specified 5 times even though they may be or should be the same; the computer programs that create them need to be written 5 times even though they are very similar; and then when something changes, there are now 5 things to change. When using a data warehouse this is done once and only once, and the data marts can then be constructed very rapidly and consistently from there.
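The "once and only once" idea can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `sale_value` rule, the sample rows and the `build_mart` helper are hypothetical and do not describe any particular product.

```python
from collections import defaultdict

# Hypothetical rows as they might arrive from a business application.
SOURCE_ROWS = [
    {"customer": "ACME", "product": "P100", "territory": "North",
     "qty": 10, "price": 5.0, "discount": 2.0},
    {"customer": "ACME", "product": "P200", "territory": "North",
     "qty": 4, "price": 20.0, "discount": 0.0},
    {"customer": "BOLT", "product": "P100", "territory": "South",
     "qty": 6, "price": 5.0, "discount": 1.0},
]


def sale_value(row):
    """Hypothetical business rule, specified in exactly one place."""
    return row["qty"] * row["price"] - row["discount"]


def load_warehouse(rows):
    """Apply the source definition and business rules once."""
    return [dict(row, value=sale_value(row)) for row in rows]


def build_mart(warehouse, dimension):
    """Derive a summary data mart from the warehouse for one dimension."""
    totals = defaultdict(float)
    for row in warehouse:
        totals[row[dimension]] += row["value"]
    return dict(totals)


warehouse = load_warehouse(SOURCE_ROWS)
marts = {dim: build_mart(warehouse, dim)
         for dim in ("customer", "product", "territory")}
```

Because every mart is derived from the same loaded rows, a change to `sale_value` is made in one place and all marts stay consistent with each other: their grand totals always agree.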

Another concern in this area is the use of proprietary "closed" data marts that can only be used with their integrated presentation tools. For example, there are many products that provide both a data mart build or loading facility (often misrepresented as a data warehouse), as well as integrated presentation tools. However, often these presentation tools can only be used on their own data mart and conversely, the data mart can only be accessed via their own presentation tools. This frequently leads to the need to implement multiple data marts when multiple presentation tools are required or preferred, and this is probably the worst possible position in which to find your organization.

A final major problem in this area is "How many versions of the truth are there?". Unless the data being used to present the information in different ways is coming from the one source, i.e. from a data warehouse, the potential for multiple versions of the truth to occur is high. This means different people in your organization believing different facts, again a fairly disastrous position in which to find your organization.
 

DATA CAN BE LARGE, TIME CONSUMING TO PROCESS, RESOURCE HUNGRY AND EXPENSIVE ...

Business Intelligence Systems are driven by data. Usually, the more data you have, the better the information that can be produced.

However, it is not easy for desktop personal computers, those most often used for the visualization of data, to store and process large volumes of data. They simply don't have the storage capacity or the processing power to handle the volume of transaction data usually involved in a business. Large volumes of data can quickly consume, and render useless, Business Intelligence Systems that are not designed to handle such volumes.

This is the main reason why most presentation tools only use summary data or data marts to process and present the required output, and why most rely on a "server" machine to provide the storage capacity and processing power needed to manage the data warehouse environment, which includes the data marts.

This data warehouse environment usually requires a great deal of technical expertise to set up, and then to maintain on an ongoing basis, unless a flexible and dynamic data warehouse management system is implemented. Such a system should include the ability to create and maintain the data warehouse from the source applications, as well as the ability to create and maintain the data marts for the desktop visualization tools, without the need to write computer programs. It should also include the ability to facilitate changes in the source systems' data (e.g. when you install new or upgraded business software) as well as changes or additions to business “rules” (e.g. net sales value equals ...), business “dimensions” (e.g. territories, product groups) and business “measurements” (e.g. quantity, value, cost), again without writing or changing computer programs.

Business Intelligence Systems by their very nature are highly volatile and need to be flexible, dynamic and responsive. Traditional computer software development tools generally do not provide the degree of flexibility and rapid application development capability usually required to create and maintain these systems. Using traditional computer programming techniques therefore generally leads to failure in the Business Intelligence Systems world. It becomes too expensive, too labor intensive and too slow to deliver the fast changes required for the information needs of the business. This is why a purpose built data warehouse management tool, one that requires little or no programming by your technical Information Systems staff, is a critical success factor in implementing Business Intelligence Systems.

From a hardware technology point of view there are specialized computers, often referred to as "servers", that are far better equipped to handle the data warehouse environment at a significantly lower cost than transaction processing computers. The use of these specialized server machines for Business Intelligence Systems can significantly improve your total computer throughput in both the data warehouse environment and your operational environment and therefore reduce the overall cost of computing in your business.

Therefore, the use of the right software and hardware technologies is a critical consideration when implementing Business Intelligence Systems. Failure to invest in these technologies will ultimately lead to failure.
 

LACK OF INFORMATION CAN CAUSE LOSS OF COMPETITIVENESS AND MISSED OPPORTUNITIES ...

 

Computers have promised great benefits to business for many years but have not fully delivered on that promise.

Without computers handling business transactions we certainly would not be able to deliver the customer service levels we do today, and we would have a higher cost structure to support the transaction processing needs of the business.

However, the data that results from these business transactions is probably the most under-utilized asset in any business. It not only tells us about profitability, sales performance, product demand and other valuable insights into the business, it can also tell us a lot about customer and product buying habits and trends. In other words it can not only answer questions we ask of it, but it can also pose questions we never thought to ask.

Apart from the ever improving information presentation tools that are now available to us, the most exciting development in Business Intelligence Systems in recent years is the ability to ask computers to tell us things about the data we don't know or didn't think to ask about. This is a technology referred to as "data mining" and promises to deliver the quantum leap needed to fully exploit the data asset we all have as an automatic by-product of our transaction processing systems.

Imagine the value of finding out things about your customers and products that could help you add a further margin to your sales performance, reduce your marketing expenses without reducing sales, or help you gain a further competitive edge in the market place. Imagine the value of starting to understand why and when your customers buy your products and services and, just as importantly, why and when they don’t.

Data warehouses provide the ability to analyze what is happening in a business. However in many businesses it is no longer good enough to know “what”. It is now also critical to understand “why and when” things happen. Business Intelligence Systems based on a well designed data warehouse can start to help you understand the why and when.

Whatever the requirements are, Business Intelligence Systems remain a key strategy to gain and maintain a competitive edge in today's market place. Whether it involves being able to predict when a customer will need a product, when a customer or product is not performing, or what opportunities there are for taking a better margin, Business Intelligence Systems are an essential component of any successful business.

The implementation of a successful Business Intelligence System relies heavily on the implementation of a flexible and dynamic data warehouse. Without a flexible data warehouse management tool, Business Intelligence Systems will take longer and cost more to develop and will open up the risk of not delivering to your organization the competitive advantage it needs to survive and prosper in today's business world.

It is no longer a matter of “Can I afford to do it?” but rather “Can I afford not to?”.
 

THERE IS A FLEXIBLE AND DYNAMIC DATA WAREHOUSE TO HELP YOU REMAIN COMPETITIVE ...

 

RODIN Data Warehousing was designed to take away many of the repetitive and high activity needs of installing a data warehouse, particularly those usually requiring a high level of technical expertise that is better utilized in addressing the operational needs of the business.

RODIN Data Warehousing allows you to take data from any source and store it for any length of time in any form you require. Once stored in the RODIN Data Warehouse you can then define the output you require quickly and easily and without the need for technical assistance. This allows you to produce hard copy reports, interactive inquiries, and/or data marts (i.e. summary files) for input into your Business Intelligence Systems.

Unlike many other data warehouse and data mart solutions however, RODIN Data Warehousing is an "open" data warehouse. You can directly access the data warehouse and the data marts it can create using literally thousands of standard query language tools, EIS tools and DSS tools, as well as specialized or purpose written programs if required. It uses a standard open database technology that can be accessed from just about any software product, tool or computer available.

Easy Access to Data ...

Rather than addressing data by its computer structures such as files and fields and key relationships, you access data in RODIN by high level definitions. For example, you access Net Sales Value instead of selecting the SALESORD file and then building up an expression of fields like LIQTY x PRICE - LIDISC. Using RODIN, all the complexity of the underlying database is removed from the end-users so they only deal with easy to understand business entities.

A further example of the ease of access is that users don't even need to join files or tables together to get customer names or product descriptions. RODIN does that for you automatically.
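The general idea of addressing data by business name rather than by file and field can be sketched as follows. This is not RODIN's implementation, which the text does not describe. Only SALESORD, LIQTY, PRICE and LIDISC come from the example above; the CUSTNO and PRODNO keys, the lookup tables, the sample values and the `query` helper are hypothetical.

```python
# Hypothetical physical data. SALESORD, LIQTY, PRICE and LIDISC are the
# names used in the text; every other name and value is invented.
SALESORD = [
    {"CUSTNO": 1, "PRODNO": 10, "LIQTY": 3, "PRICE": 20.0, "LIDISC": 5.0},
    {"CUSTNO": 2, "PRODNO": 11, "LIQTY": 1, "PRICE": 50.0, "LIDISC": 0.0},
]
CUSTOMER = {1: "Acme Ltd", 2: "Bolt Pty"}
PRODUCT = {10: "Widget", 11: "Gadget"}

# Business definitions: each high level name hides a field expression
# or a lookup (join) into another table.
DEFINITIONS = {
    "Net Sales Value": lambda r: r["LIQTY"] * r["PRICE"] - r["LIDISC"],
    "Customer Name": lambda r: CUSTOMER[r["CUSTNO"]],
    "Product Description": lambda r: PRODUCT[r["PRODNO"]],
}


def query(*names):
    """Return the requested business entities for every sales order line."""
    return [{name: DEFINITIONS[name](row) for name in names}
            for row in SALESORD]


result = query("Customer Name", "Product Description", "Net Sales Value")
```

The caller asks only for "Customer Name" or "Net Sales Value"; the field arithmetic and the joins to the customer and product tables stay inside the definitions.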

Large Capacity ...

RODIN has the capacity to rapidly store and process very large amounts of data, in the thousands of gigabytes range (i.e. terabytes). In excess of 15 million business transactions can be loaded into and extracted from the data warehouse every hour on a single server machine. Up to 32 separate server machines can be joined together to take this figure above 500 million per hour, offering unparalleled scalability.

Furthermore, the definition of business dimensions, measures, the level of detail (i.e. the “granularity”) and the time retention of the data to be stored are all completely user definable.
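The throughput figures quoted above can be sanity-checked with simple arithmetic. The sketch below assumes perfectly linear scaling across servers, which the text does not state and which real systems rarely achieve; the variable names are hypothetical.

```python
import math

# Figures quoted in the text.
SINGLE_SERVER_RATE = 15_000_000   # transactions per hour ("in excess of")
MAX_SERVERS = 32
QUOTED_COMBINED_RATE = 500_000_000

# Assumption: throughput scales linearly with the number of servers.
linear_total = SINGLE_SERVER_RATE * MAX_SERVERS

# Per-server rate needed for 32 servers to reach the quoted combined figure.
required_per_server = math.ceil(QUOTED_COMBINED_RATE / MAX_SERVERS)

print(f"32 servers at exactly 15 million/hour: {linear_total:,}")
print(f"Per-server rate needed for 500 million/hour: {required_per_server:,}")
```

Under that assumption, 32 servers at exactly 15 million per hour give 480 million per hour, so the figure of more than 500 million implies each server sustaining roughly 15.6 million per hour, which is consistent with "in excess of 15 million".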

Flexible Output ...

RODIN allows you to design your output requirements using a highly intuitive graphical user interface. The output design can then be processed against the data warehouse to produce hard copy reports, interactive inquiries, and/or data marts for input to Business Intelligence Systems.

A single click of the mouse is all that is required to direct a RODIN report to a desktop spreadsheet, or an EIS or DSS application. What could be easier?

Use Any Desktop Presentation Tools ...

Rather than paying for and learning new spreadsheet, EIS or DSS tools, RODIN allows you to direct output to your existing or preferred desktop systems. This means your existing investment in these tools is fully protected and you are free to choose others in the future, as and when the need arises, without having to re-implement or re-design the data warehouse to suit the visualization tools you use.

Full Integration ...

Because RODIN can extract data from any source, store it centrally, and process it to produce output in any form you require, it provides the integration required between your transaction processing systems and your information systems.

It gives you the power to be creative in transforming your business data into vital, timely and relevant information so critical to being competitive in today’s business world.

Low Technical Involvement ...

Setting up a data warehouse can usually be done by your own technical staff. However it will require a large amount of their time unless they use automated tools. While installing RODIN Data Warehousing will involve their input in the design and set-up phase, it will not involve them writing programs to store and manage the data in the warehouse, a very time consuming and error-prone part of the implementation process.

Once the data warehouse is designed and implemented, RODIN is designed to be used by end-user staff who know what they want but who don't have the necessary technical expertise to get it.

Now departmental staff can assist and relieve your overworked technical staff of the burden of writing computer programs to extract and analyze data. More importantly they can get the information the way they want it, when they want it.

Advanced Server Technology ...

RODIN Data Warehousing uses the very latest and most advanced technology available today: the IBM AS/400 Advanced Server.

This open architecture, designed specifically for information processing systems, is considered by most leading independent consultants to be highly competitive and appropriate for such applications.

With its traditional strengths - namely low cost of ownership, ease of use and fully integrated operations, combined with the high performance, scalability and openness of the Advanced Server models - the IBM AS/400, running RODIN Data Warehousing, delivers the best possible solution to implement the Business Intelligence Systems you need today and will continue to need in the future.