The end of relational database management systems

August 26, 2009

This short, quick-to-read article at ACM offers a good list of arguments against the relational DBMS world most of us are used to living in.

For the world of data warehousing, the predicted solution is column-based databases: still databases, but no longer relational. But how do such column-based solutions work? What is the linking element? IMHO there always needs to be an identifying key and the associated value within each column, all of it optimized, compressed, and stored in a bitmap-like index. Additionally, you need information about how to join the columns back together. Do you do it with the original keys? With some sort of row ID pointing to the right line? Any other solutions? Ask Google!
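To make the row ID idea concrete, here is a minimal sketch, assuming each column is stored separately as a dictionary of distinct values plus one small code per row, with the row position acting as the implicit key that links the columns. The names `Column`, `build_store`, and `reconstruct` are made up for illustration; this is not how C-Store or any particular product is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One dictionary-encoded column: distinct values plus one code per row."""
    dictionary: list = field(default_factory=list)  # distinct values, in order of first appearance
    codes: list = field(default_factory=list)       # row position -> index into dictionary

    def append(self, value):
        if value not in self.dictionary:
            self.dictionary.append(value)
        self.codes.append(self.dictionary.index(value))

    def value_at(self, row_id):
        return self.dictionary[self.codes[row_id]]

    def rows_where(self, value):
        """Return the row positions holding `value` (a stand-in for a bitmap lookup)."""
        if value not in self.dictionary:
            return []
        code = self.dictionary.index(value)
        return [row_id for row_id, c in enumerate(self.codes) if c == code]


def build_store(rows, column_names):
    """Split row-wise records into one Column per attribute."""
    store = {name: Column() for name in column_names}
    for row in rows:
        for name in column_names:
            store[name].append(row[name])
    return store


def reconstruct(store, row_id):
    """Join the columns back together using the shared row position."""
    return {name: column.value_at(row_id) for name, column in store.items()}


rows = [
    {"country": "DE", "product": "A", "revenue": 100},
    {"country": "FR", "product": "A", "revenue": 80},
    {"country": "DE", "product": "B", "revenue": 120},
]
store = build_store(rows, ["country", "product", "revenue"])

# Filter on one column, then fetch another column by row position only.
german_rows = store["country"].rows_where("DE")
german_revenue = sum(store["revenue"].value_at(r) for r in german_rows)
```

In this sketch the query touches only the two columns it needs, and the original keys never have to be stored alongside each value, which is one possible answer to the joining question.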

For me it all seems like old wine in new skins. In the end one needs a high-performing machine that does good hash calculation to build up and resolve indices. In-memory technology could spice up the query life. Setting up a data vault model is something in between, and I suppose a good way to have a sound, queryable database on the one hand and good ad hoc performance on the other.

Suggested further reading on column-based databases: Michael Stonebraker et al., “C-Store: A Column-oriented DBMS,” Proc 2005 VLDB Conference, Trondheim, Norway, Sept. 2005 (pdf).


twitter? blogs? in companies?

August 19, 2009

Recently I dived into the realm of Twitter, and guess what, I’m still trapped there. On the one hand it is fun to follow people, especially friends, in their actions, ideas, or hints, and to keep them updated in a quick and informal manner about what is happening in my life. On the other hand I see some potential for everyone’s professional life, and that is what I would like to talk about now. At first I thought about explaining the usage of Twitter on a personal level, but other sources do this better; ask Google. But, …

Establish a Twitter server within your company (it would be nice to have this available; I am not sure it is), in the DMZ if you like. Set up a WordPress server too. Imagine yourself as part of the BI team of your company. You may set up different blogs on Data Quality, Master Data Management, Enterprise Data Warehousing, or SOA activities. Create some Twitter users and link the two tools. Let your users from the business departments follow various channels. It’s up to them to decide what is of interest to them.

Useful tweets could be about the general system status or major malfunctions. More specialized tweets could be about the status of precalculation engines, or the overall situation of your entrance layer / operational data store. Update your users about freshly released queries and add them as a tiny URL to the tweet. What about KQI, your key quality indicators? And finally you could enrich your internal news stream with external information. The intranet pages will still be of some use, but as a news stream Twitter could be more focused and user-oriented.


query definition without BEx

August 11, 2009

Via twitter.com/sapbwtweet and SDN I came across a tool, tested but not yet rated, that gives faster insight into your query definition without using BEx and facing long waiting times.

  • Enter transaction se38
  • Execute program RSRQ_QUERYDEFINITION
  • Enter a query name (the selection dialog known from BEx)
  • Select or deselect some options
  • Execute
  • Analyze your query

In my opinion the structure of a query and all its elements are displayed in a transparent way without missing details. This program could be used to analyze a query regarding filter values and the usage of variables much more quickly than by viewing the actual definition in the BEx front end.


ny times on statistics

August 8, 2009

Thanks to Twitter I came across an interesting article in the online edition of the New York Times (here). It is about the problem of analyzing huge amounts of data and detecting patterns within them. Big companies are hiring more and more statisticians to develop number-crunching methods. The acquisition of SPSS by IBM underlines this tendency. Go for analysis, but do not forget reporting!


farewell scrum, welcome … what?

July 31, 2009

Weeks ago we (that is, my team and I) stopped what I had labeled scrum. The old postings about this experiment can be found here. Today I would like to reflect on the latest stage and what comes next.

Figure 1: Last Scrum Board impression

Figure 1 shows the last impression of our board. Most things are done with the release, some things are left. In the upper left corner one can get an idea of the new usage of this expensive whiteboard. I started nearly a year ago with the idea of utilizing different aspects of scrum, which were:

  • Daily stand-up meeting
  • Smaller iterations compared to our local release cycle of 2 months
  • Tracking “things to be done” on a Board
  • Use a prioritized Product Backlog owned by the Business Department (and me)
  • Review the release

My motivation was a) building a team with a real spirit of cooperation that implements solutions together and b) speeding up development, in short: do more in less time. Things worked fine. IMHO we all performed much better during this time, we implemented more than originally planned per iteration, and the dialogue between the Business Department (BD) and me was better than ever before. The backlog was used by both parties to negotiate sprint content etc. But we failed. Why?

First of all, I was not able to split “requests” from BD into pieces small enough for visual tracking to be useful. We committed ourselves internally to a 3-week sprint duration. The typical duration of a task was 5 days. In addition, there is the need to solve incidents and the like. So each developer is only able to implement 2 tasks per sprint. Given a team of 5 developers, we had 10 story cards on the board, which remained for a long time in the “In development” slot and for a relatively short time in the “Test” slot. We met every day to talk about the same one card. This got boring over time. Then we experimented with a planning round, which was a total flop. We only did it once. What else? Nothing. What will change? A lot.
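The sprint arithmetic above can be checked in a few lines. This is only a sketch: the 5 working days per sprint lost to incidents is an assumption chosen so that the result matches the 2 tasks per developer mentioned in the post, which does not state an exact figure. The function name `cards_per_sprint` is made up as well.

```python
def cards_per_sprint(sprint_weeks, task_days, incident_days, developers,
                     workdays_per_week=5):
    """Return (tasks per developer, story cards on the board) for one sprint."""
    # Working days left for planned tasks after incidents are subtracted.
    available_days = sprint_weeks * workdays_per_week - incident_days
    tasks_per_developer = available_days // task_days
    return tasks_per_developer, tasks_per_developer * developers


# 3-week sprint, 5-day tasks, 5 developers; incident_days=5 is an assumption.
tasks_per_developer, cards_on_board = cards_per_sprint(
    sprint_weeks=3, task_days=5, incident_days=5, developers=5
)
```

With only about ten cards moving through three weeks, a daily stand-up has very little board movement to talk about, which fits the experience described above.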

The board will still be used as a so-called information radiator, but with a different approach: it is now there to sketch ideas for solutions, to visualize thoughts, and to outline architectures, data models, relations etc. I’ll move back to a weekly status meeting. This is then dedicated to talking about major issues, but also solutions, or topics from the other teams. Barriers are resolved directly within the team. The Product Backlog will remain, because it is working too well to drop. I’ll also try to have something new in production every 3 weeks, just to offer good service to BD. But I’ll drop the fixed limit and move towards a more continuous enhancement of the productive environment. I’ll do the release planning, which is necessary in the department, and also the estimation of the single entries in the backlog on my own. Assignment of tasks to individual developers is also done directly by me. So this is what changes most: the controlling.

Main lesson learned: It is about the people! In this case about the team members, the developers. They need specific characteristics: some self-motivation, the drive to pick the next task on their own, enough confidence in their own decisions, the willingness to commit to what they say, etc. Agile methods need specific people. Without them all the tools are nice; with them the tools produce added value.


TPS

July 31, 2009

Thanks to a posting at codemonkeyism.com I bought Taiichi Ohno’s book about the Toyota Production System. Originally published in 1978, it explains the way he changed the production system at Toyota into what is now known as The Toyota Way. Concepts like Just-In-Time (JIT) and kanban are explained.

With this book I try to clarify for myself whether Data Warehousing is production or product development (TPDS is the Toyota acronym for this). Am I part of the software development guild or not?


Data modelling

July 19, 2009

Recently I came to the conclusion that data modelling is becoming important again. I do not mean the technical aspects, but data modelling as a useful part of documentation or request management. As not every company or person is able to hold ERwin licences, I would like to introduce some of the free tools. The main source for this is Rajan Chandras’s postings about data modelling on Intelligent Enterprise.

Additionally, there is a heise ticker posting about a modelling tool released by Oracle that was accidentally labelled as free. The licence costs about $3000, with an annual fee of $660.

MySQL Workbench

The community, and thus free, version of this tool can be downloaded at the MySQL site. A few features are only available with the standard edition (I assume that this is what SE stands for):

  • Printing / PDF
  • Reverse & Forward engineering

The first one is bad; the second one is IMHO not so important when working in Data Warehouse environments, where dropping and recreating data models is not so common. The tool itself is oriented towards the physical data model. There is no way to set up a real ER diagram and re-model it into a physical representation. If you try to define an n:m relation between, say, Business Partners and Accounts (see fig. 1), a relation table is created automatically (see fig. 2).

Figure 1: Single entities

Figure 2: n:m relation added

The third entity could be added easily by defining a new 1:n identifying relation between BP_has_Accounts and the roles. Nothing new.
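What the tool does with the n:m relation corresponds to a plain junction table. Here is a minimal sketch using Python's built-in `sqlite3` module instead of MySQL; only the table name BP_has_Accounts comes from the post, while the column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BusinessPartner (
    bp_id INTEGER PRIMARY KEY,
    name  TEXT NOT NULL
);
CREATE TABLE Account (
    account_id INTEGER PRIMARY KEY,
    label      TEXT NOT NULL
);
-- The relation table that resolves the n:m relation into two 1:n relations.
CREATE TABLE BP_has_Accounts (
    bp_id      INTEGER NOT NULL REFERENCES BusinessPartner(bp_id),
    account_id INTEGER NOT NULL REFERENCES Account(account_id),
    PRIMARY KEY (bp_id, account_id)
);
""")

conn.executemany("INSERT INTO BusinessPartner VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO Account VALUES (?, ?)",
                 [(10, "ACC-10"), (20, "ACC-20")])
# Alice holds both accounts; account 20 is shared with Bob.
conn.executemany("INSERT INTO BP_has_Accounts VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 20)])

holders = conn.execute("""
    SELECT a.label, bp.name
    FROM BP_has_Accounts AS link
    JOIN Account AS a ON a.account_id = link.account_id
    JOIN BusinessPartner AS bp ON bp.bp_id = link.bp_id
    ORDER BY a.account_id, bp.bp_id
""").fetchall()
```

A role entity would hang off BP_has_Accounts in the same way, as a further table whose key includes the (bp_id, account_id) pair.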

The UI is clear and well structured. What may help is the first tab, which contains a sort of catalog of available documents, schemata and scripts. As a free tool it could help IT people to put structure into ideas, or to visualise future cube designs and options for drill paths and other navigation. Due to its technical focus and the missing layer of abstraction / business perspective, it is not useful for documentation.


Visualization

July 12, 2009

Embedded in this posting you find a reference to Röyksopp’s Remind Me video.

Besides the fact that I like this kind of music and, from time to time, the pixel style of the video, it shows some great ways to visualize KPIs (2:30 – 2:50, 3:25 – 3:40, and 3:50 at least). The point is not that pie charts or bars are used, but that the charts are embedded in a more explanatory context.

I do not know if this is a possible future or will lead to more acceptance. The fact is that this high-end visualization runs 100% against the tendency of setting up self-sufficient information consumers in the business departments. But BI may reach beyond company-internal departments, and thus such a representation of key figures could be useful.

The video itself was produced by H5, a France-based company. On their web site one will find other, IMHO outstanding, videos.

(via: de-bug)


Agile Warehousing – Automated testing

July 10, 2009

Today we had an internal presentation by one of our trainees, who implemented an idea of mine as a prototype: automated testing in SAP BW. The original idea is the following: the system used for developer tests shall provide additional functionality to flip a switch and let the system perform tests automatically. A green light shall indicate that a specific solution is still working, so that application development can go ahead.
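The switch and green light idea can be sketched independently of SAP BW. The following is a minimal illustration in Python, not the trainee's prototype: `QueryCheck`, `run_checks`, `green_light`, and the fake query runner with its query names and values are all hypothetical stand-ins for whatever actually executes a query in the test system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QueryCheck:
    """One automated test: a query and the result it is expected to return."""
    name: str
    query: str
    expected: Dict[str, float]


def run_checks(checks: List[QueryCheck],
               run_query: Callable[[str], Dict[str, float]],
               enabled: bool = True) -> Dict[str, bool]:
    """Run all checks if the switch is on; return pass/fail per check."""
    if not enabled:
        return {}
    return {check.name: run_query(check.query) == check.expected
            for check in checks}


def green_light(results: Dict[str, bool]) -> bool:
    """Green only if at least one check ran and every check passed."""
    return bool(results) and all(results.values())


def fake_run_query(query: str) -> Dict[str, float]:
    """Hypothetical stand-in for executing a query in the test system."""
    data = {
        "REVENUE_BY_COUNTRY": {"DE": 220.0, "FR": 80.0},
        "OPEN_ORDERS": {"DE": 3.0},
    }
    return data[query]


checks = [
    QueryCheck("revenue totals", "REVENUE_BY_COUNTRY", {"DE": 220.0, "FR": 80.0}),
    QueryCheck("open orders", "OPEN_ORDERS", {"DE": 4.0}),  # deliberately outdated expectation
]
results = run_checks(checks, fake_run_query)
light_is_green = green_light(results)
```

In this sketch the second check fails, so the light stays red and the per-check results show which solution needs attention.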


SAP BW and Spatial analysis

June 25, 2009

What I learned yesterday and this morning is how to use maps or other shape-based graphics to assign KPIs to spatial information like countries, regions, or places in my factory. To do so one needs shape information, which is assigned to special InfoObjects with geographical features (maintained in rsd1). In BEx Web Application Designer there is a Map web item which displays such data on maps. I could post details on demand. What I want to let you know is that there can be several layers in one map. Each layer is assigned exactly one DataProvider, and all KPIs from that DataProvider are displayed. So if there is a need to display KPIs in different manners (like coloured shapes for some and pie charts for others), you need to do the following:

  1. Set up the query containing all relevant KPIs (this may already have been done)
  2. Create different query views, each containing one group of KPIs of interest
  3. Create a DataProvider for each query view
  4. Create a map layer for each query view, assign the DataProvider, and be creative with all the parameters
  5. Save and execute the Web Template
  6. Be happy!
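The relationship behind these steps (one query, several query views, one DataProvider per view, one map layer per DataProvider) can be modelled in a few lines. This is a conceptual sketch in Python only; the class names, the generated names such as `VIEW_1` and `DP_1`, and the renderer labels are invented and are not SAP identifiers or Web Application Designer parameters.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class QueryView:
    name: str
    key_figures: Tuple[str, ...]


@dataclass(frozen=True)
class DataProvider:
    name: str
    view: QueryView


@dataclass(frozen=True)
class MapLayer:
    renderer: str           # e.g. coloured shapes or pie charts
    provider: DataProvider  # exactly one DataProvider per layer

    def displayed_key_figures(self) -> Tuple[str, ...]:
        # A layer shows all key figures of its DataProvider.
        return self.provider.view.key_figures


def build_layers(query_key_figures: Sequence[str],
                 groups: Sequence[Tuple[str, Sequence[str]]]) -> List[MapLayer]:
    """Create one query view, DataProvider and layer per (renderer, KPI group)."""
    layers = []
    for index, (renderer, names) in enumerate(groups, start=1):
        unknown = set(names) - set(query_key_figures)
        if unknown:
            raise ValueError(f"not part of the query: {sorted(unknown)}")
        view = QueryView(f"VIEW_{index}", tuple(names))
        provider = DataProvider(f"DP_{index}", view)
        layers.append(MapLayer(renderer, provider))
    return layers


query = ["revenue", "margin", "headcount"]
layers = build_layers(query, [
    ("coloured_shapes", ["revenue"]),
    ("pie_chart", ["margin", "headcount"]),
])
```

Splitting the KPIs by query view is what keeps each layer from drawing every key figure of the query in the same style.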