I use Calc, the spreadsheet application in the OpenOffice suite, extensively in my daily work to analyze data subsets in data warehousing projects, and it has proved very helpful. The drawback is that you have to take the application as it is; there is little point in complaining. But now I’m facing a stability and reliability issue that almost drove me nuts: an unstoppable recovery process.
If OpenOffice recognizes a document as damaged, it shows a dialog offering to recover it. If you agree, the recovery behaves nicely only some of the time.
If you are unlucky, the process runs forever with no apparent way to stop it. That is what happened to me, and even restarting my system several times did not help. Then I found this helpful page by googling, and it relieved me of the problem.
If you have the same issue, try this:
- If you are on Windows, delete the file Documents and settings/username/Application data/Openoffice.org2/user/registry/data/org/openoffice/Office/Recovery.xcu.
- If you are a Linux user, delete the file /home/username/.openoffice.org2/user/registry/data/org/openoffice/Office/Recovery.xcu.
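If you would rather not hunt for the file by hand, the two steps above can be sketched as a small Python script. This is only a sketch under a few assumptions: the profile folder sits directly under your home directory as in the paths above, the function names `recovery_file` and `disable_recovery` are my own, and the file is renamed to `Recovery.xcu.bak` instead of deleted so that you can put it back if needed.

```python
import sys
from pathlib import Path
from typing import Optional

# Path below the per-user OpenOffice.org 2 profile, as given in the steps above.
RELATIVE = Path("user/registry/data/org/openoffice/Office/Recovery.xcu")


def recovery_file(platform: str = sys.platform, home: Optional[Path] = None) -> Path:
    """Return the expected location of Recovery.xcu for a user."""
    home = home or Path.home()
    if platform.startswith("win"):
        # Assumption: the profile lives under "Application data" in the user folder.
        return home / "Application data" / "Openoffice.org2" / RELATIVE
    return home / ".openoffice.org2" / RELATIVE


def disable_recovery(path: Path) -> bool:
    """Rename Recovery.xcu to Recovery.xcu.bak; return False if it is absent."""
    if not path.is_file():
        return False
    path.replace(path.with_name(path.name + ".bak"))
    return True


if __name__ == "__main__":
    target = recovery_file()
    print("renamed" if disable_recovery(target) else "not found", target)
```

Close OpenOffice before running it, then start the application again.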
Till next article…
P05,Irian Jaya Barat
P15,Kepulauan Bangka Belitung
P20,Nusa Tenggara Barat
P21,Nusa Tenggara Timur
P33,Nanggroe Aceh Darussalam
Now here are the steps to create a transformation that handles the CSV file:
- You can create a transformation in several ways:
- A new Transformation 1 workspace tab will show up. Rename it by saving the newly created transformation: press CTRL + S and save the file as c:\contoh_kettle\baca_propinsi.ktr.
- Notice that our tab is now named baca_propinsi.
- A transformation consists of several steps. Now we will drop a step visually into our workspace to read the content of our C:\contoh_kettle\propinsi.csv file.
- In the left panel open the Core Objects | Input category. Here you will find steps for reading input from several file formats.
- For our needs, we will use the CSV file input step.
- Click on the CSV file input icon and drag it into our workspace.
- Double click on the [CSV file input] step to open its dialog.
- In the Filename section fill in c:\contoh_kettle\propinsi.csv. Leave the other settings as they are.
- Click on Get Fields to retrieve the fields. Click OK and Close in the other dialogs that pop up.
- You will see a configuration with values like those in the picture below. Click OK to return to our workspace.
- Save our transformation file.
- We can preview several records read by our step using the preview facility in Spoon.
- To demonstrate this, click on [CSV file input], click the Preview icon in the toolbar, then click the Quick Launch button that shows up.
- In seconds you will get the Examine Preview Data dialog showing a number of records. Close it for now by clicking the Close button.
- From the left panel open Core Objects | Scripting and drag the Modified Java Script Value step into your transformation workspace.
- Hold the CTRL key, click on both the [CSV File Input] and [Modified Java Script Value] steps, then right click and choose New Hop. Click OK on the dialog that shows up.
- We have just created a hop that bridges the two steps we created before.
- One function of the Modified Java Script Value step is to change our data programmatically using several built-in operators and functions. If you know Java very well, you can also embed Java code in this step. Double click on the step and type the following code in the editor that appears.
- Make sure that Compatibility mode? is unchecked.
- Click on the Get variables button.
- You will see a dialog like the one below.
- Click OK.
- Evaluate this step by previewing data on it.
- Now we will dump the result from [Modified Java Script Value] to a text file, C:\contoh_kettle\propinsi.txt.
- Again, from the left panel open Core Objects | Output and drag the Text file output step into the workspace.
- Join [Modified Java Script Value] and [Text file output] with a hop.
- Double click on [Text file output].
- In the Text file output dialog that pops up, click on the File tab and type C:\contoh_kettle\propinsi in the Filename section.
- Still in the dialog, click on the Fields tab and click the Get Fields button to have 4 fields show up (Kode_Propinsi, Deskripsi, Deskripsi_lengkap, Text file output).
- Click OK.
- Now our transformation achieves our goals: read a CSV text file, change some values and put them into another 2 fields, and save the combined fields into a new CSV text file.
- Run the transformation by clicking the Run button on the toolbar.
- Click Launch on the dialog.
- You will be redirected to a log workspace with detailed information on the running steps, for example how many rows are read and written in each step. I will not go further into the parts of this workspace, but please notice the bottom panel, which shows detailed log output. You can see in the last lines that our transformation has been executed successfully.
- Now take a look at our C:\contoh_kettle folder: there is 1 more file there, a newly created propinsi.txt. Open the file with your favorite text editor and see how it differs from the original file.
P01;Bali ;Propinsi Bali;00000000000001.00
P02;Bengkulu ;Propinsi Bengkulu;00000000000002.00
P03;Banten ;Propinsi Banten;00000000000003.00
P04;Gorontalo ;Propinsi Gorontalo;00000000000004.00
P05;Irian Jaya Barat ;Propinsi Irian Jaya Barat;00000000000005.00
P06;Papua ;Propinsi Papua;00000000000006.00
P07;Jambi ;Propinsi Jambi;00000000000007.00
P08;Jawa Barat ;Propinsi Jawa Barat;00000000000008.00
P09;Jawa Tengah ;Propinsi Jawa Tengah;00000000000009.00
P10;Jawa Timur ;Propinsi Jawa Timur;00000000000010.00
P11;Kalimantan Barat ;Propinsi Kalimantan Barat;00000000000011.00
P12;Kalimantan Tengah ;Propinsi Kalimantan Tengah;00000000000012.00
P13;Kalimantan Timur ;Propinsi Kalimantan Timur;00000000000013.00
P14;Kalimantan Selatan ;Propinsi Kalimantan Selatan;00000000000014.00
P15;Kepulauan Bangka Belitung;Propinsi Kepulauan Bangka Belitung;00000000000015.00
P16;Kepulauan Riau ;Propinsi Kepulauan Riau;00000000000016.00
P17;Lampung ;Propinsi Lampung;00000000000017.00
P18;Maluku ;Propinsi Maluku;00000000000018.00
P19;Maluku Utara ;Propinsi Maluku Utara;00000000000019.00
P20;Nusa Tenggara Barat ;Propinsi Nusa Tenggara Barat;00000000000020.00
P21;Nusa Tenggara Timur ;Propinsi Nusa Tenggara Timur;00000000000021.00
P22;Riau ;Propinsi Riau;00000000000022.00
P23;Sulawesi Barat ;Propinsi Sulawesi Barat;00000000000023.00
P24;Sulawesi Tengah ;Propinsi Sulawesi Tengah;00000000000024.00
P25;Sulawesi Tenggara ;Propinsi Sulawesi Tenggara;00000000000025.00
P26;Sulawesi Selatan ;Propinsi Sulawesi Selatan;00000000000026.00
P27;Sulawesi Utara ;Propinsi Sulawesi Utara;00000000000027.00
P28;Sumatra Barat ;Propinsi Sumatra Barat;00000000000028.00
P29;Sumatra Selatan ;Propinsi Sumatra Selatan;00000000000029.00
P30;Sumatra Utara ;Propinsi Sumatra Utara;00000000000030.00
P31;DI Yogyakarta ;Propinsi DI Yogyakarta;00000000000031.00
P32;DKI Jakarta ;Propinsi DKI Jakarta;00000000000032.00
P33;Nanggroe Aceh Darussalam ;Propinsi Nanggroe Aceh Darussalam;00000000000033.00
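To make the data flow easier to follow outside Spoon, here is a rough Python sketch of what the three steps do together. It is not the script used in the Modified Java Script Value step. It is inferred from the output listing above and assumes that Deskripsi_lengkap is "Propinsi " plus the description, that the last field is the numeric part of the code formatted with leading zeros, and that the input file has no header row. The fixed-width padding of Deskripsi is left out.

```python
import csv
import io

# Two rows in the same layout as propinsi.csv (code, description).
SAMPLE = "P05,Irian Jaya Barat\nP15,Kepulauan Bangka Belitung\n"


def transform(rows):
    """Add the two derived fields to every (Kode_Propinsi, Deskripsi) row."""
    for kode, deskripsi in rows:
        deskripsi_lengkap = "Propinsi " + deskripsi
        # Assumption: the fourth field is the number in the code, e.g. P05 -> 5.
        nomor = format(int(kode[1:]), "017.2f")
        yield [kode, deskripsi, deskripsi_lengkap, nomor]


def run(source, target):
    """Read comma-separated input and write semicolon-separated output."""
    reader = csv.reader(source)
    writer = csv.writer(target, delimiter=";", lineterminator="\n")
    writer.writerows(transform(reader))


if __name__ == "__main__":
    out = io.StringIO()
    run(io.StringIO(SAMPLE), out)
    print(out.getvalue(), end="")
```

Each function corresponds loosely to one step: `csv.reader` to CSV file input, `transform` to Modified Java Script Value, and `csv.writer` to Text file output.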
Congratulations, you have just created a simple transformation, designed and executed in Spoon. For more reading on transformations you can check the Pentaho wiki. We will also frequently update our samples in the Kettle section, so stay tuned.
If you have any questions about this article, feel free to drop us a note at firstname.lastname@example.org.
Last week at the BI forum I ran a quiz.
It was a light-hearted affair with specialist subject questions mixed with general knowledge.
The winning team overall was JEH, and the most OBIEE-skilled team was the Dream Team. Top prizes included some Brighton rock and a pencil!!
I had to dig around in the documentation to find some of the questions, and found a few tough ones, so I was delighted that everyone did pretty well. If you were in a team you may want to see the results below!
Here are the questions. See how many you can get.
I may post the answers later!
1. What can’t you ‘manage’ in the Admin tool
2. Who thinks Coding in UDML is fun
a) Andreas Nobbmann
b) Christian Berg
c) Mark Rittman
3. Which of these is NOT a view in OBIEE
c) Static Text
4. Which of these is NOT a table type in the physical layer.
a) Materialized View
b) Stored Procedure
c) Physical Table
5. What is this Icon used for?
a) See who has access to a page
b) See more details on a group
c) Add a column to a page
d) Add a user to a group
e) Add a page to a dashboard
6. What does the UDML statement “DECLARE ENTITY FOLDER” create
a) Presentation Catalog
b) Presentation Table
c) Physical Display Folder
d) Logical Display Folder
7. Which of the following is NOT valid UDML
a) DECLARE ATTRIBUTE
b) DECLARE FOLDER
c) DECLARE TABLE KEY
d) DECLARE COLUMN
8. Which of these is NOT a real utility
9. Which of the following is NOT a real setting in the privileges section of the web based admin
a) Analyze BI Publisher Reports
b) Access to RSS Feeds
c) Manage Privileges
d) Edit My Dashboard
e) Add/Edit Nested Request View
10. What year saw the release of Siebel Analytics 7.5
11. Which of these is NOT a real Oracle Function
12. Which of these is NOT a real Admin Tool Utility
a) Repository Documentation
b) Remove Unused Physical Objects
c) Aggressive Persistence Wizard
d) Oracle BI Event Tables
e) Rename Wizard
13. What setting do you add to instanceconfig.xml if you want to group dashboards together in a single hyperlink
14. Which SQL statement is valid
a) CREATE OR REPLACE FUNCTION AW3(vInt IN PLS_INTEGER)
b) CREATE OR REPLACE MATERIALIZED VIEW AW2 AS SELECT * FROM AW1
c) CREATE OR UPDATE VIEW AW4(COL1) AS SELECT COL1 FROM AW1
d) CREATE OR REPLACE TABLE AW1(COL NUMBER)
15. What was the name of the original company that developed OBIEE
16. Which of the following functions is NOT valid
17. Which function returns the first item in the list that the user has permission to see
18. True or False: To make a dashboard visible to one group only, that group needs to be added to the repository
19. Which well known blogger has his own brewery
b) Adrian Ward
c) Jeff McQuigg
d) John Minkjan
20. Which of these is a real professional LinkedIn group
a) Oracle Business Intelligence Group
b) Oracle BI Nerds Group
c) OBIEE Unlimited Group
d) OBIEE for Beginners Group
21. What is the maximum number of characters that display in a textbox
22. What is the default size parameter of the CHAR datatype in a Cast function
23. Which of these is NOT a real configuration file
24. Which of the following is a valid interval
25. In the instanceconfig.xml there is a tag called DSN. In which file does this tag look for more details?
26. True or False: To use LDAP based Authorisation, you have to import the groups into the catalog
27. Which of these is NOT related to Security
a) Usage Tracking
d) Data Restrictions
28. Which of these is NOT a real icon?
a) Happy Face
29. Using an embedded URL, how do you include a dashboard from another catalog without the top border (the one with the dashboard links in)
e) It just does it for you
30. At the BI Forum in 2009, who said, ‘Don’t try this at home kids’
a) Christian Berg
b) John Minkjan
31. Which of these is NOT a real OBIEE object
b) Text Box
c) Edit box
d) Speech Bubble
32. Which is the lowest logging level that will include the actual SQL sent to the database
33. Which of these is NOT a real function in OBIEE?
34. What Query Restrictions are NOT available on User Permission settings
a) Time Restriction
b) Max Rows
c) Max Minutes
d) Max tables
35. In which config file do you set the path for your java files that are called in an iBot?
a) instanceconfig.xml (in OracleBIData\web\config)
e) instanceconfig.xml (in the other place!)
36. Which of these is NOT a real Oracle Function
37. What command is used to embed a dashboard page into another web page?
38. Who won the America’s Cup in 2004
c) Larry Ellison
39. On February 25th, who from the list below had the highest number of points on the OBIEE forum on OTN
a) Christian Berg
b) Gerard Nico
c) Jon Mead
d) Goran O
e) Phil Henson
40. How many arguments does the OBIEE function INSERT have?
41. What is the standard Port number used in 10g for the javahost?
42. Which of the following is NOT a report link
43. Which ‘Operator’ is NOT available in a ‘Column Filter Prompt’
a) is less than
b) is in top
c) begins with
d) contains all
e) is LIKE (pattern match)
And the results….
Article source: http://www.biblogs.com/2010/05/25/the-2010-bi-forum-quiz/
In a previous part of my notes on Realtime Data Warehousing I mentioned some of the challenges of reducing latency. The piece picked up quite a few comments – to which I say thanks to all that posted responses. One of the comments from Matt Hosking mentioned some of the points I was to raise in this posting.
If you ask someone outside the development of a (near) realtime data warehouse what the greatest challenge will be, they would probably say “capturing the change”, since they know we can already “do” data warehouses. I think that is wrong: capturing change is easy; the big problem is applying that change in a timely fashion to a data warehouse that also remains available for query. Adding relatively few rows of new fact to a table is trivial compared to the actions needed to validate, transform, apply keys, index, and publish the fact; and then think about the impact of merging that new fact into existing aggregate tables or materialized views. A lot of moving parts, a lot of challenge.
Realistically, we could populate an “atomic data store style” layer in realtime with what is in effect a versioned (timestamped, journalized or however you term it) replica of the source. Such a replica is probably suited to realtime reporting, but what we don’t get are the features we have come to expect from a traditional star schema DW. We possibly miss out on: data validation through the ETL process, data enrichment and derived measures, conformed dimensions, and slowly changing dimensions (especially type 2 SCD) through surrogate keys. It may well be that you don’t actually need a star model; after all, one of the viable DW models for an Exadata warehouse is just that: a bunch of conventional tables joined on the natural business keys.
Another point to consider is that it is quite unlikely that all of the fact domains in a data warehouse need to be realtime ones; for example, data sourced from a supplier’s EDI feed may arrive far less frequently than, say, sales transactions from the company’s web-store. Obviously, if we have a realtime feed of sales, we must ensure we have all of the dimensional (reference) data loaded before a new transaction arrives, or else develop robust ways to handle this. This is a situation where we need business knowledge: if a new customer can be created at the time of purchase (as is often the case for a web sale) we will need a realtime customer feed along with the realtime sales feed, but for banks with strict money laundering regulations customers are registered well before transactions occur, so a timely load of customers is likely to be sufficient.
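One possible “robust way” of coping with a sale that arrives before its customer record is to create a placeholder dimension row on the fly and complete it when the customer feed catches up. The sketch below illustrates that idea only; it is not a description of any particular product, and the class `CustomerDim`, its method names and the `inferred` flag are all made up for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class CustomerDim:
    """In-memory stand-in for a customer dimension keyed by natural key."""
    rows: dict = field(default_factory=dict)
    _keys: count = field(default_factory=lambda: count(1))

    def lookup_or_infer(self, natural_key):
        """Return the surrogate key, creating a placeholder row if needed."""
        row = self.rows.get(natural_key)
        if row is None:
            # The fact arrived first: keep it loadable with an inferred member.
            row = {"sk": next(self._keys), "name": None, "inferred": True}
            self.rows[natural_key] = row
        return row["sk"]

    def upsert(self, natural_key, name):
        """Apply the real customer record, keeping any existing surrogate key."""
        row = self.rows.get(natural_key)
        if row is None:
            row = {"sk": next(self._keys)}
            self.rows[natural_key] = row
        row.update(name=name, inferred=False)
        return row["sk"]


if __name__ == "__main__":
    dim = CustomerDim()
    sale_key = dim.lookup_or_infer("C42")   # sale arrives before the customer
    dim.upsert("C42", "Ann Example")        # customer feed catches up later
    print(sale_key, dim.rows["C42"])
```

Because the surrogate key is assigned when the fact first needs it, facts already loaded do not have to be touched when the real customer details arrive.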
Not only is it unlikely that all data feeds to a data warehouse need be realtime, it is quite likely for some “facts” that only some measures are realtime measures. Consider sales: we know the quantity and the price charged to the customer at the time of the sale, but we may well not know the cost of goods until the time the order is fulfilled.
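The sales example can be pictured as a fact row whose measures fill in at different times. Below is a minimal sketch with invented names (`SaleFact`, `unit_cost` and so on): quantity and price are known at the time of sale, while cost stays empty until fulfilment, so any measure derived from cost has to tolerate the gap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SaleFact:
    order_id: str
    quantity: int
    unit_price: float
    # Not known at the time of sale; filled in when the order is fulfilled.
    unit_cost: Optional[float] = None

    @property
    def revenue(self) -> float:
        """Realtime measure: available as soon as the sale is captured."""
        return self.quantity * self.unit_price

    @property
    def margin(self) -> Optional[float]:
        """Late measure: undefined until the cost of goods is known."""
        if self.unit_cost is None:
            return None
        return self.quantity * (self.unit_price - self.unit_cost)


if __name__ == "__main__":
    fact = SaleFact("A1", quantity=3, unit_price=10.0)
    print(fact.revenue, fact.margin)   # cost not yet known
    fact.unit_cost = 6.0               # fulfilment supplies the cost
    print(fact.revenue, fact.margin)
```

In a real warehouse the second step would be an update or a later-arriving row rather than an attribute assignment, but the split between realtime and late measures is the same.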