October 17th, 2011 | Categories: Linux | Tags: ,

GNU/Linux includes many utilities for working with text files through the shell. In this post we take a quick look at accessing and manipulating text files in a “column-wise” mode.

Suppose you have the following two files, each with two columns separated by the TAB character.

$cat file1
Alice   Paris
Bob     Tokyo
Mary    London
John    New York

$cat file2
13 May    Orange
19 Oct    Blue
11 Nov    Black
29 Feb    Red

The data in the two files are in fact related, i.e. file2 contains the date of birth and favourite colour of the people mentioned in file1 (assuming also that the files are sorted correctly). It would make sense to combine the two files together so that each row has the full data for each person. The paste command does just that.

$paste file1 file2 > file3
$cat file3
Alice   Paris     13 May    Orange
Bob     Tokyo     19 Oct    Blue
Mary    London    11 Nov    Black
John    New York  29 Feb    Red

Suppose that we are only interested in the name and date of birth of each person, and we can discard the hometown and favourite colour information. The cut command is what we shall use:

$cut -f 1,3 file3 > file4
$cat file4
Alice   13 May
Bob     19 Oct
Mary    11 Nov
John    29 Feb

Our next and final requirement is to reorder the columns. Instead of having the name followed by the date of birth, suppose we want the columns the other way round. Unfortunately cut -f 3,1 produces exactly the same output as cut -f 1,3, because cut always writes the selected fields in the order in which they appear in the input, so the cut command will not be sufficient. We have to use sed instead.

$sed -e 's/\([^\t]*\)\t\([^\t]*\)/\2\t\1/' file4 > file5
$cat file5
13 May    Alice
19 Oct    Bob
11 Nov    Mary
29 Feb    John

How does that work? Well, \([^\t]*\) is a grouped expression (a capture group) which matches a run of characters other than TAB. The search pattern looks for two of them, separated by a TAB (\t). In the replace-with part, the groups are referred to by number in order of appearance, so \2\t\1 writes the second group first, then a TAB, then the first group.
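One caveat: \t inside a sed expression is understood by GNU sed but is not guaranteed elsewhere. Below is a minimal sketch of the same swap that passes a literal TAB through a shell variable instead. The scratch directory and the variable name tab are only there for illustration; the sample data is recreated so the snippet runs on its own.

```shell
# Work in a scratch directory so no existing files are touched.
cd "$(mktemp -d)"

# Recreate file4: name and date of birth separated by a TAB.
printf 'Alice\t13 May\nBob\t19 Oct\nMary\t11 Nov\nJohn\t29 Feb\n' > file4

# Hold a literal TAB in a variable instead of relying on \t inside sed.
tab=$(printf '\t')

# Same substitution as above: the two groups are swapped around the TAB.
sed -e "s/\([^$tab]*\)$tab\([^$tab]*\)/\2$tab\1/" file4 > file5
cat file5
```

The double quotes let the shell expand $tab while leaving \( \) and \1 \2 untouched for sed.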

Of course if file5 was what we ultimately wanted from the beginning as our output, we could have simply piped commands together:

$paste file1 file2 | cut -f 1,3 | sed -e 's/\([^\t]*\)\t\([^\t]*\)/\2\t\1/' > file5

or alternatively

$paste file1 file2 | sed -e 's/\([^\t]*\)\t\([^\t]*\)\t\([^\t]*\)\t\([^\t]*\)/\3\t\1/' > file5
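If the sed expression feels hard to read, awk is another option, since it can print fields in any order. The following is a sketch rather than the method used in this post; it recreates the two sample files in a scratch directory and assumes TAB-separated input.

```shell
# Work in a scratch directory so no existing files are touched.
cd "$(mktemp -d)"

# Recreate the two sample files, with columns separated by a TAB.
printf 'Alice\tParis\nBob\tTokyo\nMary\tLondon\nJohn\tNew York\n' > file1
printf '13 May\tOrange\n19 Oct\tBlue\n11 Nov\tBlack\n29 Feb\tRed\n' > file2

# paste joins the files line by line; awk then prints field 3 (date of birth)
# before field 1 (name). -F sets the input separator, OFS the output separator.
paste file1 file2 | awk -F '\t' -v OFS='\t' '{ print $3, $1 }' > file5
cat file5
```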

October 17th, 2011 | Categories: Linux | Tags: ,

Sed can be used to strip out all HTML or XML tags from a file and get the plain text version. Suppose you have file gnulinux.html with the following contents:


<p>The combination of <a href="/gnu/linux-and-gnu.html">GNU and Linux</a> is the <strong>GNU/Linux operating system</strong>, now used by millions and sometimes incorrectly called simply “Linux”.</p>

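Tempting but incorrect – .* is greedy, so sed finds the longest possible match, which in this case runs from the first < to the last > and covers the entire line (the file consists of a single line), and thus an empty line is output: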

$sed -e 's/<.*>//g' gnulinux.html
 

Correct version:

$sed -e 's/<[^>]*>//g' gnulinux.html
The combination of GNU and Linux is the GNU/Linux operating system, now used by millions and sometimes incorrectly called simply “Linux”.
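The difference between the two patterns is easier to see on a short line. This is a small sketch using a made-up sample string rather than the file above:

```shell
# A one-line sample with nested tags (hypothetical input).
line='<p>Hello <b>world</b></p>'

# Greedy: <.*> stretches from the first '<' to the last '>' on the line,
# so the whole line is removed and only an empty line is printed.
printf '%s\n' "$line" | sed -e 's/<.*>//g'

# Negated class: [^>]* cannot cross a '>', so each tag is matched on its own.
printf '%s\n' "$line" | sed -e 's/<[^>]*>//g'
```

Note that sed works line by line, so a tag that is split across two lines will not be matched by either pattern.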

August 16th, 2011 | Categories: Linux | Tags: , , ,

If you use TrueCrypt under Ubuntu 11.04 Natty you will have noticed an annoying behaviour. Under previous versions an icon is present in the system tray and remains there whilst a volume is mounted, even if the TrueCrypt window is closed. Under Unity the tray icon is not shown. If you accidentally close the window (instead of minimizing it), there’s no easy way of getting back to it. Launching TrueCrypt again results in an error message.

The fix, as suggested on shocm.com, is to set the systray-whitelist property by running the following from the command line (you will need to restart afterwards):

gsettings set com.canonical.Unity.Panel systray-whitelist "['all']"

However, your mileage may vary, and on my netbook this did not fix the issue. It seemed like the icon was placed in the tray, but it was rendered as a very thin strip about 1-2 pixels wide which could not be clicked.

The only way I found of getting out of the missing-window scenario was to resort to using TrueCrypt from the command line to dismount all mounted volumes.

truecrypt -d

This will also exit TrueCrypt, and afterwards it can be launched as usual.

August 11th, 2011 | Categories: Business Intelligence | Tags: , ,

Sometimes it may be useful to know inside the report which servlet mapping is used to run the report (e.g. frameset, run, preview).


var sMapping = reportContext.getHttpServletRequest().getRequestURI();
var nPos = sMapping.lastIndexOf( "/" ) + 1;
this.text = sMapping.substring( nPos );

You can place the above code inside a label to see the servlet mapping.
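To make the string handling concrete, here is the same last-segment extraction sketched in shell, outside BIRT. The request URIs are made-up examples and the /birt context path is an assumption; in a real deployment the path depends on how the viewer is installed.

```shell
# Hypothetical request URIs; the /birt context path is an assumption.
for uri in /birt/frameset /birt/run /birt/preview; do
  # ${uri##*/} drops everything up to and including the last slash,
  # mirroring lastIndexOf( "/" ) + 1 followed by substring().
  printf '%s\n' "${uri##*/}"
done
```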

June 28th, 2011 | Categories: Business Intelligence | Tags:

BIRT 3.7, previously codenamed Indigo, was released on June 22. Here are the necessary links to find out more about this latest version:

June 9th, 2011 | Categories: Business Intelligence | Tags: , ,

In an earlier post we discussed a generic way to dynamically get the list of report parameters at run time without having to hardcode the parameter names in the script. As was correctly pointed out in the comments, the code did not work for parameter groups (and also cascading parameters). However, it is possible to modify the code to get it working even for those cases, and the change is not a huge one.

The main difference is that we will use the method getAllParameters() instead of getParameters() of ReportDesignHandle. It seems that getParameters() returns only the “top level” parameters, whereas getAllParameters() flattens the groups and returns the nested as well as the top-level parameters. One thing to watch out for is that the return types of the two functions are different: java.util.List (getAllParameters) vs. org.eclipse.birt.report.model.api.SlotHandle (getParameters). Also, we have included a check on the class of each entry in the list, in order to exclude the groups themselves. You can comment out the if statement if you wish to see the group names as well.

Here is the updated working code:


// Collect one "name = value" line per report parameter.
var sOutput = "";
// getAllParameters() returns a flat java.util.List, including grouped parameters.
var parameterArray = reportContext.getDesignHandle().getAllParameters();
var parameterCount = parameterArray.size();
for( var i = 0; i < parameterCount; i++ )
{
  var sParClass = parameterArray.get( i ).getClass().toString();
  // Skip the group handles themselves and keep only scalar parameters.
  if( sParClass == "class org.eclipse.birt.report.model.api.ScalarParameterHandle" )
  {
    var sParName = parameterArray.get( i ).getFullName();
    var sParVal = reportContext.getParameterValue( sParName );
    sOutput = sOutput + sParName + " = " + sParVal + "\n";
  }
}
this.content = sOutput;

June 3rd, 2011 | Categories: Business Intelligence | Tags: ,

On this blog we have often talked about overriding the default behaviour of the BIRT chart palette by dynamically setting colours based on data values. However, it is perhaps worth looking in more detail at the palette itself to understand the default behaviour and appreciate how it can be tweaked.

Each chart has a palette. The palette determines the colours used for drawing the series in the chart. From the UI perspective, the palette options are accessed on the last tab of the Edit Chart dialog (Format Chart), under the first selection in the tree on the left-hand side (Series). It is worth noting that a chart only has one palette, even if the chart contains multiple series.

By default there are 32 entries in the palette. They are used in the chart from top to bottom, i.e. in the default scenario the first category will be coloured blue, the next one red and so on. The Add and Remove buttons in the Series Palette dialog can be used to change the number of entries. What happens when the chart contains more categories than there are palette entries? The palette wraps around, i.e. the 33rd category will be coloured in the same way as the first one. If the number of categories is known in advance, one can extend the palette using the Add button to create the necessary number of palette entries. (Naturally, one should question the design of such a chart: will the user really be able to extract any meaning from a chart with 32 or more different colours? In most cases, probably not.)

Each palette entry can itself be edited. Not only is it possible to set one of the predefined colours, one can also set a custom colour, a linear gradient based on two colours or even an image file. It is also possible to specify different colours for positive and negative values.

As we already mentioned, the palette colours are applied top to bottom without any further logic, in particular they are not tied to the actual data point values themselves. Several common scenarios and scripting approaches for solving the issue have already been discussed on this blog: Setting BIRT chart series palette dynamically, Setting BIRT chart series palette dynamically, part 2 – area charts, Setting BIRT chart series palette dynamically, part 3 – stacked bar charts.

In the second part of this article we will look at how the palette and related objects can be manipulated programmatically, i.e. through scripting.

April 18th, 2011 | Categories: Web | Tags: ,

Description in a Firefox Bookmark

By default Firefox populates the Description field of a bookmark with the content of the page’s <meta name="Description" content="..." /> tag. I find this a little annoying and was looking for a way to remove these descriptions.

Since I have a lot of bookmarks I wanted to clean them all up in one go rather than editing them one by one. Luckily there is an easy way to do this since Firefox stores the bookmarks in an SQLite database file which can be opened and manipulated.

  1. Install SQLite Database Browser (Ubuntu: sudo apt-get install sqlitebrowser; Windows: download and install from SourceForge).
  2. IMPORTANT! Export your Firefox bookmarks to have a backup in case something goes wrong.
  3. Close Firefox (you can’t edit the bookmarks file while Firefox is running).
  4. Run SQLite Database Browser and open the places.sqlite file which is located in your profile folder.
  5. Run the following SQL which will remove the description from all bookmarks.

    delete from moz_items_annos where anno_attribute_id = 2

  6. Save and close, run Firefox.

Reference on the places.sqlite file can be found on mozillaZine and Mozilla Developer Network.

December 29th, 2010 | Categories: Linux | Tags: , , ,

After finally getting some time to fully read up on Mono (especially on the excellent The Source) I have decided it is best to remove it from my system. The Open Sourcerer has a nicely written up set of instructions for 10.04 Lucid Lynx and 10.10 Maverick Meerkat.

December 29th, 2010 | Categories: Business Intelligence | Tags: , ,

In case you missed these, Innovent Solutions (one of the corporate backers of the Eclipse BIRT project) published papers comparing Jasper and Pentaho to BIRT. Naturally, they conclude that BIRT is top of the pile in open source reporting; however, they do justify this conclusion and the papers make an interesting read.