I have received a new requirement from one of my old colleagues, the request is:
The sample data is listed below:

image
The requirement is:
1. Merge the column "vendor" grouping by the same name and calculate the summary of "money" for each vendor.
2. Sub-Merge the column "type" grouping by the same name according the column "vendor" and calculate the summary of money for each type.
3. Sort the column "vendor" by the summary of "money" for each vendor.

Currently, the engine has implemented the first two requirements, but the last one is quite difficult, because:

1. The default behavior is sorting the vendor column by it's value or by a calculated value from the same row.

2. The default behavior of a merging column is sorting it first to optimize the operations aimed saving time and resources.

3. The summary of column is calculated when finished the merging operations, therefore the default behavior not support this case.

After the default operations, we got the table as :

image

According the requirement, there is an error in this result table, the summary of “money” for the vendor “D”is 1, it is less than the others, so this group should be the first one in this result table. Let’s find a way out.

We should do the following steps if we want implement this case:

1. Sort the column “vendor”by it’s value or name, so that we can merge it with the name of vendor. One vendor may contains some rows, we call those rows in one “group”.

2. Calculate the summary money for each group.

3. Sort the groups by the summary of “money ”in.

After a long time discuss and investigating( it takes almost three and half hours and I take sleeping at 2:00AM this morning), we realized that the only thing we need to do is changing the sort rule of the column “vendor” from the name of each vendor to the expression “summary money+name”, it is based on the following fact:

1. The summary money of each group is a certain value.

2. We should sort by the summary money first, and then use name to sort if there are two group  contains the same summary value.

3. If there are two or more groups has the same summary value, sort those by the name of vendor of each group.

The solution is:

1. Add a built-in function named “sumByField(keyColumn,valueColumn)” to calculate summary of “valueColumn”aligned with  “keyColumn”.

2. Add a built-in function named “leftpad(string,nLength)” which aimed to continuously add extra character to the head of the string till the length of the string equals nLength.

2. Modify the sort rule of column “vendor” with the value “=leftpad(sumByField(‘vendor’,’money’),20)+vendor”.

We got the correct result table as:

image 

 

Tired and feel good.


Jeason Zhao (沈胜衣,斛律光) ------雪饮再现,一个人的江湖
我知道我是谁,我是沈胜衣,默默的活着,就像空气。