Split, Apply, Merge in D


I went looking for a groupby: a means to iterate a list in groups (a list of lists). In that search I came across an article about split, apply, merge for data tables. It looked like what I wanted, but its focus on data science left me confused.

In D these functions are chunkBy, map, and joiner. The pattern of consistency continues: once our list is sorted, we just need to specify what to group on.
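As a rough sketch of how the three might chain into one split, apply, merge pipeline: the helper name `runTotals` and the choice to sum each group are illustrative assumptions, not something taken from the article.

```dlang
import std.algorithm : chunkBy, equal, joiner, map, sum;

// Hypothetical helper: replaces each run of equal values with its total.
auto runTotals(int[] data)
{
    return data
        .chunkBy!((a, b) => a == b)  // split: runs of equal neighbours
        .map!(run => [run.sum])      // apply: collapse each run to a one-element array
        .joiner;                     // merge: flatten back into a single range
}

void main()
{
    // Runs are [1,1] and [2,2,2], so the totals are 2 and 6.
    assert(runTotals([1, 1, 2, 2, 2]).equal([2, 6]));
}
```

The rest of this post concentrates on the split step, chunkBy.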

```dlang
import std.algorithm;

void main() {
    auto data = [1,1,2,2];

    // Neighbouring elements that compare equal land in the same chunk.
    assert(data.chunkBy!((a, b) => a==b)
               .equal!equal([[1,1],[2,2]]));
}
```

Unlike previous lambdas, this one takes two arguments, which allows elements to be grouped in interesting ways.

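```dlang
import std.algorithm;

// Orders evens before odds, ascending within each parity.
auto evenGrouping(int a, int b) {
    if(a%2 == b%2)
        return a < b;
    return a%2 < b%2;
}

void main() {
    auto data = [1,1,2,2,3,3];

    // The result is a range of ranges, so the comparison needs equal!equal.
    assert(data.sort!evenGrouping
               .chunkBy!((a,b) => a%2==b%2)
               .equal!equal([[2,2],[1,1,3,3]]));
}
```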

As mentioned, sorting needs to happen first, because chunkBy only groups adjacent elements.

```dlang
import std.algorithm;
import std.range;

void main() {
    auto data = [3,3,1,1,2,2];

    // Sort by parity only, then sort each chunk on its own.
    assert(data.sort!((a, b) => a%2 < b%2)
               .chunkBy!((a,b) => a%2==b%2)
               .map!(x => x.array.sort)
               .equal!equal([[2,2],[1,1,3,3]]));
}
```

For this contrived example I decided it best to run it through a compiler. It was a good thing I did, as I found a difference in behavior. I'll save map for another day.
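To make the need for sorting concrete, here is a small sketch that counts the chunks produced with and without a sort. The helper name `parityChunkCount` is made up for illustration, and the chunks are copied into arrays only to keep the counting simple.

```dlang
import std.algorithm : chunkBy, map, sort;
import std.array : array;

// Hypothetical helper: how many chunks does grouping by parity produce?
size_t parityChunkCount(int[] data)
{
    return data
        .chunkBy!((a, b) => a % 2 == b % 2)
        .map!(chunk => chunk.array)  // copy each chunk out eagerly
        .array
        .length;
}

void main()
{
    // Unsorted: parity alternates, so every element is its own chunk.
    assert(parityChunkCount([1, 2, 3, 4]) == 4);

    // Sorted by parity first: the evens are adjacent, then the odds.
    auto data = [1, 2, 3, 4];
    data.sort!((a, b) => a % 2 < b % 2);
    assert(parityChunkCount(data) == 2);
}
```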

Two kinds of lambda are supplied to these functions. One takes a single argument and is referred to as a unary predicate; the other takes two and is referred to as a binary predicate.

When a unary predicate is supplied to chunkBy, each element of the result is a tuple of the value the predicate computed and the range of elements that share it. This is an interesting optimization, but this overload should live with group, which already has this behavior.
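A minimal sketch of the unary form, with group alongside for comparison. The helper name `parityGroups` is hypothetical, and copying each sub-range into an array is just a convenience for inspecting the result.

```dlang
import std.algorithm : chunkBy, equal, group, map;
import std.array : array;
import std.typecons : tuple;

// Hypothetical helper: pairs each parity key with a copy of its elements.
auto parityGroups(int[] data)
{
    return data
        .chunkBy!(a => a % 2)                  // unary predicate: yields (key, sub-range)
        .map!(c => tuple(c[0], c[1].array))    // copy the sub-range into an array
        .array;
}

void main()
{
    auto groups = parityGroups([2, 4, 1, 3, 5]);

    assert(groups.length == 2);
    assert(groups[0][0] == 0 && groups[0][1] == [2, 4]);     // key 0: the evens
    assert(groups[1][0] == 1 && groups[1][1] == [1, 3, 5]);  // key 1: the odds

    // group also yields tuples, here of (element, run length).
    assert([1, 1, 2].group.equal([tuple(1, 2u), tuple(2, 1u)]));
}
```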


jessekphillips


Jesse Phillips
