PivotTables in Excel (or cross tabulations) are quite useful. Has anyone already thought about how to implement a similar function in Mathematica?

I am not familiar with the use of pivot tables, but taking the example on the page linked above, I propose this:

```
Needs["Calendar`"]
key = # -> #2[[1]] & ~MapIndexed~
{"Region", "Gender", "Style", "Ship Date", "Units", "Price", "Cost"};
choices = {
{"North", "South", "East", "West"},
{"Boy", "Girl"},
{"Tee", "Golf", "Fancy"},
IntegerString[#, 10, 2] <> "/2011" & /@ Range@12,
Range@15,
Range[8.00, 15.00, 0.01],
Range[6.00, 14.00, 0.01]
};
data = RandomChoice[#, 150] & /@ choices // Transpose;
```

This creates `data` that looks like:

```
{"East", "Girl", "Golf", "03/2011", 6, 12.29`, 6.18`},
{"West", "Boy", "Fancy", "08/2011", 6, 13.01`, 12.39`},
{"North", "Girl", "Golf", "05/2011", 1, 14.87`, 12.89`},
{"East", "Girl", "Golf", "09/2011", 3, 13.99`, 6.25`},
{"North", "Girl", "Golf", "09/2011", 13, 12.66`, 8.57`},
{"East", "Boy", "Fancy", "10/2011", 2, 14.46`, 6.85`},
{"South", "Boy", "Golf", "11/2011", 13, 12.45`, 11.23`}
...
```

Then:

```
h1 = Union@data[[All, "Region" /. key]];
h2 = Union@data[[All, "Ship Date" /. key]];
Reap[
Sow[#[[{"Units", "Ship Date"} /. key]], #[["Region" /. key]]] & ~Scan~ data,
h1,
Reap[Sow @@@ #2, h2, Total @ #2 &][[2]] &
][[2]];
TableForm[Join @@ %, TableHeadings -> {h1, h2}]
```

This is a rough example, but it gives an idea of how this may be done. If you have more specific requirements I will attempt to address them.
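
The grouping above relies on tagged `Sow`/`Reap`: each value is sown under a tag, and `Reap` called with a list of tags returns one group per tag, in the order requested. Here is a minimal sketch of that mechanism on a few made-up rows (the `rows` values are purely illustrative and not taken from `data`):

```mathematica
(* hypothetical toy rows: {region, month, units} *)
rows = {{"North", "01", 5}, {"South", "01", 2}, {"North", "02", 3}, {"North", "01", 4}};

(* sow each row's units under its region; ask Reap for three tags, one of which never occurs *)
groups = Reap[Scan[Sow[#[[3]], #[[1]]] &, rows], {"North", "South", "East"}][[2]]
(* {{{5, 3, 4}}, {{2}}, {}} *)

(* a third argument f[tag, values] aggregates each group as it is collected *)
totals = Reap[Scan[Sow[#[[3]], #[[1]]] &, rows], {"North", "South", "East"}, Total[#2] &][[2]]
(* {{12}, {2}, {}} *)
```

A tag that never occurs yields an empty list, which is why combinations with no data end up as empty entries in the table.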

The `Manipulate` block is largely copied, but I believe my `pivotTableData` is more efficient, and I sought to localize symbols correctly, since this is now presented as usable code rather than a rough example.

I start with the same sample data, but I embed the field headings, since I feel this is more representative of normal use.

```
data = ImportString[#, "TSV"][[1]] & /@ Flatten[Import["http://lib.stat.cmu.edu/datasets/CPS_85_Wages"][[28 ;; -7]]];
data = Transpose[{
data[[All, 1]],
data[[All, 2]] /. {1 -> "South", 0 -> "Elsewhere"},
data[[All, 3]] /. {1 -> "Female", 0 -> "Male"},
data[[All, 4]],
data[[All, 5]] /. {1 -> "Union Member", 0 -> "No member"},
data[[All, 6]],
data[[All, 7]],
data[[All, 8]] /. {1 -> "Other", 2 -> "Hispanic", 3 -> "White"},
data[[All, 9]] /. {1 -> "Management", 2 -> "Sales", 3 -> "Clerical", 4 -> "Service", 5 -> "Professional", 6 -> "Other"},
data[[All, 10]] /. {0 -> "Other", 1 -> "Manufacturing", 2 -> "Construction"},
data[[All, 11]] /. {1 -> "Married", 0 -> "Unmarried"}
}];
PrependTo[data,
{"Education", "South", "Sex", "Experience", "Union", "Wage", "Age", "Race", "Occupation", "Sector", "Marital status"}
];
```

My `pivotTableData` is self-contained.

```
pivotTableData[data_, field1_, field2_, dependent_, op_] :=
Module[{key, sow, h1, h2, ff},
(key@# = #2[[1]]) & ~MapIndexed~ data[[1]];
sow = #[[key /@ {dependent, field2}]] ~Sow~ #[[key@field1]] &;
{h1, h2} = Union@data[[2 ;;, key@#]] & /@ {field1, field2};
ff = # /. {{} -> Missing@"NotAvailable", _ :> op @@ #} &;
{
{h1, h2},
Join @@ Reap[sow ~Scan~ Rest@data, h1, ff /@ Reap[Sow @@@ #2, h2][[2]] &][[2]]
}
]
```
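
The `ff` helper is what turns an empty group into `Missing["NotAvailable"]` while applying `op` to everything else. Below is a small standalone sketch of that rule, with `op` fixed to `Mean` and hand-written inputs shaped like the inner `Reap` output (both are assumptions made for illustration):

```mathematica
op = Mean;  (* stand-in for the op argument of pivotTableData *)
ff = # /. {{} -> Missing["NotAvailable"], _ :> op @@ #} &;

(* the inner Reap yields {{v1, v2, ...}} for a tag with data and {} for a tag without *)
cells = ff /@ {{{1, 2, 3}}, {}}
(* {2, Missing["NotAvailable"]} *)
```

Because the rules are tried on the whole expression first, the catch-all `_` rule fires for any non-empty group, and `op @@ #` strips the extra list level before aggregating.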

`pivotTable` relies only on `pivotTableData`:

```
pivotTable[data_?MatrixQ] :=
DynamicModule[{raw, t, header = data[[1]], opList =
{Mean -> "Mean of \[Rule]",
Total -> "Sum of \[Rule]",
Length -> "Count of \[Rule]",
StandardDeviation -> "SD of \[Rule]",
Min -> "Min of \[Rule]",
Max -> "Max of \[Rule]"}},
Manipulate[
raw = pivotTableData[data, f1, f2, f3, op];
t = ConstantArray["", Length /@ raw[[1]] + 2];
t[[1, 1]] = Control[{op, opList}];
t[[1, 3]] = Control[{f2, header}];
t[[2, 1]] = Control[{f1, header}];
t[[1, 2]] = Control[{f3, header}];
{{t[[3 ;; -1, 1]], t[[2, 3 ;; -1]]}, t[[3 ;; -1, 3 ;; -1]]} = raw;
TableView[N@t, Dividers -> All],
Initialization :> {op = Mean, f1 = data[[1,1]], f2 = data[[1,2]], f3 = data[[1,3]]}
]
]
```

Use is simply:

```
pivotTable[data]
```

Here's what I've come up with. It uses the function `SelectEquivalents` defined in "What is in your Mathematica tool bag?". The options `Function1` and `Function2` allow different groupings of `criteria1` and `criteria2`. `FilterFunction` defines an arbitrary filter formula on the data, written in terms of the header names.

Using Mr. Wizard's example data, here are some usages of this function.

```
criteria={"Region", "Gender", "Style", "Ship Date", "Units", "Price", "Cost"};
criteria1 = "Region";
criteria2 = "Ship Date";
consideredData = "Units";
PivotTable[data,criteria,criteria1,criteria2,consideredData]
```

A neat example:

```
function2 = If[ToExpression@StringTake[#, 2] <= 6, "First Semester", "Second Semester"] &;
PivotTable[data,criteria,criteria1,criteria2,consideredData,FilterFunction->("Gender"=="Girl"&&"Units"*"Price"<=100&),Function2->function2]
```
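
To see what the `Function2` regrouping does on its own, the same pure function can be applied directly to a couple of ship-date strings in the `"MM/2011"` format used by the sample data (the two dates below are picked only for illustration):

```mathematica
function2 = If[ToExpression@StringTake[#, 2] <= 6, "First Semester", "Second Semester"] &;

(* the first two characters are the month; months 1 to 6 fall in the first semester *)
labels = function2 /@ {"03/2011", "11/2011"}
(* {"First Semester", "Second Semester"} *)
```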

Here's the definition of the function:

```
keysToIndex[keys_] :=
Module[{keyIndex},
(keyIndex[#1] = #2[[1]])&~MapIndexed~keys;
keyIndex
];
InverseFlatten[l_,dimensions_]:= Fold[Partition[#, #2] &, l, Most[Reverse[dimensions]]];
Options[PivotTable]={Function1->Identity,Function2->Identity,FilterFunction->(True &),AggregationFunction->Total,FormatOutput->True};
PivotTable[data_,criteria_,criteria1_,criteria2_,consideredData_,OptionsPattern[]]:=
Module[{criteriaIndex, criteria1Index, criteria2Index, consideredDataIndex, criteria1Function, criteria2Function, filterFunctionTranslated, filteredResult, keys1, keys1Index, keys2, keys2Index, resultTable, function1, function2, filterFunction, aggregationFunction, formatOutput,p,sharp},
function1 = OptionValue@Function1;
function2 = OptionValue@Function2;
filterFunction = OptionValue@FilterFunction;
aggregationFunction = OptionValue@AggregationFunction;
formatOutput=OptionValue@FormatOutput;
criteriaIndex=keysToIndex[criteria];
criteria1Index=criteriaIndex@criteria1;
criteria2Index=criteriaIndex@criteria2;
consideredDataIndex=criteriaIndex@consideredData;
criteria1Function=Composition[function1,#[[criteria1Index]]&];
criteria2Function=Composition[function2,#[[criteria2Index]]&];
filterFunctionTranslated = filterFunction/.(# -> p[sharp, criteriaIndex@#]& /@ criteria /. sharp -> #)/.p->Part;
filteredResult=
SelectEquivalents[
data
,
TagElement->({criteria1Function@#,criteria2Function@#,filterFunctionTranslated@#}&)
,
TransformElement->(#[[consideredDataIndex]]&)
,
TagPattern->_?(#[[3]]&)
,
TransformResults->(Append[Most@#1,aggregationFunction@#2]&)
];
If[formatOutput,
keys1=filteredResult[[All,1]]//Union//Sort;
keys2=filteredResult[[All,2]]//Union//Sort;
resultTable=
SelectEquivalents[
filteredResult
,
TagElement->(#[[{1,2}]]&)
,
TransformElement->(#[[3]]&)
,
TagPattern->Flatten[Outer[List, keys1, keys2], 1]
,
FinalFunction-> (InverseFlatten[Flatten[#/.{}->Missing[]],{Length@keys1,Length@keys2}]&)
];
TableForm[resultTable,TableHeadings->{keys1,keys2}]
,
filteredResult
]
];
```
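
The two helpers `keysToIndex` and `InverseFlatten` can be tried in isolation. The sketch below repeats their definitions (with `MapIndexed` written in ordinary bracket form) and runs them on small made-up inputs; the names `index`, `pos` and `grid` are introduced here only for the example:

```mathematica
(* same definitions as above, with explicit MapIndexed[...] syntax *)
keysToIndex[keys_] := Module[{keyIndex},
  MapIndexed[(keyIndex[#1] = #2[[1]]) &, keys];
  keyIndex];
InverseFlatten[l_, dimensions_] := Fold[Partition[#, #2] &, l, Most[Reverse[dimensions]]];

(* look up the column position of a header name *)
index = keysToIndex[{"Region", "Gender", "Style"}];
pos = index["Gender"]
(* 2 *)

(* reshape a flat list of 6 values into 2 rows of 3 *)
grid = InverseFlatten[Range[6], {2, 3}]
(* {{1, 2, 3}, {4, 5, 6}} *)
```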

Use http://www.wolfram.com/products/applications/excel_link/; this way you have the best of both worlds. This product creates a flawless two-way link between Excel and Mathematica.

A quick-and-dirty pivot table visualization:

I'll start with a more interesting real-life data set:

```
data = ImportString[#, "TSV"][[1]] & /@
Flatten[Import["http://lib.stat.cmu.edu/datasets/CPS_85_Wages"][[28 ;; -7]]
];
```

A bit of post-processing:

```
data =
{
data[[All, 1]],
data[[All, 2]] /. {1 -> "South", 0 -> "Elsewhere"},
data[[All, 3]] /. {1 -> "Female", 0 -> "Male"},
data[[All, 4]],
data[[All, 5]] /. {1 -> "Union Member", 0 -> "No member"},
data[[All, 6]],
data[[All, 7]],
data[[All, 8]] /. {1 -> "Other", 2 -> "Hispanic", 3 -> "White"},
data[[All, 9]] /. {1 -> "Management", 2 -> "Sales", 3 -> "Clerical",
4 -> "Service", 5 -> "Professional", 6 -> "Other"},
data[[All, 10]] /. {0 -> "Other", 1 -> "Manufacturing", 2 -> "Construction"},
data[[All, 11]] /. {1 -> "Married", 0 -> "Unmarried"}
}\[Transpose];
header = {"Education", "South", "Sex", "Experience", "Union", "Wage",
"Age", "Race", "Occupation", "Sector", "Marital status"};
MapIndexed[(headerNumber[#1] = #2[[1]]) &, header];
levelNames = Union /@ Transpose[data];
levelLength = Length /@ levelNames;
```

Now for the real stuff. It also uses the function `SelectEquivalents` defined in "What is in your Mathematica tool bag?".

```
pivotTableData[levelName1_, levelName2_, dependent_, op_] :=
Table[
SelectEquivalents[data,
FinalFunction -> (If[Length[#] == 0, Missing["NotAvailable"], op[# // Flatten]] &),
TagPattern ->
_?(#[[headerNumber[levelName1]]] == levelMember1 &&
#[[headerNumber[levelName2]]] == levelMember2 &),
TransformElement -> (#[[headerNumber[dependent]]] &)
],
{levelMember1, levelNames[[headerNumber[levelName1]]]},
{levelMember2, levelNames[[headerNumber[levelName2]]]}
]

DynamicModule[
{opList =
{Mean ->"Mean of \[Rule]", Total ->"Sum of \[Rule]", Length ->"Count of \[Rule]",
StandardDeviation -> "SD of \[Rule]", Min -> "Min of \[Rule]",
Max -> "Max of \[Rule]"
}, t},
Manipulate[
t=Table["",{levelLength[[headerNumber[h1]]]+2},{levelLength[[headerNumber[h2]]]+2}];
t[[3 ;; -1, 1]] = levelNames[[headerNumber[h1]]];
t[[2, 3 ;; -1]] = levelNames[[headerNumber[h2]]];
t[[1, 1]] = Control[{op, opList}];
t[[1, 3]] = Control[{h2, header}];
t[[2, 1]] = Control[{h1, header}];
t[[1, 2]] = Control[{h3, header}];
t[[3 ;; -1, 3 ;; -1]] = pivotTableData[h1, h2, h3, op] // N;
TableView[t, Dividers -> All],
Initialization :> {op = Mean, h1 = "Sector", h2 = "Union", h3 = "Wage"}
]
]
```

There's still a bit of work to do. The `DynamicModule` should be turned into a fully standalone function, with the header stuff more streamlined, but this should be sufficient for a first impression.
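
Since `SelectEquivalents` is defined elsewhere, here is a rough sketch of how a single table cell could be computed with the built-in `Cases` instead. This is a substitute technique, not the code used above; the helper name `pivotCell` and the toy `rows` are hypothetical:

```mathematica
(* hypothetical helper: aggregate column k over rows whose columns i and j equal a and b *)
pivotCell[rows_, {i_, a_}, {j_, b_}, k_, op_] :=
  With[{vals = Cases[rows, r_ /; (r[[i]] === a && r[[j]] === b) :> r[[k]]]},
    If[vals === {}, Missing["NotAvailable"], op[vals]]];

rows = {{"a", "x", 1}, {"a", "y", 2}, {"a", "x", 3}, {"b", "x", 4}};

filled = pivotCell[rows, {1, "a"}, {2, "x"}, 3, Total]
(* 4 *)
empty = pivotCell[rows, {1, "b"}, {2, "y"}, 3, Total]
(* Missing["NotAvailable"] *)
```

Like the `Table` version above, this scans the full data once per cell, so it is meant for illustration rather than speed.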

I'm a little late to the game. Here is another self-contained solution with an object-like form.

Using random data created by @Mr.Wizard:

```
key = # -> #2[[1]] & ~MapIndexed~
{"Region", "Gender", "Style", "Ship Date", "Units", "Price", "Cost"};
choices = {
{"North", "South", "East", "West"},
{"Boy", "Girl"},
{"Tee", "Golf", "Fancy"},
IntegerString[#, 10, 2] <> "/2011" & /@ Range@12,
Range@15,
Range[8.00, 15.00, 0.01],
Range[6.00, 14.00, 0.01]
};
data = RandomChoice[#, 5000] & /@ choices // Transpose;
```

Using `MapIndexed` and `SparseArray` as the key functions, here is the code:

```
Options[createPivotTable]={"RowColValueHeads"-> {1,2,3},"Function"-> Total};
createPivotTable[data_,opts:OptionsPattern[{createPivotTable}]]:=Module[{r,c,v,aggDataIndex,rowRule,colRule,pivot},
{r,c,v}=OptionValue["RowColValueHeads"];
pivot["Row"]= Union@data[[All,r]];
pivot["Col"]= Union@data[[All,c]];
rowRule= Dispatch[#->#2[[1]]&~MapIndexed~pivot["Row"]];
colRule= Dispatch[#->#2[[1]]&~MapIndexed~pivot["Col"]];
aggDataIndex={#[[1,r]]/.rowRule,#[[1,c]]/.colRule}->OptionValue["Function"]@#[[All,v]]&/@GatherBy[data,#[[{r,c}]]&];
pivot["Data"]=Normal@SparseArray@aggDataIndex;
pivot["Properties"]={"Data","Row","Col"};
pivot["Table"]=TableForm[pivot["Data"], TableHeadings -> {pivot["Row"], pivot["Col"]}];
Format[pivot]:="PivotObject";
pivot
]
```

You can use it as:

```
pivot=createPivotTable[data,"RowColValueHeads"-> ({"Ship Date","Region","Units"}/.key)];
pivot["Table"]
pivot["Data"]
pivot["Row"]
pivot["Col"]
```

I believe this is faster than @Mr.Wizard's code, but I still need to run a proper benchmark and don't have time right now.
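
The core of `createPivotTable` is to group rows with `GatherBy`, map each row and column key to an integer position, and let `SparseArray` place the aggregated values. A stripped-down sketch of just that step, on made-up rows and with `Thread` standing in for the `MapIndexed` and `Dispatch` combination (all names below are illustrative):

```mathematica
(* hypothetical toy rows: {region, month, units} *)
rows = {{"North", "01", 5}, {"South", "01", 2}, {"North", "02", 3}, {"North", "01", 4}};

rowKeys = Union[rows[[All, 1]]];  (* {"North", "South"} *)
colKeys = Union[rows[[All, 2]]];  (* {"01", "02"} *)
rowRule = Thread[rowKeys -> Range[Length[rowKeys]]];
colRule = Thread[colKeys -> Range[Length[colKeys]]];

(* one rule {rowPos, colPos} -> total per distinct (region, month) pair *)
cells = ({#[[1, 1]] /. rowRule, #[[1, 2]] /. colRule} -> Total[#[[All, 3]]]) & /@
   GatherBy[rows, #[[{1, 2}]] &];

table = Normal[SparseArray[cells, {Length[rowKeys], Length[colKeys]}]]
(* {{9, 3}, {2, 0}} *)
```

Note that combinations with no data come out as `0` here, the default background value of `SparseArray`, rather than as `Missing["NotAvailable"]`.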
