Charter Documentation

The Charter repository is a set of three tools for analysing and visualising data in JavaScript and HTML:

Analyser: Processes data from CSV files and allows it to be analysed
Charter: Takes in formatted data and uses it to visualise data in HTML
Stats: A set of utility functions for analysing data

Charter is made available under the Hippocratic Licence.

Most of the examples in this documentation can be edited in place (look for the pencil icon), and their outputs will be displayed alongside. You can use console.log and console.table to display data, or just return what you want to see.

The example data used in this documentation, found in the files city example.csv, city example 2.csv, and city example 3.csv, looks like this:

city example.csv

NAME	COUNTRY	POPULATION	CAPITAL	PUBLIC_TRANSPORT	MAYOR_2012	MAYOR_2018
Auckland	New Zealand	1614		Bus,Train	Len Brown	Phil Goff
Taupō	Aotearoa	32.907		Bus,Train	Rick Cooper	David Trewavas
Hamburg	Germany	1810		Bus,Train,Ferry	Olaf Scholz	Katharina Fegebank,Peter Tschentscher
Sydney	Australia	4841		Bus,Train,Ferry	Clover Moore	Clover Moore
Hamilton	New Zealand	161.2		Bus	Julia Hardaker	Andrew King
Wellington	New Zealand	381.9	true	Bus,Train,Ferry,Cable Car	Celia Wade-Brown	Justin Lester
Christchurch	New Zealand	363.926		Bus	Bob Parker	Lianne Dalziel
Dunedin	New Zealand	114.347		Bus	Dave Cull	Dave Cull
Tauranga	New Zealand	110.338		Bus	Stuart Crosby	Greg Brownless

city example 2.csv

Name	Country	Population (thousands)
Semarang	Indonesia	"1,556"
Islamabad	Pakistan	1015
New Taipei City	Taiwan	"3,972"
Nagoya	Japan	2296

city example 3.csv

YEAR	POPULATION
1991	3516000
1992	3552200
1993	3597800
1994	3648300
1995	3706700
1996	3762300
1997	3802700
1998	3829200
1999	3851100
2000	3873100
2001	3916200
2002	3989500
2003	4061600
2004	4114300
2005	4161000
2006	4209100
2007	4245700
2008	4280300
2009	4332100
2010	4373900
2011	4399400
2012	4425900
2013	4475800
2014	4554600
2015	4647300
2016	4747200
2017	4844400

Installation

Charter uses ES6 modules. To use one of Charter's tools in your code, just import it like this:

import Charter from './charter/charter.js';
import Analyser from './charter/analyser.js';
import Stats from './charter/stats.js';

Analyser

Use

The first step to using Analyser is processing a file using the loadFile method. A dataConfig object representing the processed data will be made available to a callback function passed in to this method.

Through that object, you will have access to a rows array, where each element is an array representing a row in the CSV, and to the cols object you defined for accessing specific columns.

You will also have access to a set of filter functions for filtering rows, which can make use of optional sets of aliases if appropriate for your data. See the dataConfig section for more information.

Row methods vs. Analyser methods

Some methods documented here exist on the rows object made available through the dataConfig object. Each of these methods can also be called as a method of the Analyser object, by passing in the rows object as the first argument.

For example, the getCol function can be called as either Analyser.getCol(rows, colNum) or as rows.getCol(colNum).

Filter methods may also be called in two ways. They are exposed on the rows object, and also via the filters property of the dataConfig object, where they have slightly different names and expect a rows object as their first argument. The latter form is available for backwards compatibility.

Both ways of calling these functions are documented here. The form rows.getCol(colNum) is preferred, whereas the Analyser.getCol variant is maintained for backwards compatibility. The preferred form is used in the examples.

loadFile

loadFile(fileInfo1, fileConfig1, fileInfo2, fileConfig2, fileInfoN, fileConfigN, callback)

loadFile takes in one or more pairs of fileInfo and fileConfig arguments, followed by a single callback.

For each specified file, loadFile either requests a file at the specified path via a GET request, or processes a file that has already been loaded, such as via a file input. It parses the CSV, then processes it internally according to the fileConfig object passed in before passing the processed dataConfig object to a specified callback function.

When processing the data, Analyser will try to intelligently determine which cells contain numbers and which contain strings. Cells that seem to contain percentages will be converted to numbers, e.g. "50%" becomes 0.5.

When converting numbers, it will assume the . character is used as a decimal point, and the , character may be used when representing numbers as a string but will be ignored. Some cultures use these characters differently when representing numbers, for example three hundred thousand and a quarter could be represented as 300.000,25. This form of numeric representation is not supported, so if it is used in your data be sure to convert it before processing it with Analyser.

The strings "true" and "false" will also be converted to the JavaScript equivalent boolean values true and false, but any other strings (such as "True") will not be converted.
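
The conversion rules above can be sketched as a small standalone function. This is not Analyser's own parsing code: the name parseCell and the regular expressions are assumptions made purely for illustration.

```javascript
// Hypothetical helper illustrating the conversion rules described above.
// This is a sketch, not Analyser's actual implementation.
function parseCell(cell) {
	const text = cell.trim();

	// Only the exact lowercase strings become booleans
	if (text === 'true') return true;
	if (text === 'false') return false;

	// Commas are ignored, so "1,556" is read as 1556
	const cleaned = text.replace(/,/g, '');

	// Percentages become fractions, so "50%" is read as 0.5
	if (/^-?\d+(\.\d+)?%$/.test(cleaned)) {
		return parseFloat(cleaned) / 100;
	}

	// Plain numbers use . as the decimal point
	if (/^-?\d+(\.\d+)?$/.test(cleaned)) {
		return parseFloat(cleaned);
	}

	// Anything else stays a string
	return cell;
}

console.log(parseCell('50%'));
console.log(parseCell('1,556'));
console.log(parseCell('True'));
```

Because commas are simply dropped, this sketch would read a value written as 300.000,25 as 300.00025, which is why data in that format needs converting before it is processed.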

fileInfo arguments are either a string representing the URL of a CSV file to load or a File object.

fileConfig arguments are a JavaScript object containing information on how the data should be processed. See the fileConfig section for more information.

callback is a function that will receive a dataConfig object representing the processed data. See the dataConfig section for more information.

(city example.csv)

let fileConfig = {
	headerRows: 1,
	cols: Analyser.getColNumbers({
		NAME: 'A',
		COUNTRY: 'B',
		POPULATION: 'C',
		CAPITAL: 'D',
		PUBLIC_TRANSPORT: 'E',
		MAYOR_2012: 'F',
		MAYOR_2018: 'G'
	}),
	arrayCols: {},
	defaultColValues: {},
	defaultCols: {},
	aliases: {
		COUNTRY: [
			['New Zealand', 'Aotearoa']
		]
	},
	enumsMap: {}
};

fileConfig.arrayCols[fileConfig.cols.PUBLIC_TRANSPORT] = ',';
fileConfig.arrayCols[fileConfig.cols.MAYOR_2018] = ',';

fileConfig.defaultColValues[fileConfig.cols.CAPITAL] = 'No';

fileConfig.defaultCols[fileConfig.cols.CAPITAL] = false;

fileConfig.enumsMap.MAYOR = [fileConfig.cols.MAYOR_2012, fileConfig.cols.MAYOR_2018];

let exploreData = function (dataConfig) {
	let rows = dataConfig.rows;
	let cols = dataConfig.cols;

	// Do stuff with the data here

	let table = rows.createSubTableString(cols);
	console.log(table);
};

Analyser.loadFile('/charter/app/docs/data/city example.csv', fileConfig, exploreData);

NAME	COUNTRY	POPULATION	CAPITAL	PUBLIC_TRANSPORT	MAYOR_2012	MAYOR_2018
Auckland	New Zealand	1614	false	Bus,Train	Len Brown	Phil Goff
Taupō	Aotearoa	32.907	false	Bus,Train	Rick Cooper	David Trewavas
Hamburg	Germany	1810	false	Bus,Train,Ferry	Olaf Scholz	Katharina Fegebank,Peter Tschentscher
Sydney	Australia	4841	false	Bus,Train,Ferry	Clover Moore	Clover Moore
Hamilton	New Zealand	161.2	false	Bus	Julia Hardaker	Andrew King
Wellington	New Zealand	381.9	true	Bus,Train,Ferry,Cable Car	Celia Wade-Brown	Justin Lester
Christchurch	New Zealand	363.926	false	Bus	Bob Parker	Lianne Dalziel
Dunedin	New Zealand	114.347	false	Bus	Dave Cull	Dave Cull
Tauranga	New Zealand	110.338	false	Bus	Stuart Crosby	Greg Brownless

combineData

combineData(...dataConfigs)

combineData takes in any number of dataConfig objects, and outputs a single combined dataConfig object.

All rows and relevant associated objects (e.g. aliases, enums) are combined, with the assumption that there are no rows duplicated between different configs. Only columns shared by each dataConfig object's set of columns are kept in the combined output; any columns that are not shared by each dataConfig will be discarded.

(city example.csv, city example 2.csv)

let fileConfigA = {
	headerRows: 1,
	cols: Analyser.getColNumbers({
		NAME: 'A',
		COUNTRY: 'B',
		POPULATION: 'C',
		CAPITAL: 'D'
	}),
	aliases: {
		COUNTRY: [
			['New Zealand', 'Aotearoa']
		]
	}
};
let filePathA = '/charter/app/docs/data/city example.csv';

let fileConfigB = {
	headerRows: 1,
	cols: Analyser.getColNumbers({
		NAME: 'A',
		COUNTRY: 'B',
		POPULATION: 'C'
	})
};
let filePathB = '/charter/app/docs/data/city example 2.csv';

let filesLoaded = function (dataConfigA, dataConfigB) {
	let combinedDataConfig = Analyser.combineData(dataConfigA, dataConfigB);
	analyseCombinedData(combinedDataConfig);
};

let analyseCombinedData = function (dataConfig) {
	let rows = dataConfig.rows;
	let cols = dataConfig.cols;

	// Do stuff with the combined data from both files here

	let table = rows.createSubTableString(cols);
	console.log(table);
};

Analyser.loadFile(
	filePathA, fileConfigA,
	filePathB, fileConfigB,
	filesLoaded
);

NAME	COUNTRY	POPULATION
Auckland	New Zealand	1614
Taupō	Aotearoa	32.907
Hamburg	Germany	1810
Sydney	Australia	4841
Hamilton	New Zealand	161.2
Wellington	New Zealand	381.9
Christchurch	New Zealand	363.926
Dunedin	New Zealand	114.347
Tauranga	New Zealand	110.338
Semarang	Indonesia	1556
Islamabad	Pakistan	1015
New Taipei City	Taiwan	3972
Nagoya	Japan	2296
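
The rule that only shared columns are kept can be sketched on its own. The function name getSharedColNames and the zero-based indices below are assumptions for illustration; Analyser.combineData may determine the shared columns differently.

```javascript
// Hypothetical sketch: find the column names present in every cols object.
// Analyser.combineData may work differently internally.
function getSharedColNames(...colsObjects) {
	const [first, ...rest] = colsObjects;
	return Object.keys(first).filter(
		name => rest.every(cols => name in cols)
	);
}

// Illustrative cols objects for the two example files
const colsA = { NAME: 0, COUNTRY: 1, POPULATION: 2, CAPITAL: 3 };
const colsB = { NAME: 0, COUNTRY: 1, POPULATION: 2 };

// CAPITAL is only defined for the first file, so it is dropped
console.log(getSharedColNames(colsA, colsB));
```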

getColNumber

getColNumber(colName)

Converts the letter-based name of a column, as commonly used in spreadsheet software, into an integer index.

If a value passed in cannot be converted in this way, null will be returned.

Typically, it is easier to use getColNumbers when creating a cols object for a fileConfig.

colName is a string representing the letter-based name of a column. It is not case sensitive.

let cols = {
	NAME: Analyser.getColNumber('A'),
	COUNTRY: Analyser.getColNumber('B'),
	POPULATION: Analyser.getColNumber('C'),
	OTHER_COL: Analyser.getColNumber('HV'),
	BROKEN_COL: Analyser.getColNumber('This is broken because it contains spaces')
};

console.log(cols);
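
For readers curious how such a conversion can work, here is a minimal standalone sketch. It is not Charter's source code: the name colNameToIndex is hypothetical, and it assumes indices are zero-based so that 'A' maps to 0.

```javascript
// Hypothetical re-implementation for illustration only.
// Assumption: indices are zero-based, so 'A' maps to 0.
function colNameToIndex(colName) {
	// Anything that is not purely letters cannot be converted
	if (typeof colName !== 'string' || !/^[a-z]+$/i.test(colName)) {
		return null;
	}

	let index = 0;
	for (const letter of colName.toUpperCase()) {
		// Treat the name as a base-26 number where A = 1 ... Z = 26
		index = index * 26 + (letter.charCodeAt(0) - 64);
	}

	return index - 1;
}

console.log(colNameToIndex('A'));
console.log(colNameToIndex('hv'));
console.log(colNameToIndex('This is broken because it contains spaces'));
```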

getColNumbers

getColNumbers(colObject)

Takes in a flat object where the value of each property is a string that could be parsed by getColNumber, and returns a new object in which each property has been transformed into the appropriate integer index.

This function is typically only used to create the cols object for a fileConfig.

Non-negative integers will not be converted, and will maintain their value. Anything else will not be included in the object returned by getColNumbers.

let cols = Analyser.getColNumbers({
	NAME: 'A',
	COUNTRY: 'B',
	POPULATION: 'C',
	OTHER_COL: 'HV',
	BROKEN_COL: 'This is broken because it contains spaces',
	NON_NEGATIVE_INTEGERS_WORK: 3
});

console.log(cols);

getCol

rows.getCol(colNum)
Analyser.getCol(rows, colNum)

Returns a single array representing the values of a single column for a set of rows of processed data.

rows is an array of rows from a dataConfig object.

colNum is the index of a column, typically represented by an element of the cols object from a dataConfig object.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let cityNames = rows.getCol(cols.NAME);

console.log(cityNames);

addCol

rows.addCol(col)
Analyser.addCol(rows, col)

Adds a new column to the set of rows, and returns the index of the new column so it can be added to the cols object and used to continue accessing the new column.

rows is an array of rows from a dataConfig object.

col is an array of the same length as rows.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let previousYearPopulation = rows.getCol(cols.POPULATION);

previousYearPopulation.pop();
rows.shift();

cols.POPULATION_PREVIOUS = rows.addCol(previousYearPopulation);

let subCols = {
	YEAR: cols.YEAR,
	POP: cols.POPULATION,
	POP_PREV: cols.POPULATION_PREVIOUS
};

let table = rows.createSubTable(subCols);
console.table(table);

getDerivedCol

rows.getDerivedCol(processFn, ...cols)
Analyser.getDerivedCol(rows, processFn, ...cols)

Creates a column of data that is the result of passing an individual row and any number of optional values from specified columns into a processing function. This does not modify the existing rows data.

rows is an array of rows from a dataConfig object.

processFn is a function of the form fn(row, optionalValue1, optionalValue2, optionalValueN)

...cols is any number of column arrays passed as additional arguments. These can be created, for example, by the getCol or getDerivedCol methods. Because columns that already exist as part of the rows object can be accessed through the row itself, these parameters are only necessary for previously created derived columns that have not been added to the rows via addDerivedCol.

The values passed through to the optionalValue1, optionalValue2, optionalValueN arguments correspond to the values at the current row for the optional specified columns.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let getRawPopulation = function (row) {
	// The population column in the spreadsheet is in thousands
	return row[cols.POPULATION] * 1000;
};

let rawPopulation = rows.getDerivedCol(getRawPopulation);

console.log(rawPopulation);
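
The example above does not use the optional ...cols arguments. The following standalone sketch shows how they could be passed through to the processing function. The getDerivedCol function defined here is a hypothetical stand-in written for this illustration, not Analyser's own code, and the sample rows are a cut-down version of the example data.

```javascript
// Hypothetical standalone version of the behaviour described above.
function getDerivedCol(rows, processFn, ...cols) {
	return rows.map((row, i) => {
		// Pick the value at the current row from each extra column
		const extraValues = cols.map(col => col[i]);
		return processFn(row, ...extraValues);
	});
}

// Illustrative data: [name, population in thousands]
const POPULATION = 1;
const rows = [
	['Auckland', 1614],
	['Hamburg', 1810],
	['Sydney', 4841]
];

// A derived column that has not been added to the rows
const rawPopulation = getDerivedCol(rows, row => row[POPULATION] * 1000);

// Pass it in as an extra column to derive a further column from it
const populationInMillions = getDerivedCol(
	rows,
	(row, raw) => raw / 1000000,
	rawPopulation
);

console.log(populationInMillions);
```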

addDerivedCol

rows.addDerivedCol(processFn, ...cols)
Analyser.addDerivedCol(rows, processFn, ...cols)

Creates a new column as per getDerivedCol, then passes it in to addCol and returns the index of the new column.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let getRawPopulation = function (row) {
	// Population stored in thousands
	return row[cols.POPULATION] * 1000;
};

cols.POP_RAW = rows.addDerivedCol(getRawPopulation);

let summaryCols = {
	NAME: cols.NAME,
	POP_RAW: cols.POP_RAW
};

let table = rows.createSubTable(summaryCols);
console.table(table);

createSubTable

rows.createSubTable(cols)
Analyser.createSubTable(rows, cols)

Creates an object suitable for passing into console.table, using the rows in rows and the columns defined in cols. Depending on the size of your data, it may be useful to create a smaller cols object that does not contain every column to pass in here.

rows is an array of rows from a dataConfig object.

cols is a columns object as created for a fileConfig object.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let summaryCols = {
	NAME: cols.NAME,
	POPULATION: cols.POPULATION
};

let summaryTable = rows.createSubTable(summaryCols);
console.table(summaryTable);

createSubTableString

rows.createSubTableString(cols)
Analyser.createSubTableString(rows, cols)

Calls createSubTable and converts the result into a string that separates cells by tabs and rows by newlines, so it can be copied and pasted into a spreadsheet.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let summaryCols = {
	NAME: cols.NAME,
	POPULATION: cols.POPULATION
};

let summaryTableString = rows.createSubTableString(summaryCols);
console.log(summaryTableString);
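
As a rough illustration of the tab-and-newline format, the sketch below builds such a string directly from an array of rows and a cols object. The function toTableString is hypothetical and skips the intermediate createSubTable step, so it should be read as a picture of the output format rather than of Analyser's implementation.

```javascript
// Hypothetical sketch of building a tab-separated table from rows and a cols object.
function toTableString(rows, cols) {
	const names = Object.keys(cols);

	// The first line holds the column labels
	const lines = [names.join('\t')];

	for (const row of rows) {
		lines.push(names.map(name => row[cols[name]]).join('\t'));
	}

	return lines.join('\n');
}

// Illustrative data: only NAME and POPULATION are included in the table
const cols = { NAME: 0, POPULATION: 2 };
const rows = [
	['Auckland', 'New Zealand', 1614],
	['Hamburg', 'Germany', 1810]
];

console.log(toTableString(rows, cols));
```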

getColSummary

rows.getColSummary(cols, aliasList)
Analyser.getColSummary(rows, cols, aliasList)

Returns an object containing a count of each time a value occurred in one or more columns, optionally counting values as being the same if specified in an optional set of aliases.

rows is an array of rows from a dataConfig object.

cols is either the index of a single column, or an array of indices of multiple columns. If an array is passed in, the summary object will contain a combined count for all specified columns.

aliasList (optional) is an object specifying aliases as used in a fileConfig object. If an aliasList is passed in, the count in the summary object will combine values in a single alias into the same count, and report them using the label of the alias.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;
let aliases = dataConfig.aliases;

let countrySummary = rows.getColSummary(cols.COUNTRY);
console.log(countrySummary);

let countrySummaryWithAliases = rows.getColSummary(cols.COUNTRY, aliases.COUNTRY);
console.log(countrySummaryWithAliases);
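
To make the effect of aliases on the counts concrete, here is a minimal standalone sketch for a single column. The function summariseCol is a hypothetical stand-in, not Analyser's code; it assumes that a value with no alias is counted under its own name.

```javascript
// Hypothetical sketch of counting values in a column, with optional aliases.
// aliasList has the same shape as aliases.COUNTRY in a fileConfig.
function summariseCol(rows, colIndex, aliasList = []) {
	const summary = {};

	for (const row of rows) {
		const value = row[colIndex];

		// Use the label (first entry) of every alias set containing the value
		const labels = aliasList
			.filter(aliasSet => aliasSet.includes(value))
			.map(aliasSet => aliasSet[0]);

		// Values without an alias are counted under their own name
		for (const label of labels.length ? labels : [value]) {
			summary[label] = (summary[label] || 0) + 1;
		}
	}

	return summary;
}

const COUNTRY = 1;
const rows = [
	['Auckland', 'New Zealand'],
	['Taupō', 'Aotearoa'],
	['Hamburg', 'Germany'],
	['Sydney', 'Australia']
];

console.log(summariseCol(rows, COUNTRY));
console.log(summariseCol(rows, COUNTRY, [['New Zealand', 'Aotearoa']]));
```

Without aliases, 'New Zealand' and 'Aotearoa' are counted separately; with the alias set, both are reported under 'New Zealand'.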

getColAsDataSeries

rows.getColAsDataSeries(col, labels)
Analyser.getColAsDataSeries(rows, col, labels)

Returns a dataSeries array as used by Charter. Each element in the output array will be the count of the number of rows containing a value in the specified column that matches the value in the element of the labels array at the same index.

rows is an array of rows from a dataConfig object.

col is the index of a single column.

labels is an array of labels, where each label is a value that appears in some cells in the specified column.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let labels = rows.getCol(cols.COUNTRY);

let dataSeries = rows.getColAsDataSeries(cols.COUNTRY, labels);
console.log(dataSeries);
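
The relationship between the labels array and the output can be sketched as follows. The function colToDataSeries is a hypothetical illustration that ignores aliases and array columns; it only shows how each output element lines up with the label at the same index.

```javascript
// Hypothetical sketch: count, for each label, the rows whose cell matches it.
function colToDataSeries(rows, colIndex, labels) {
	return labels.map(
		label => rows.filter(row => row[colIndex] === label).length
	);
}

const COUNTRY = 1;
const rows = [
	['Auckland', 'New Zealand'],
	['Hamburg', 'Germany'],
	['Wellington', 'New Zealand'],
	['Sydney', 'Australia']
];
const labels = ['New Zealand', 'Germany', 'Australia'];

// The count at index 0 belongs to labels[0], and so on
console.log(colToDataSeries(rows, COUNTRY, labels));
```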

getComparisonSummary

rows.getComparisonSummary(headerCol, headerAliases, varCol, varAliases)
Analyser.getComparisonSummary(rows, headerCol, headerAliases, varCol, varAliases)

Creates an object that can be used with console.table with the values of headerCol used in the header, and the values of varCol used for each row, with the cells denoting the number of times these values coincided.

rows is an array of rows from a dataConfig object.

headerCol is the index of the column whose values will be used for each column in the comparison summary table.

headerAliases (optional) is a set of aliases as used for a fileConfig object, to be applied to the values of headerCol.

varCol is the index of a column whose values will be used for each row in the comparison summary table.

varAliases (optional) is a set of aliases as used for a fileConfig object, to be applied to the values of varCol.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;
let aliases = dataConfig.aliases;

let capitalTable = rows.getComparisonSummary(cols.COUNTRY, aliases.COUNTRY, cols.CAPITAL);

console.table(capitalTable);
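
The idea of counting how often two values coincide can be illustrated with a small cross-tabulation sketch. The function crossTabulate is hypothetical and leaves aliases out for brevity; the exact shape of the object returned by getComparisonSummary may differ.

```javascript
// Hypothetical cross-tabulation sketch; aliases are left out for brevity.
function crossTabulate(rows, headerCol, varCol) {
	const headerValues = [...new Set(rows.map(row => row[headerCol]))];
	const table = {};

	for (const row of rows) {
		const varValue = String(row[varCol]);

		// Start each table row with a zero count for every header value
		if (!(varValue in table)) {
			table[varValue] = {};
			for (const header of headerValues) {
				table[varValue][header] = 0;
			}
		}

		table[varValue][row[headerCol]] += 1;
	}

	return table;
}

const COUNTRY = 1;
const CAPITAL = 2;
const rows = [
	['Auckland', 'New Zealand', false],
	['Wellington', 'New Zealand', true],
	['Hamburg', 'Germany', false]
];

const capitalTable = crossTabulate(rows, COUNTRY, CAPITAL);
console.table(capitalTable);
```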

getComparisonSummaryString

rows.getComparisonSummaryString(headerCol, headerAliases, varCol, varAliases)
Analyser.getComparisonSummaryString(rows, headerCol, headerAliases, varCol, varAliases)

Calls getComparisonSummary and converts the result into a string that separates cells by tabs and rows by newlines, so it can be copied and pasted into a spreadsheet.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;
let aliases = dataConfig.aliases;

let capitalTableString = rows.getComparisonSummaryString(cols.COUNTRY, aliases.COUNTRY, cols.CAPITAL);

console.log(capitalTableString);

saveComparisonSummaryCsv

saveComparisonSummaryCsv(filename, rows, headerCol, headerAliases, varCol, varAliases)

Calls getComparisonSummary and converts the result into a CSV, which is then automatically downloaded.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;
let aliases = dataConfig.aliases;

let capitalTableString = Analyser.saveComparisonSummaryCsv('Capital cities.csv', rows, cols.COUNTRY, aliases.COUNTRY, cols.CAPITAL);

fileConfig

A fileConfig object is required by the loadFile function, and is used to determine how the data in a file is processed. It contains the following properties:

headerRows

headerRows (optional) is the number of rows at the top of the CSV file that are not part of the data. These rows will be ignored when Analyser processes the file.

If not included, defaults to 0.

footerRows

footerRows (optional) is the number of rows at the bottom of the CSV file that are not part of the data. These rows will be ignored when Analyser processes the file.

If not included, defaults to 0.

cols

cols is an object where each key is the label to use for a column, and the value is the index of the column. Analyser.getColNumbers can be used to convert letter-based spreadsheet column labels to numbers.

You don't need to include every column that is in a spreadsheet in the cols object. Any that you don't include will be ignored when processing the data, and not included in the output.

let cols = Analyser.getColNumbers({
	NAME: 'A',
	COUNTRY: 'B',
	POPULATION: 'C',
	CAPITAL: 'D'
});

arrayCols

arrayCols is an object where each element's key is the index of a column, and its value is either null or a string.

It represents the columns in a CSV whose cells contain multiple values, separated by some delimiter. The value of an arrayCols element is the delimiter used to separate values in these cells. If a delimiter is not defined here, a single space character is used by default.

// Assuming cells in the PUBLIC_TRANSPORT column
// can have values such as 'Bus,Train'

let arrayCols = {};
arrayCols[fileConfig.cols.PUBLIC_TRANSPORT] = ',';
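
The delimiter rule can be sketched as a standalone function. The name splitArrayCell is hypothetical and this is not Analyser's own code; it only illustrates the fallback to a single space when no delimiter is given.

```javascript
// Hypothetical sketch of splitting a multi-value cell.
// A null or missing delimiter falls back to a single space.
function splitArrayCell(cell, delimiter) {
	const separator = (delimiter === null || delimiter === undefined) ? ' ' : delimiter;

	// An empty cell becomes an empty array
	if (cell === '') return [];

	return cell.split(separator);
}

console.log(splitArrayCell('Bus,Train,Ferry', ','));
console.log(splitArrayCell('Bus Train', null));
console.log(splitArrayCell('', ','));
```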

defaultColValues

defaultColValues is an object where each element's key is the index of a column, and its value is any value.

It represents the default values in each column of a CSV which should be ignored. For example, if the character "-" is used to represent 0 in data counting a number of incidents, then that value should be ignored. The defaultColValues object can be used in conjunction with defaultCols to assign a useful value to these cells in cases where they should not just be ignored.

If a column is set as an array column, any specified default value will be ignored.


let defaultColValues = {};
defaultColValues[fileConfig.cols.CAPITAL] = '-';

defaultCols

defaultCols is an object where each element's key is the index of a column, and its value is any value.

It represents the columns in a CSV whose cells should have a default value in the case that the cell in the CSV is empty. For example, where numeric data is represented in a CSV, an empty cell may signify a value of 0.

If a default value is not specified, the value of an empty cell will be represented as an empty string.

If a column is set as an array column, any specified default value will be ignored and an empty array will be used instead.


let defaultCols = {};
defaultCols[fileConfig.cols.CAPITAL] = false;
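
One way defaultColValues and defaultCols could work together is sketched below. This is an interpretation, not Analyser's implementation: the function applyDefaults is hypothetical, and it assumes that a cell matching the value in defaultColValues is treated exactly like an empty cell. Array columns are not handled.

```javascript
// Hypothetical sketch of how the two default settings could interact.
// Assumption: a cell matching defaultColValues is treated as if it were empty.
function applyDefaults(cell, colIndex, defaultColValues, defaultCols) {
	const isPlaceholder = colIndex in defaultColValues && cell === defaultColValues[colIndex];

	if (cell === '' || isPlaceholder) {
		// Empty cells fall back to an empty string when no default is set
		return colIndex in defaultCols ? defaultCols[colIndex] : '';
	}

	return cell;
}

// Illustrative column index and settings
const CAPITAL = 3;
const defaultColValues = { [CAPITAL]: '-' };
const defaultCols = { [CAPITAL]: false };

console.log(applyDefaults('-', CAPITAL, defaultColValues, defaultCols));
console.log(applyDefaults('', CAPITAL, defaultColValues, defaultCols));
console.log(applyDefaults(true, CAPITAL, defaultColValues, defaultCols));
```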

aliases

aliases (optional) is an object where each property, which must share its name with a column as defined in the cols object, is an array of arrays. Each array represents a set of values that should be considered to belong to the same set.

The first value in an array of aliases should be considered the label of the set of aliases. This label does not need to appear in the data itself.

A value can appear in multiple alias arrays. If it does, it may be counted multiple times as it will be considered a member of all sets simultaneously.

let aliases = {
	COUNTRY: [
		['New Zealand', 'Aotearoa']
	]
};

enumsMap

enumsMap (optional) is an object where each property, which must share its name with a column as defined in the cols object, is an array of column indices. This map is used when creating the enums object available when analysing data, and tells the processor that enums from the columns in each array should be combined into the same set.

In this example, with data from city example.csv, instead of collecting a separate set of enums for both the MAYOR_2012 and MAYOR_2018 columns, Analyser collects a single set of enums labelled MAYOR. This way there is a single list, and values that exist in both MAYOR_2012 and MAYOR_2018 columns, such as "Dave Cull", exist only once in the combined set of enums.

The enums that are collected in data processing this way can be useful in generating labels for graphs, for example.

(city example.csv)

let fileConfig = {
	headerRows: 1,
	cols: Analyser.getColNumbers({
		NAME: 'A',
		COUNTRY: 'B',
		POPULATION: 'C',
		CAPITAL: 'D',
		PUBLIC_TRANSPORT: 'E',
		MAYOR_2012: 'F',
		MAYOR_2018: 'G'
	}),
	arrayCols: {},
	defaultCols: {},
	aliases: {
		COUNTRY: [
			['New Zealand', 'Aotearoa']
		]
	},
	enumsMap: {}
};

fileConfig.arrayCols[fileConfig.cols.PUBLIC_TRANSPORT] = ',';
fileConfig.arrayCols[fileConfig.cols.MAYOR_2018] = ',';

fileConfig.defaultCols[fileConfig.cols.CAPITAL] = false;

fileConfig.enumsMap.MAYOR = [fileConfig.cols.MAYOR_2012, fileConfig.cols.MAYOR_2018];

let exploreData = function (dataConfig) {
	console.log(dataConfig.enums);
};

Analyser.loadFile('/charter/app/docs/data/city example.csv', fileConfig, exploreData);

// dataConfig.enums
{
	'CAPITAL': [
		'false', 'true'
	],
	'COUNTRY': [
		'New Zealand', 'Aotearoa', 'Germany', 'Australia'
	],
	'MAYOR': [
		'Len Brown', 'Phil Goff', 'Rick Cooper', 'David Trewavas', 'Olaf Scholz', 'Katharina Fegebank', 'Peter Tschentscher', 'Clover Moore', 'Julia Hardaker', 'Andrew King', 'Celia Wade-Brown', 'Justin Lester', 'Bob Parker', 'Lianne Dalziel', 'Dave Cull', 'Stuart Crosby', 'Greg Brownless'
	],
	NAME: [
		'Auckland', 'Taupō', 'Hamburg', 'Sydney', 'Hamilton', 'Wellington', 'Christchurch', 'Dunedin', 'Tauranga'
	],
	'POPULATION': [
		1614, 32.907, 1810, 4841, 161.2, 381.9, 363.926, 114.347, 110.338
	],
	'PUBLIC_TRANSPORT': [
		'Bus', 'Train', 'Ferry', 'Cable Car'
	]
}
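
How several columns could be collected into one combined set of enums is sketched below. The function collectEnums is hypothetical, not Analyser's code, and the rows are a cut-down version of the example data with the MAYOR_2018 cells already split into arrays.

```javascript
// Hypothetical sketch of gathering one combined set of enums from several columns.
function collectEnums(rows, colIndices) {
	const values = new Set();

	for (const row of rows) {
		for (const colIndex of colIndices) {
			// Array columns contribute each of their values separately
			const cell = row[colIndex];
			for (const value of Array.isArray(cell) ? cell : [cell]) {
				values.add(value);
			}
		}
	}

	return [...values];
}

const MAYOR_2012 = 1;
const MAYOR_2018 = 2;
const rows = [
	['Hamburg', 'Olaf Scholz', ['Katharina Fegebank', 'Peter Tschentscher']],
	['Sydney', 'Clover Moore', ['Clover Moore']],
	['Dunedin', 'Dave Cull', ['Dave Cull']]
];

// Names that appear in both columns are only listed once
console.log(collectEnums(rows, [MAYOR_2012, MAYOR_2018]));
```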

uniqueCols

uniqueCols (optional) is an array where each element is the index of a column. Enums will not be gathered for any column included in this array.

This is particularly useful for ID columns, as well as for columns whose values may not be unique but still do not need to be collated, such as columns containing dates.

let uniqueCols = [
	fileConfig.cols.NAME,
	fileConfig.cols.MAYOR,
	fileConfig.cols.POPULATION
];

dataConfig

A dataConfig object is created by data processing functions, and is used for analysing processed data. It contains the following properties:

cols

cols is the cols object from the fileConfig passed in to loadFile.

rows

rows is an Array-like object, where each element is an array that represents a row in the processed CSV and each element in an inner array represents a cell in the processed CSV. The index of each element in a row is determined by the value associated with its column in the cols object, and should be accessed using it:

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let firstRowName = rows[0][cols.NAME];

console.log(firstRowName);

aliases

aliases is the aliases object from the fileConfig used to process this data. See the section on filters for examples on how aliases can be used once the data has been processed.

enums

enums is an object where each element, which shares a name with each element of the cols object, is an array of all values that can be found in that column. If an enumsMap is used, enums collected for multiple columns can be combined into one set.

(city example.csv)

// dataConfig.enums
{
	NAME: [
		'Auckland', 'Taupō', 'Hamburg', 'Sydney', 'Hamilton', 'Wellington', 'Christchurch', 'Dunedin', 'Tauranga'
	],
	'COUNTRY': [
		'New Zealand', 'Aotearoa', 'Germany', 'Australia'
	],
	'POPULATION': [
		1614, 32.907, 1810, 4841, 161.2, 381.9, 363.926, 114.347, 110.338
	],
	'CAPITAL': [
		'false', 'true'
	],
	'PUBLIC_TRANSPORT': [
		'Bus', 'Train', 'Ferry', 'Cable Car'
	],
	'MAYOR': [
		'Len Brown', 'Phil Goff', 'Rick Cooper', 'David Trewavas', 'Olaf Scholz', 'Katharina Fegebank', 'Peter Tschentscher', 'Clover Moore', 'Julia Hardaker', 'Andrew King', 'Celia Wade-Brown', 'Justin Lester', 'Bob Parker', 'Lianne Dalziel', 'Dave Cull', 'Stuart Crosby', 'Greg Brownless'
	]
}

filters

A set of filter functions is exposed on the rows object.

These functions are also exposed via the filters property on the dataConfig object, where they expect a rows object to be passed as their first argument.

These filter functions use the aliases from the fileConfig object used to process this data. The filter functions available are:

filter

rows.filter(orToggle, colIndex1, values1, colIndex2, values2, colIndexN, valuesN)
dataConfig.filters.filterRows(rows, orToggle, colIndex1, values1, colIndex2, values2, colIndexN, valuesN)

Filters a set of rows, using either an OR filter or an AND filter, by looking at the values of one or more specified columns. These column indices and values are passed as one or more pairs.

If a column was specified in the arrayCols object in the fileConfig that informed how this data was processed, filter will check all values in each of its cells, and if any of them match then the row will pass the filter.

It returns a new rows array.

rows is an array of rows from a dataConfig object.

orToggle (optional) is a boolean value specifying whether or not the filter should be an OR filter. Defaults to false if not passed.

colIndex1, …, colIndex2, …, colIndexN are each the index of a column. Any number of them may be passed, but each must be matched by a values argument.

values1, …, values2, …, valuesN are each either a single value to filter by, an array of values to filter by (if the value of a cell matches any value in this array, it will pass the filter), or a function that takes in the value of a cell and returns a boolean value.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

// Filtering a single column by a single value, using aliases
let newZealandCities = rows.filter(
	cols.COUNTRY, 'New Zealand'
);
console.log(newZealandCities.getCol(cols.NAME));

let rows = dataConfig.rows;
let cols = dataConfig.cols;

// Filtering an array column
let citiesWithTrains = rows.filter(
	cols.PUBLIC_TRANSPORT, 'Train'
);
console.log(citiesWithTrains.getCol(cols.NAME));

let rows = dataConfig.rows;
let cols = dataConfig.cols;

// Filtering a single column by multiple values
let australasiaCities = rows.filter(
	cols.COUNTRY, ['New Zealand', 'Australia']
);
console.log(australasiaCities.getCol(cols.NAME));

let rows = dataConfig.rows;
let cols = dataConfig.cols;

// Filtering with a function
let largerCities = rows.filter(
	cols.POPULATION, a => a >= 300
);
console.log(largerCities.getCol(cols.NAME));

let rows = dataConfig.rows;
let cols = dataConfig.cols;

// Applying multiple filters (AND)
let largeCapitalCities = rows.filter(
	cols.POPULATION, a => a >= 300,
	cols.CAPITAL, a => a === 'true'
);
console.log(largeCapitalCities.getCol(cols.NAME));

let rows = dataConfig.rows;
let cols = dataConfig.cols;

// Applying multiple filters (OR)
let veryLargeOrCapitalCities = rows.filter(
	true,
	cols.POPULATION, a => a >= 1000,
	cols.CAPITAL, a => a === 'true'
);
console.log(veryLargeOrCapitalCities.getCol(cols.NAME));
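
The matching rules for single values, arrays of values, functions, and array columns can be summarised in a standalone sketch. The functions matches and filterRows below are hypothetical stand-ins, not Analyser's code: aliases are left out, and orToggle is a required first argument here rather than an optional one.

```javascript
// Hypothetical sketch of the matching rules described above; aliases are left out.
function matches(cell, values) {
	// Array columns pass if any one of their values matches
	if (Array.isArray(cell)) {
		return cell.some(item => matches(item, values));
	}
	if (typeof values === 'function') return Boolean(values(cell));
	if (Array.isArray(values)) return values.includes(cell);
	return cell === values;
}

function filterRows(rows, orToggle, ...pairs) {
	// Group the remaining arguments into [colIndex, values] pairs
	const tests = [];
	for (let i = 0; i < pairs.length; i += 2) {
		tests.push([pairs[i], pairs[i + 1]]);
	}

	return rows.filter(row => {
		const check = ([colIndex, values]) => matches(row[colIndex], values);
		return orToggle ? tests.some(check) : tests.every(check);
	});
}

const NAME = 0;
const POPULATION = 1;
const TRANSPORT = 2;
const rows = [
	['Auckland', 1614, ['Bus', 'Train']],
	['Hamilton', 161.2, ['Bus']],
	['Wellington', 381.9, ['Bus', 'Train', 'Ferry', 'Cable Car']]
];

// AND: both conditions must hold for a row to pass
const andResult = filterRows(rows, false, POPULATION, a => a >= 300, TRANSPORT, 'Ferry');
// OR: a row passes if either condition holds
const orResult = filterRows(rows, true, POPULATION, a => a >= 1000, TRANSPORT, 'Ferry');

console.log(andResult.map(row => row[NAME]));
console.log(orResult.map(row => row[NAME]));
```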

filterAnd

rows.filterAnd(colIndex1, values1, colIndex2, values2, colIndexN, valuesN)
dataConfig.filters.filterRowsAnd(rows, colIndex1, values1, colIndex2, values2, colIndexN, valuesN)

Identical to the filter function, but without the orToggle argument. Because this parameter of filter defaults to false, filter can be used in exactly the same way as filterAnd.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let newZealandCapital = rows.filterAnd(
	cols.COUNTRY, 'New Zealand',
	cols.CAPITAL, a => a === 'true'
);
console.log(newZealandCapital.getCol(cols.NAME));

filterOr

rows.filterOr(colIndex1, values1, colIndex2, values2, colIndexN, valuesN)
dataConfig.filters.filterRowsOr(rows, colIndex1, values1, colIndex2, values2, colIndexN, valuesN)

Identical to the filter function, but without the orToggle argument, and always applying an OR filter.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let largeOrCapitalCities = rows.filterOr(
	cols.POPULATION, a => a >= 1000,
	cols.CAPITAL, a => a === 'true'
);
console.log(largeOrCapitalCities.getCol(cols.NAME));
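
The matching rules used in the filter examples (a single value, an array of accepted values, a function, and columns whose cells are themselves arrays) and the difference between AND and OR filtering can be sketched in plain JavaScript. This is a minimal illustration, not Analyser's actual implementation: matchesCriterion and filterRowsSketch are hypothetical names, rows are assumed to be plain arrays indexed by column number, and the sample rows are a hand-copied subset of city example.csv.

```javascript
// Hypothetical helper: does a single cell satisfy a filter criterion?
function matchesCriterion(cell, criterion) {
	if (typeof criterion === 'function') {
		return Boolean(criterion(cell));
	}
	// An array criterion means "any of these values"
	let accepted = Array.isArray(criterion) ? criterion : [criterion];
	// An array cell (e.g. PUBLIC_TRANSPORT) matches if any element is accepted
	let cellValues = Array.isArray(cell) ? cell : [cell];
	return cellValues.some(value => accepted.includes(value));
}

// Hypothetical stand-in for filter: criteria are passed as column/value pairs
function filterRowsSketch(rows, useAnd, ...pairs) {
	let criteria = [];
	for (let i = 0; i < pairs.length; i += 2) {
		criteria.push({ col: pairs[i], values: pairs[i + 1] });
	}
	let test = row => c => matchesCriterion(row[c.col], c.values);
	return rows.filter(
		row => useAnd ? criteria.every(test(row)) : criteria.some(test(row))
	);
}

// Assumed column indices and a subset of city example.csv
const NAME = 0, POPULATION = 1, CAPITAL = 2, PUBLIC_TRANSPORT = 3;
let sampleRows = [
	['Auckland', 1614, '', ['Bus', 'Train']],
	['Wellington', 381.9, 'true', ['Bus', 'Train', 'Ferry', 'Cable Car']],
	['Dunedin', 114.347, '', ['Bus']]
];

let andNames = filterRowsSketch(sampleRows, true,
	POPULATION, a => a >= 300,
	CAPITAL, 'true'
).map(row => row[NAME]);

let orNames = filterRowsSketch(sampleRows, false,
	POPULATION, a => a >= 1000,
	CAPITAL, 'true'
).map(row => row[NAME]);

console.log(andNames); // ['Wellington']
console.log(orNames); // ['Auckland', 'Wellington']
```

With AND, a row must satisfy every column/value pair, so only Wellington remains. With OR, satisfying any one pair is enough, so Auckland is included for its population alone.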

Charter

Use

Charter contains methods for generating HTML charts. These charts are intended to be accompanied by the CSS generated for them via _chart.scss.

These methods take in a chartData object, a numericAxisConfig for the dependent axis, and a qualitativeAxisConfig for the independent axis. The chartData object in turn contains an array of dataSeries objects.

Some properties of these objects are only used by some types of charts.

chartData

{
	title: 'Chart title',
	showLegend: true,
	showTooltips: true,
	labels: ['Label 1', 'Label 2'],
	dataSeries: [
		{
			name: 'dataSeriesName',
			color: '#fff',
			dataPoints: [1, 2]
		}
	],
	stacked: true,
	horizontal: true,
	smoothing: 2
}

title is a string that will appear above the chart.

showLegend is a boolean value that, if true, will cause a legend to be rendered between the title and the chart.

showTooltips is a boolean value used only for bar charts. If true, the value of each bar will always be displayed at the end of the bar.

If false, this value will only be shown when a bar is hovered over or is clicked on (the tooltip will remain shown so long as the bar retains keyboard focus).

labels is an array of strings to be used as the labels on the independent axis.

dataSeries is an array of dataSeries objects.

stacked is a boolean value used only for bar charts. If true, multiple dataSeries will be displayed as bars stacked on top of one another, instead of the default display of having them next to each other within the same label.

horizontal is a boolean value used only for bar charts. If true, the independent axis will be the vertical axis, with bars extending from left to right instead of from bottom to top.

smoothing is an integer used only for line graphs and scatter plots. If set, the data will be smoothed using a rolling average of the specified size.

dataSeries

{
	name: 'dataSeriesName',
	color: '#fff',
	dataPoints: [1, 2]
}

name is a string that will be used in a legend if one is shown.

color is the colour used to denote the data in this dataSeries on the chart. If unset, it will default to the colour set in the CSS (#999 in this project).

dataPoints is an array where each element is a number or an object, representing the data you want to draw on the chart.

If a dataPoint is a number, it will inherit its colour from its dataSeries.

If a dataPoint is an object, it expects to have its number specified as the value property. It will inherit its colour from its dataSeries by default, but this can be overridden via a color property.

In a line graph, specifying the colour of an individual dataPoint will affect the colour of the hover/focus bubble and tooltip border for that point, but not the line itself.
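
The colour inheritance rules for dataPoints can be sketched as a small normalising function. This is an illustration under assumptions, not Charter's code: resolveDataPoint is a hypothetical name, and the DEFAULT_COLOR constant only stands in for the fallback colour that Charter takes from its CSS.

```javascript
// Stand-in for the fallback colour set in the CSS (#999 in this project)
const DEFAULT_COLOR = '#999';

// Hypothetical helper: resolve a dataPoint to a { value, color } pair
function resolveDataPoint(dataPoint, dataSeries) {
	let seriesColor = dataSeries.color || DEFAULT_COLOR;
	if (typeof dataPoint === 'number') {
		// A plain number always inherits the colour of its dataSeries
		return { value: dataPoint, color: seriesColor };
	}
	// An object may override the colour via its own color property
	return {
		value: dataPoint.value,
		color: dataPoint.color || seriesColor
	};
}

let series = {
	name: 'Population',
	color: '#f00',
	dataPoints: [1614, { value: 381.9, color: '#00f' }, { value: 114.347 }]
};

let resolved = series.dataPoints.map(
	dataPoint => resolveDataPoint(dataPoint, series)
);
console.log(resolved);
// [
//   { value: 1614, color: '#f00' },
//   { value: 381.9, color: '#00f' },
//   { value: 114.347, color: '#f00' }
// ]
```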

numericAxisConfig

{
	label: 'Axis title',

	values: 5,
	valuesAt: [],
	gridlines: null,

	toFixed: 0,
	percentage: false,

	roundTo: null,

	min: 0,
	max: null
}

label is a string that will be used for the label of the axis.

values is an integer, representing the number of values to show on the axis. This number excludes the minimum value, which will always be shown. These values will be evenly spread between the minimum and the maximum. values must be at least 1, which would result in only the minimum and maximum values being shown.

valuesAt is an array that can be used to specify particular values that should be shown on the axis. These values are displayed in addition to those already displayed based on the values option.
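
How values and valuesAt combine can be sketched as follows. axisValuesSketch is a hypothetical helper written for this illustration, and the min and max arguments are assumed to be already known. Charter's own calculation may differ in detail, for example in how it applies roundTo.

```javascript
// Hypothetical helper: list the values labelled on a numeric axis
function axisValuesSketch(min, max, values, valuesAt = []) {
	// The minimum is always shown and is not counted in "values"
	let result = [min];
	let step = (max - min) / values;
	for (let i = 1; i <= values; i++) {
		result.push(min + step * i);
	}
	// Extra values are added alongside the evenly spread ones
	for (let extra of valuesAt) {
		if (!result.includes(extra)) {
			result.push(extra);
		}
	}
	return result.sort((a, b) => a - b);
}

console.log(axisValuesSketch(0, 6000, 1)); // [0, 6000]
console.log(axisValuesSketch(0, 6000, 2)); // [0, 3000, 6000]
console.log(axisValuesSketch(0, 6000, 2, [1000])); // [0, 1000, 3000, 6000]
```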

gridlines is an integer, representing the number of evenly spaced gridlines to be displayed. By default, this will be the same as the value of values. Additional values inserted via the valuesAt option will always get their own gridline, regardless of this option.

toFixed is an integer, representing the number of decimal places to display for the numbers in the graph.

percentage is a boolean, representing whether or not the numbers in the graph should be treated as percentages. If true, numbers between 0 and 1 will be treated as though they were between 0% and 100%, and the labels will be updated accordingly.
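
The combined effect of toFixed and percentage on a label can be sketched like this. formatAxisValue is a hypothetical helper, and the sketch assumes toFixed is passed through to JavaScript's Number.prototype.toFixed, which fixes the number of digits after the decimal point.

```javascript
// Hypothetical helper: format a number for display on a numeric axis
function formatAxisValue(value, { toFixed = 0, percentage = false } = {}) {
	if (percentage) {
		// Numbers between 0 and 1 are shown as 0% to 100%
		return (value * 100).toFixed(toFixed) + '%';
	}
	return value.toFixed(toFixed);
}

console.log(formatAxisValue(3000)); // '3000'
console.log(formatAxisValue(381.9, { toFixed: 1 })); // '381.9'
console.log(formatAxisValue(0.5, { percentage: true })); // '50%'
console.log(formatAxisValue(0.125, { toFixed: 1, percentage: true })); // '12.5%'
```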

roundTo is a number, which all the values labelled on the axis should be rounded to. By default, it will be calculated as the power of 10 at the same order of magnitude as the maximum value. For example, if the largest value is 4,841, then roundTo would automatically be set to 1,000.
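
The default roundTo calculation can be written out directly. Both function names here are hypothetical, the maximum value is assumed to be positive, and rounding to the nearest multiple (rather than always up or always down) is an assumption made for this sketch.

```javascript
// The power of 10 at the same order of magnitude as the maximum value
function defaultRoundTo(maxValue) {
	return Math.pow(10, Math.floor(Math.log10(maxValue)));
}

// Assumption: axis values are rounded to the nearest multiple of roundTo
function roundToMultiple(value, roundTo) {
	return Math.round(value / roundTo) * roundTo;
}

console.log(defaultRoundTo(4841)); // 1000
console.log(roundToMultiple(4841, 1000)); // 5000
console.log(roundToMultiple(4841, 100)); // 4800
```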

min is the minimum number to display on the axis. If set to null, it will be calculated based on the data displayed in the graph.

max is the maximum number to display on the axis. By default, or if set to null, it will be calculated based on the data displayed in the graph.

qualitativeAxisConfig

{
	label: 'Axis title',

	valuesEvery: 1,
	valuesSkip: 0,

	gridlinesEvery: null,
	gridlinesSkip: null
}

label is a string that will be used for the label of the axis.

valuesEvery is an integer, representing the rate at which a label should be shown for each value along the axis. If set to 1, all values will have a label. If set to 2, only every second value will have a label.

valuesSkip is an integer, representing the number of labels to skip at the start of the axis. Not supported for bar graphs.

gridlinesEvery is an integer, representing the rate at which gridlines should be shown for each value along the axis. If set to null, its value will be inherited from valuesEvery.

gridlinesSkip is an integer, representing the number of gridlines to skip at the start of the axis. If set to null, its value will be inherited from valuesSkip. Not supported for bar graphs.
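
One way to picture valuesEvery and valuesSkip is as a filter over the labels array. visibleLabels is a hypothetical helper, and the way the two options interact here (skip first, then show every Nth label) is an assumption for illustration only.

```javascript
// Hypothetical helper: which labels are displayed on a qualitative axis
// Assumption: valuesSkip labels are skipped first, then every
// valuesEvery-th label from that point on is shown
function visibleLabels(labels, valuesEvery = 1, valuesSkip = 0) {
	return labels.filter(
		(label, i) => i >= valuesSkip && (i - valuesSkip) % valuesEvery === 0
	);
}

let years = [1991, 1992, 1993, 1994, 1995, 1996];

console.log(visibleLabels(years)); // [1991, 1992, 1993, 1994, 1995, 1996]
console.log(visibleLabels(years, 2)); // [1991, 1993, 1995]
console.log(visibleLabels(years, 2, 1)); // [1992, 1994, 1996]
```

The same function would describe gridlinesEvery and gridlinesSkip if gridlines follow the same pattern.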

createTable

createTable(rows, cols)

Creates an HTML string of a table containing the data it received.

rows is an array of rows from a dataConfig object.

cols is a columns object as created for a fileConfig object.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let table = Charter.createTable(rows, cols);

return table;

createBarChart

createBarChart(chartData, dependentAxisConfig, independentAxisConfig)

Creates a jQuery object of an element containing a bar chart, which can be inserted into the DOM.

chartData is a chartData object.

dependentAxisConfig is a numericAxisConfig object.

independentAxisConfig is a qualitativeAxisConfig object.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let cityNames = rows.getCol(cols.NAME);

let chartData = {
	title: 'City Populations',
	showTooltips: true,
	horizontal: true,
	labels: cityNames,
	dataSeries: [
		{
			dataPoints: rows.getCol(cols.POPULATION)
		}
	]
};

let dependentAxisConfig = {
	label: 'Population',
	roundTo: 100,
	max: 6000,
	values: 2
};

let chart = Charter.createBarChart(chartData, dependentAxisConfig);

return chart;

createLineGraph

createLineGraph(chartData, dependentAxisConfig, independentAxisConfig)

Creates a jQuery object of an element containing a line graph, which can be inserted into the DOM.

chartData is a chartData object.

dependentAxisConfig is a numericAxisConfig object.

independentAxisConfig is a qualitativeAxisConfig object.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let years = rows.getCol(cols.YEAR);
let population = rows.getCol(cols.POPULATION);

let chartData = {
	title: 'Auckland city population over time',
	showLegend: true,
	labels: years,
	dataSeries: [
		{
			name: 'Population',
			color: '#f00',
			dataPoints: population
		},
		{
			name: 'Population linear fit',
			color: 'rgba(0, 0, 0, 0.5)',
			dataPoints: Stats.linearLeastSquares(population)
		}
	]
};

let dependentAxisConfig = {
	label: 'Population',
	values: 5,
	roundTo: 1000000,
	min: null
};

let independentAxisConfig = {
	label: 'Time',
	valuesEvery: 2
};

let chart = Charter.createLineGraph(
	chartData,
	dependentAxisConfig,
	independentAxisConfig
);

return chart;

createScatterPlot

createScatterPlot(chartData, dependentAxisConfig, independentAxisConfig)

Creates a jQuery object of an element containing a scatter plot, which can be inserted into the DOM.

chartData is a chartData object.

dependentAxisConfig is a numericAxisConfig object.

independentAxisConfig is a qualitativeAxisConfig object.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let years = rows.getCol(cols.YEAR);
let population = rows.getCol(cols.POPULATION);

let chartData = {
	title: 'Auckland city population over time',
	showLegend: true,
	labels: years,
	dataSeries: [
		{
			name: 'Population',
			color: '#f00',
			dataPoints: population
		},
		{
			name: 'Population linear fit',
			color: 'rgba(0, 0, 0, 0.5)',
			dataPoints: Stats.linearLeastSquares(population)
		}
	]
};

let dependentAxisConfig = {
	values: 5,
	roundTo: 10000,
	min: null
};

let independentAxisConfig = {
	valuesEvery: 2
};

let chart = Charter.createScatterPlot(
	chartData,
	dependentAxisConfig,
	independentAxisConfig
);

return chart;

updateBarChart

updateBarChart(chart, data, titleText)

Updates the values and optionally the title of an existing bar chart.

chart is an Element object.

data is an array of numbers, corresponding to the values of each bar in the chart being updated.

titleText is an optional string to replace the current title of the chart.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let cityNames = rows.getCol(cols.NAME);

let chartData = {
	title: 'City Populations',
	showTooltips: true,
	labels: cityNames,
	dataSeries: [
		{
			dataPoints: rows.getCol(cols.POPULATION)
		}
	]
};

let dependentAxisConfig = {
	roundTo: 100,
	values: 2
};

let chart = Charter.createBarChart(chartData, dependentAxisConfig);

$('#chart-area').append(chart);

window.setTimeout(() => Charter.updateBarChart(chart[0], [1, 2, 3, 4, 5, 6, 7, 8, 9], 'New title'), 1000);

Stats

Use

Stats contains a number of utility functions for common tasks when dealing with sets of numbers.

sum

sum(values)

Returns the sum of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let totalPopulation = Stats.sum(populations);

console.log(populations);
console.log(totalPopulation);

mean

mean(values)

Calculates the mean average of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let meanPopulation = Stats.mean(populations);

console.log(populations);
console.log(meanPopulation);
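
For reference, sum and mean compute the following. This is a plain JavaScript sketch with hypothetical function names, using three population values from city example.csv, rather than the Stats source itself.

```javascript
// Add up every number in the array
function sumSketch(values) {
	return values.reduce((total, value) => total + value, 0);
}

// The mean is the sum divided by the number of values
function meanSketch(values) {
	return sumSketch(values) / values.length;
}

let samplePopulations = [1614, 1810, 4841];

console.log(sumSketch(samplePopulations)); // 8265
console.log(meanSketch(samplePopulations)); // 2755
```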

median

median(values)

Calculates the median average of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let medianPopulation = Stats.median(populations);

console.log(populations);
console.log(medianPopulation);
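
The median is the middle value of the sorted data. The sketch below uses the usual convention of averaging the two middle values when the array has an even length. Whether Stats.median handles that case the same way is an assumption, and medianSketch is a hypothetical name.

```javascript
function medianSketch(values) {
	// Sort a copy numerically so the input array is left untouched
	let sorted = [...values].sort((a, b) => a - b);
	let middle = Math.floor(sorted.length / 2);
	if (sorted.length % 2 === 1) {
		return sorted[middle];
	}
	// Even length: average the two middle values
	return (sorted[middle - 1] + sorted[middle]) / 2;
}

console.log(medianSketch([1614, 32.907, 1810])); // 1614
console.log(medianSketch([1614, 1810, 4841, 161.2])); // 1712
```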

variance

variance(values)

Calculates the variance of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let populationVariance = Stats.variance(populations);

console.log(populations);
console.log(populationVariance);

sd

sd(values)

Calculates the standard deviation of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let populationSD = Stats.sd(populations);

console.log(populations);
console.log(populationSD);
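
Variance and standard deviation are closely related: the standard deviation is the square root of the variance. The sketch below uses the population form, dividing by the number of values. This documentation does not state whether Stats.variance divides by n or by n - 1, so treat that choice, and the function names, as assumptions.

```javascript
// Population variance: the mean of the squared distances from the mean
function varianceSketch(values) {
	let mean = values.reduce((total, value) => total + value, 0) / values.length;
	let squaredDiffs = values.map(value => (value - mean) ** 2);
	return squaredDiffs.reduce((total, value) => total + value, 0) / values.length;
}

// Standard deviation: the square root of the variance
function sdSketch(values) {
	return Math.sqrt(varianceSketch(values));
}

let sample = [2, 4, 4, 4, 5, 5, 7, 9];

console.log(varianceSketch(sample)); // 4
console.log(sdSketch(sample)); // 2
```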

max

max(values)

Calculates the maximum of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let populationMax = Stats.max(populations);

console.log(populations);
console.log(populationMax);

min

min(values)

Calculates the minimum of the set of numbers in the values array.

values is an array of numbers.

(city example.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let populations = rows.getCol(cols.POPULATION);
let min = Stats.min(populations);

console.log(populations);
console.log(min);

intRange

intRange(start, finish)

Creates an array of integers from a start value to a finish value.

start is a number. If it isn't an integer, it will be converted to one first via Math.round.

finish is a number. If it isn't an integer, it will be converted to one first via Math.round.

let intRange = Stats.intRange(3, 5);

console.log(intRange);
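
A sketch of the behaviour described above. intRangeSketch is a hypothetical name, and including the finish value in the result is an assumption, since the description does not say whether the range is inclusive.

```javascript
// Assumption: both the start and finish values are included
function intRangeSketch(start, finish) {
	let first = Math.round(start);
	let last = Math.round(finish);
	let range = [];
	for (let i = first; i <= last; i++) {
		range.push(i);
	}
	return range;
}

console.log(intRangeSketch(3, 5)); // [3, 4, 5]
console.log(intRangeSketch(2.6, 5.2)); // [3, 4, 5]
```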

linearLeastSquares

linearLeastSquares(y, x)

Creates a linear regression fit for a set of data using the "least squares" method.

y is an array of values representing the data that will be fit.

x is an optional array of values representing the independent axis value for each value in the y array. It is only necessary if the values in the y array are not evenly spaced along the independent axis. Evenly spaced data, such as one data point per year in a time series, does not need it.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let years = rows.getCol(cols.YEAR);
let population = rows.getCol(cols.POPULATION);

let linearFit = Stats.linearLeastSquares(population);

let chartData = {
	title: 'Auckland city population over time',
	showLegend: true,
	labels: years,
	dataSeries: [
		{
			name: 'Population',
			color: '#f00',
			dataPoints: population
		},
		{
			name: 'Population linear fit',
			color: 'rgba(255, 255, 255, 0.5)',
			dataPoints: linearFit
		}
	]
};

let dependentAxisConfig = {
	values: 5,
	roundTo: 1000000,
	min: null
};

let independentAxisConfig = {
	valuesEvery: 2
};

let chart = Charter.createLineGraph(
	chartData,
	dependentAxisConfig,
	independentAxisConfig
);

$('#chart-area').append(chart);
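
The least squares method itself can be sketched in a few lines. Because the examples pass the result straight into dataPoints, this sketch returns the fitted y value for each point. linearFitSketch is a hypothetical name, and using 0, 1, 2, ... as the default x values is an assumption.

```javascript
function linearFitSketch(y, x) {
	// Assumption: when x is omitted, points are evenly spaced at 0, 1, 2, ...
	x = x || y.map((value, i) => i);
	let n = y.length;
	let meanX = x.reduce((total, value) => total + value, 0) / n;
	let meanY = y.reduce((total, value) => total + value, 0) / n;

	let sxy = 0; // sum of (x - meanX) * (y - meanY)
	let sxx = 0; // sum of (x - meanX) squared
	for (let i = 0; i < n; i++) {
		sxy += (x[i] - meanX) * (y[i] - meanY);
		sxx += (x[i] - meanX) ** 2;
	}

	let slope = sxy / sxx;
	let intercept = meanY - slope * meanX;

	// Return the value of the fitted line at each x
	return x.map(xi => intercept + slope * xi);
}

console.log(linearFitSketch([1, 3, 5, 7])); // [1, 3, 5, 7]
console.log(linearFitSketch([2, 3, 7])); // [1.5, 4, 6.5]
```

Data that already lies on a straight line is returned unchanged, while the second call shows the fitted line passing between the original points.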

r

r(y, x)

Calculates the Pearson Correlation Coefficient between two sets of data of equal length.

y is an array of values.

x is an array of values of equal length to y.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let years = rows.getCol(cols.YEAR);
let population = rows.getCol(cols.POPULATION);

let linearFit = Stats.linearLeastSquares(population);

let r = Stats.r(linearFit, population);

console.log(r);

r2

r2(y, x)

Calculates the r² value of a regression model, equivalent to the square of the Pearson Correlation Coefficient calculated by r.

y is an array of values.

x is an array of values of equal length to y.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let years = rows.getCol(cols.YEAR);
let population = rows.getCol(cols.POPULATION);

let linearFit = Stats.linearLeastSquares(population);

let r2 = Stats.r2(linearFit, population);

console.log(r2);
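
The relationship between r and r2 can be sketched with the standard formula for the Pearson Correlation Coefficient. The function names are hypothetical and this is not taken from the Stats source.

```javascript
function pearsonSketch(y, x) {
	let n = y.length;
	let meanX = x.reduce((total, value) => total + value, 0) / n;
	let meanY = y.reduce((total, value) => total + value, 0) / n;

	let sxy = 0;
	let sxx = 0;
	let syy = 0;
	for (let i = 0; i < n; i++) {
		sxy += (x[i] - meanX) * (y[i] - meanY);
		sxx += (x[i] - meanX) ** 2;
		syy += (y[i] - meanY) ** 2;
	}

	// Covariance divided by the product of the standard deviations
	return sxy / Math.sqrt(sxx * syy);
}

// r squared is the square of the correlation coefficient
function r2Sketch(y, x) {
	return pearsonSketch(y, x) ** 2;
}

console.log(pearsonSketch([2, 4, 6], [1, 2, 3])); // 1
console.log(pearsonSketch([6, 4, 2], [1, 2, 3])); // -1
console.log(r2Sketch([1, 2, 2], [1, 2, 3])); // approximately 0.75
```

A value of 1 or -1 means the two arrays are perfectly linearly related, and squaring removes the sign, so r2 always falls between 0 and 1.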

smooth

smooth(y, smoothness)

Smooths an array of data by converting it into a rolling average using a number of points based on its smoothness argument. The output array will be shorter than the input y array by one less than the value of smoothness.

y is an array of values.

smoothness is an integer between 1 and the length of y (using a value of 1 will return the input y array).

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let years = rows.getCol(cols.YEAR);
let population = rows.getCol(cols.POPULATION);

let smoothness = 5;
let smoothed = Stats.smooth(population, smoothness);

let chartData = {
	title: 'Auckland city population over time',
	showLegend: true,
	labels: years.slice((smoothness-1)),
	dataSeries: [
		{
			name: 'Smoothed population',
			color: 'rgba(255, 255, 255, 0.5)',
			dataPoints: smoothed
		}
	]
};

let dependentAxisConfig = {
	values: 5,
	roundTo: 1000000,
	min: null
};

let independentAxisConfig = {
	valuesEvery: 2
};

let chart = Charter.createLineGraph(
	chartData,
	dependentAxisConfig,
	independentAxisConfig
);

$('#chart-area').append(chart);
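
A rolling average of this kind can be sketched as below. smoothSketch is a hypothetical name. The sketch averages each value with the values before it (a trailing window), which matches the way the example above drops the first smoothness - 1 labels, but the exact window alignment used by Stats.smooth is an assumption.

```javascript
function smoothSketch(y, smoothness) {
	let smoothed = [];
	// Each output value is the mean of the "smoothness" values ending at "end"
	for (let end = smoothness; end <= y.length; end++) {
		let windowValues = y.slice(end - smoothness, end);
		let total = windowValues.reduce((sum, value) => sum + value, 0);
		smoothed.push(total / smoothness);
	}
	return smoothed;
}

console.log(smoothSketch([1, 2, 3, 4, 5], 3)); // [2, 3, 4]
console.log(smoothSketch([1, 2, 3], 1)); // [1, 2, 3]
```

With five inputs and a smoothness of 3, the output has three values: two fewer than the input, or one less than the value of smoothness.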

chunk

chunk(y, chunkSize)

Converts an array of data into an array where each element is the sum of chunkSize elements of the input y array. This is useful, for example, for combining monthly data into yearly sums.

The length of the input y array should be evenly divisible by chunkSize, and the output array will be 1/chunkSize as long as the input array.

For example, if chunkSize is 3, the output array will be one third as long as the input array, as each of its elements is the sum of 3 elements of the input array.

y is an array of values.

chunkSize is a positive integer.

(city example 3.csv)

let rows = dataConfig.rows;
let cols = dataConfig.cols;

let chunkSize = 5;
// Make sure rows.length will be divisible by chunkSize after removing one
while (((rows.length-1) % chunkSize) !== 0) {
	rows.shift();
}

let lastYearPopulation = rows.getCol(cols.POPULATION);

// Remove last "last year population" and first year's row, so the
// arrays line up as expected
lastYearPopulation.pop();
rows.shift();

let years = rows.getCol(cols.YEAR);

cols.POPULATION_PREVIOUS = rows.addCol(lastYearPopulation);
cols.POPULATION_INCREASE = rows.addDerivedCol((row) => (row[cols.POPULATION] - row[cols.POPULATION_PREVIOUS]));

let populationIncrease = rows.getCol(cols.POPULATION_INCREASE);
let chunkedPopulationIncrease = Stats.chunk(populationIncrease, chunkSize);

let yearSets = [];
for (let i = 0; i < years.length; i += chunkSize) {
	yearSets.push(years[i] + '-' + years[i+(chunkSize-1)]);
}

let chartData = {
	title: 'Auckland city population increase',
	showLegend: true,
	labels: yearSets,
	dataSeries: [
		{
			name: 'Population increase',
			dataPoints: chunkedPopulationIncrease
		}
	]
};

let dependentAxisConfig = {
	values: 5,
	min: 0
};

let independentAxisConfig = {};

let chart = Charter.createBarChart(
	chartData,
	dependentAxisConfig,
	independentAxisConfig
);

$('#chart-area').append(chart);
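
Stripped of the chart setup, the chunking step itself amounts to the following. chunkSketch is a hypothetical name and the monthly figures are made up for illustration. If the input length is not divisible by chunkSize, this sketch simply sums a shorter final chunk, which may not be what Stats.chunk does.

```javascript
function chunkSketch(y, chunkSize) {
	let chunked = [];
	for (let i = 0; i < y.length; i += chunkSize) {
		// Sum each run of chunkSize consecutive values
		let chunkValues = y.slice(i, i + chunkSize);
		chunked.push(chunkValues.reduce((total, value) => total + value, 0));
	}
	return chunked;
}

// Made-up monthly figures for one year
let monthly = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];

console.log(chunkSketch(monthly, 3)); // quarterly sums: [6, 15, 24, 33]
console.log(chunkSketch(monthly, 12)); // yearly sum: [78]
```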