Concepts in brief
This chapter concisely presents the concepts at the core of Squey. Understanding every concept is not mandatory to start using the software, but readers of this manual are strongly encouraged to read this chapter before diving into the application.
Readers in a real hurry should note that each section of this chapter starts with a very short presentation of the corresponding concept. We advise all readers to read at least these short definitions.
Moreover, the organisation of this chapter is logical rather than alphabetical: each new concept relies only on concepts that have already been defined. Where a term depends on one defined later in this chapter, a link to it is given.
To make this chapter easier to use as a reference, we also provide an alphabetically ordered list of links to the concepts:
An Event is a well-defined block of information, commonly represented by a line in a file.
An Event is the basic unit of data that makes up a whole set of Data. For example, in the Log file of a web server, if every line has the same structure, then each individual line of this file can be considered an Event.
When data comes from an industrial process, an Event can consist of the measurements taken (say, every minute) by a set of physical sensors that monitor the state of an industrial plant.
Events are commonly stored as fixed-size tuples (such as in a CSV file); however, some appliances or applications use a tree-based block structure (such as XML) to store the information.
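As a minimal sketch of the tuple representation, assuming a hypothetical three-field CSV log (the sample values and the helper name `read_events` are invented for illustration and are not part of Squey), each line can be read as one fixed-size tuple:

```python
import csv
import io

# Hypothetical sample input: one Event per line, three Fields per Event.
SAMPLE = (
    "2024-01-01 10:00:00,alice,200\n"
    "2024-01-01 10:00:05,bob,404\n"
)

def read_events(text):
    """Return one tuple per line of CSV text; each tuple stands for one Event."""
    return [tuple(row) for row in csv.reader(io.StringIO(text))]

events = read_events(SAMPLE)
```

Every tuple in `events` has the same length because every line of the sample has the same structure, which is the property the definition above relies on.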
Input is a generic term for a dataset that can be processed by Squey. Most of the time, it is a plain-text file sitting on a local filesystem. It can also be a file accessed on a remote computer or the result of a query sent to a database.
A login name, a URL or a port number are other examples of typical Fields that are commonly found in Log files.
a URI can be further split into Scheme, Hostname, Port and Resources;
a Timestamp can be split into year, month, day, hours, minutes and seconds.
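The two splits listed above can be sketched with the Python standard library. This is only an illustration: the function names, the timestamp layout and the choice of treating the URI path as the "Resources" part are assumptions made for the example, not Squey's own splitting rules.

```python
from datetime import datetime
from urllib.parse import urlsplit

def split_uri(uri):
    """Split a URI into the sub-Fields named in the text (illustrative only)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port,       # None when the URI carries no explicit port
        "resources": parts.path,  # assumption: "Resources" is taken to be the path
    }

def split_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Split a timestamp string into (year, month, day, hours, minutes, seconds)."""
    t = datetime.strptime(text, fmt)  # fmt is an assumed layout
    return (t.year, t.month, t.day, t.hour, t.minute, t.second)

uri_fields = split_uri("http://example.com:8080/index.html")
time_fields = split_timestamp("2024-01-01 10:00:05")
```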
A Column is the natural representation of all the values taken by a given Field in a given Input, starting with the first Event in the first row and ending with the last Event in the last row of the Column.
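One way to picture this, as a rough sketch that assumes Events are held as tuples in Input order (the helper name `column` and the sample data are hypothetical):

```python
def column(events, field_index):
    """Collect the values of one Field across all Events, in Event order."""
    return [event[field_index] for event in events]

# Hypothetical Events with two Fields each: (login name, status code).
sample_events = [("alice", "200"), ("bob", "404"), ("carol", "200")]
status_column = column(sample_events, 1)
```

The first row of `status_column` comes from the first Event and the last row from the last Event.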
A Plotting is the second function that determines how values contained in Fields are plotted to a finite portion of a mathematical axis. The main purpose of a Plotting is to set the scale and range applied to the mathematical values computed by the previous Mapping operation.
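To make the idea of scale and range concrete, here is a minimal sketch assuming that the Mapping step has already produced plain numbers and that the finite portion of the axis is the interval [0, 1]. The function names and the two scales shown are illustrative assumptions, not Squey's actual Plotting implementations.

```python
import math

def plot_linear(values):
    """Rescale mapped values linearly so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: place them in the middle of the axis.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def plot_log(values):
    """Apply a base-10 logarithmic scale, then rescale to [0, 1].

    Requires strictly positive values.
    """
    return plot_linear([math.log10(v) for v in values])

linear_positions = plot_linear([0, 5, 10])
log_positions = plot_log([1, 10, 100])
```

The same mapped values can land at different positions on the axis depending on the scale chosen, which is why the two steps are kept separate.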
An Axes combination is a very natural concept in Squey. After a given Input has been processed by Squey, a certain number of Fields are available and form a set of Columns/Axes. From this set, users can select the Fields they really need for their investigation and determine in which order these Fields should be arranged.
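Selecting and ordering Fields can be sketched as follows, assuming Columns are stored in a dictionary keyed by Field name (the data layout, the Field names and the helper `combine_axes` are hypothetical):

```python
def combine_axes(columns, selection):
    """Return the selected Columns as (name, values) pairs, in the requested order."""
    return [(name, columns[name]) for name in selection]

# Hypothetical Columns obtained after processing an Input.
available = {
    "time": ["10:00", "10:05"],
    "login": ["alice", "bob"],
    "status": ["200", "404"],
}

# Keep only two Fields and put "status" first.
combination = combine_axes(available, ["status", "time"])
```

Nothing in this sketch prevents the same name from appearing twice in `selection`; whether repeating an axis is useful depends on the investigation.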
their type: ‘IP’, ‘integer’, ‘string’, and so on;
their displayed name: ‘request’, ‘response’, ‘error code’, and so on;
their associated Mapping;
their associated Plotting;
and a few more parameters.
As said earlier, a Format describes how Events are extracted from an Input and split into Fields. Sometimes the extraction and splitting process described by a given Format fails for some Events.
This can happen for different reasons:
An Investigation is what Squey users use to save their work so that they can resume their analysis later. It allows them to save:
An event is ‘selectable’ if it belongs to the current layer’s event set.
A zone is a pair of Axes or of Columns.
A Data tree is a tree-based hierarchical representation of all existing Views in an Investigation. It also provides the list of Data collections, Sources, Mappings, and Plottings of the current Investigation.