Leliel Tutorial (Returns Management)

In this tutorial we create a classifier that predicts whether a product will be returned, based on previous purchases. If you want to follow the tutorial step by step, you can download the example data file from this link.

Angel Description

A classifier is a machine learning algorithm that receives an object and attaches “tags” or “labels” to it. Leliel is a classifier built on top of the Ramiel similarity engine and provides unmatched performance and accountability features.


File Specification:

ID: A mandatory field which uniquely identifies each object.
CLASS: A mandatory field which holds the class label to be predicted.
REAL: Numerical values.
NOMINAL: Values that do not bear a quantitative relationship with each other (i.e., strings, and numbers which represent non-numerical information).
MULTI_PLAIN: Multiple NOMINAL values separated by spaces. Not language specific.
MULTI_ENGLISH: Multiple NOMINAL values separated by spaces. The text is in English.
MULTI_SPANISH: Multiple NOMINAL values separated by spaces. The text is in Spanish.
MULTI_JAPANESE: Multiple NOMINAL values separated by spaces. The text is in Japanese.
ITEM_SET: A series of values with weights, formatted as item1:weight1;item2:weight2;item3:weight3.
IGNORE: The column is ignored by the program.
META: The column holds metadata and is ignored by the program, but its information is retained in the output.
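
As an illustration of the ITEM_SET format, here is a minimal Python sketch that parses and serializes such a cell. The function names parse_item_set and format_item_set are hypothetical helpers, not part of the product, and the sketch assumes that weights are numeric.

```python
from __future__ import annotations


def parse_item_set(cell: str) -> dict[str, float]:
    """Parse 'item1:weight1;item2:weight2' into {item: weight}."""
    items: dict[str, float] = {}
    for pair in cell.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';' or an empty cell
        # Split on the last ':' so item names may themselves contain ':'.
        name, sep, weight = pair.rpartition(":")
        if not sep or not name:
            raise ValueError(f"malformed ITEM_SET entry: {pair!r}")
        items[name] = float(weight)  # assumption: weights are numeric
    return items


def format_item_set(items: dict[str, float]) -> str:
    """Serialize {item: weight} back into the ITEM_SET cell format."""
    return ";".join(f"{name}:{weight:g}" for name, weight in items.items())
```

Checking cells this way before uploading can catch a malformed entry early, since a cell without a ':' separator raises an error.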

 

The file should meet these specifications:

  • The file should be in TSV (tab-separated values) format; the delimiter is ‘\t’.
  • The file must have a header row with the same column names that were specified.
  • All files in the folder are read, so every file must follow this format.
  • TSV Quote Character = ‘"’
  • TSV Line End = ‘\n’
  • TSV Escape Character = ‘\’
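
The settings above map directly onto Python's standard csv module. The following is a minimal sketch of writing and reading a file with those settings; the helper names rows_to_tsv and tsv_to_rows are hypothetical, and setting doublequote=False (so that embedded quotes are escaped with ‘\’ rather than doubled) is an assumption, since the specification does not say how embedded quotes are handled.

```python
import csv
import io

# Dialect settings taken from the specification above.
TSV_OPTIONS = dict(
    delimiter="\t",
    quotechar='"',
    escapechar="\\",
    doublequote=False,  # assumption: embedded quotes are escaped, not doubled
)


def rows_to_tsv(header, rows):
    """Return TSV text with a header row, '\\n' line ends and minimal quoting."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, lineterminator="\n", quoting=csv.QUOTE_MINIMAL, **TSV_OPTIONS
    )
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def tsv_to_rows(text):
    """Parse TSV text written with the same settings back into lists of strings."""
    return list(csv.reader(io.StringIO(text), **TSV_OPTIONS))
```

To produce a file for upload, write the returned text to disk with open(path, "w", newline="") so that Python does not translate the line ends.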

Example:

This file contains data about product orders. The table shows the first fifteen rows of the file.

orderItemID  orderDate   deliveryDate  itemID  size  color  manufacturerID  price   customerID  salutation  dateOfBirth  state                   creationDate  returnShipment
1            2012-04-01  2012-04-03    186     m     denim  25              69.90   794         Mrs         1965-01-06   Baden-Wuerttemberg      2011-04-25    0
2            2012-04-01  2012-04-03    71      9+    ocher  21              69.95   794         Mrs         1965-01-06   Baden-Wuerttemberg      2011-04-25    1
3            2012-04-01  2012-04-03    71      9+    curry  21              69.95   794         Mrs         1965-01-06   Baden-Wuerttemberg      2011-04-25    1
4            2012-04-02  ?             22      m     green  14              39.90   808         Mrs         1959-11-09   Saxony                  2012-01-04    0
5            2012-04-02  1990-12-31    151     39    black  53              29.90   825         Mrs         1964-07-11   Rhineland-Palatinate    2011-02-16    0
6            2012-04-02  1990-12-31    598     xxl   brown  87              89.90   825         Mrs         1964-07-11   Rhineland-Palatinate    2011-02-16    0
7            2012-04-02  1990-12-31    15      39    black  1               129.90  825         Mrs         1964-07-11   Rhineland-Palatinate    2011-02-16    0
8            2012-04-02  2012-04-03    32      xxl   brown  3               21.90   850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
9            2012-04-02  2012-04-03    32      xxl   red    3               21.90   850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
10           2012-04-02  2012-04-03    57      xxl   green  3               39.90   850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
11           2012-04-02  2012-04-03    2       xxl   mocca  2               39.90   850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
12           2012-04-02  2012-04-03    259     39    black  1               119.90  850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
13           2012-04-02  2012-04-03    603     39    black  55              169.90  850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
14           2012-04-02  2012-04-10    259     39    ocher  1               119.90  850         Mrs         1948-04-08   North Rhine-Westphalia  2011-02-16    1
15           2012-04-02  2012-04-03    165     37    mocca  47              89.90   858         Mrs         ?            Berlin                  2012-03-29    1

 

This file has fourteen columns. The first column is the order item ID. The next seven columns describe the purchase (order and delivery dates, item, size, color, manufacturer and price). The next five columns describe the customer, and the last column specifies whether the product was returned, where 1 means “yes” and 0 means “no”.

The purpose of this tutorial is to predict if a product will be returned or not.


Column Specs:

The cloud interface suggests the following column types. These are only a recommendation; you can choose a different type for any column depending on your needs:

orderItemID ID
orderDate IGNORE
deliveryDate IGNORE
itemID NOMINAL
size NOMINAL
color NOMINAL
manufacturerID NOMINAL
price REAL
customerID NOMINAL
salutation NOMINAL
dateOfBirth IGNORE
state NOMINAL
creationDate IGNORE
returnShipment CLASS
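
The recommended column types can be written down as a simple mapping and checked against the file header before uploading. This is a sketch only: check_spec is a hypothetical local helper, not something the product provides, and it checks just the rules stated in this tutorial (known type names, plus at least one ID column and one CLASS column).

```python
# Type names listed under "File Specification".
ALLOWED_TYPES = {
    "ID", "CLASS", "REAL", "NOMINAL", "MULTI_PLAIN", "MULTI_ENGLISH",
    "MULTI_SPANISH", "MULTI_JAPANESE", "ITEM_SET", "IGNORE", "META",
}

# Recommended types for the returns example, copied from the list above.
RETURNS_SPEC = {
    "orderItemID": "ID",
    "orderDate": "IGNORE",
    "deliveryDate": "IGNORE",
    "itemID": "NOMINAL",
    "size": "NOMINAL",
    "color": "NOMINAL",
    "manufacturerID": "NOMINAL",
    "price": "REAL",
    "customerID": "NOMINAL",
    "salutation": "NOMINAL",
    "dateOfBirth": "IGNORE",
    "state": "NOMINAL",
    "creationDate": "IGNORE",
    "returnShipment": "CLASS",
}


def check_spec(header, spec):
    """Return a list of problems found; an empty list means the spec looks usable."""
    problems = []
    for column in header:
        if column not in spec:
            problems.append(f"no type given for column {column!r}")
    for column, col_type in spec.items():
        if column not in header:
            problems.append(f"spec names unknown column {column!r}")
        if col_type not in ALLOWED_TYPES:
            problems.append(f"unknown type {col_type!r} for column {column!r}")
    for required in ("ID", "CLASS"):
        if required not in spec.values():
            problems.append(f"missing a column of type {required}")
    return problems
```

For example, changing returnShipment to NOMINAL makes the check report that no CLASS column is left.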

Angel Parameters Specification:

These are the parameters needed to create the angel:

Storage Units: The unit size reserved for the angel at creation.
Parallelism: The number of replicas of the angel to create.
Ramiel K: The number of results returned by the nearest neighbor search.
Pivots: The number of primary search points in the engine.
Probability: The minimum accepted probability for a result; any result with a lower probability is discarded.
Accepted Error: The accepted search error, i.e., the tolerated difference between the distance calculated by the engine and the real distance.
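
To give an intuition for how a neighbor count such as Ramiel K and a minimum Probability can interact, here is a generic k-nearest-neighbor vote in Python. This tutorial does not describe how Leliel combines its neighbors internally, so the plain unweighted vote below is an assumption for illustration only, and vote is a hypothetical function name.

```python
from collections import Counter


def vote(neighbors, k, min_probability):
    """Unweighted k-NN vote (illustrative; not Leliel's documented method).

    neighbors: list of (class_label, distance) pairs.
    Returns {class_label: share of the k nearest neighbors}, keeping only
    classes whose share reaches min_probability.
    """
    nearest = sorted(neighbors, key=lambda pair: pair[1])[:k]
    if not nearest:
        return {}
    counts = Counter(label for label, _distance in nearest)
    total = len(nearest)
    return {
        label: count / total
        for label, count in counts.items()
        if count / total >= min_probability
    }
```

With four neighbors of which three belong to class 1, a threshold of 0.5 keeps only class 1, while a threshold of 0.2 keeps both classes. Under this reading, a higher K smooths the vote and a higher Probability discards weakly supported classes.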

  • Create Folder

    • Click on “Create Folder” to create a container for the CSV, TSV or JSON files that our similarity engine will search.

    • Provide the folder name and click on “Create Folder”.

    • Once the folder is created you will return to a folder list view.

  • Upload File(s)

    • In the folder that you created, click on “Upload File” to open the upload dialog.

    • After choosing your files click on “Upload File”.

    • You can see the progress bar while the file is being uploaded.

    • Once the files are uploaded you will return to a folder list view.

  • Create your Angel

    • Go to the “Create Angel” section to choose the angel that works for your project.

    • For this example we are going to create a Leliel. Click “Create” on the Leliel image.

    • The next step is choosing the folder containing the files that you want to use to train the angel. When you choose the folder you can see a preview of the files.

    • To change the type of any column, choose an option from the list of types. Leliel requires one column of type ID and one column of type CLASS.

      Then click on “Next” to continue the creation.

    • The next step is to fill in the Leliel parameters (the default parameters will work fine) and to choose a name for your angel.

      Click on “Create” to start the creation of the angel.

    • Once the creation has started you can see a table with your current angels and the progress of the creation. When the state is running, the angel is ready to answer queries.

  • Query Your Angel

    • There are two ways to query, Execute Query and Batch Query. Both can be accessed from the “Your Angels” screen.

    • Execute Query

      Provide the values for the object that you want to query and then click on “Execute Query”.

    • Alternatively, choose a folder containing the files that you want to use to query the angel.

    • Then click on “Fill Query Fields” to choose a row from the file to fill the query object. In this example we fill the fields with the fifth row shown in the data preview.

    • Now you only need to click on “Execute Query” to obtain your result. In the query results, the first value is the class, the second is a score (higher is better), the third is the confidence of the result, and the last is the probability that this class is the correct one among the returned results (a query can return more than one class).

    • Batch Query

      First choose a folder containing the files that you want to use to query the angel, then click on “Execute Query”. This creates a batch process.

    • Once the batch query has started you can see a table with your current batch files and the progress of the execution. When the state is completed, the batch is ready for download.
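
The four values returned for each class by a query (class, score, confidence and probability, as described under Execute Query) can be modeled as a small record when post-processing results in Python. This is a sketch under assumptions: the ClassResult type and best_result helper are hypothetical, the actual layout of the downloaded output is not specified in this tutorial, and the numbers in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassResult:
    """One candidate class from a query, in the order described above."""
    label: str          # the predicted class, e.g. "1" for returned
    score: float        # higher is better
    confidence: float   # confidence of the result
    probability: float  # probability that this class is correct among the results


def best_result(results):
    """Pick the candidate with the highest score, or None if there are none."""
    return max(results, key=lambda result: result.score, default=None)
```

For instance, given ClassResult("0", 1.2, 0.6, 0.3) and ClassResult("1", 2.8, 0.9, 0.7), best_result selects the second candidate, so the product would be predicted as returned.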

Manfred Calvo