Ramiel is a similarity search engine that provides a type of query not available in traditional databases: given an arbitrary similarity criterion, it searches for the k closest elements. Our technology is the fastest available, and Ramiel provides outstanding performance and ease of use. Our engine is a natural choice for those who need to handle large datasets, the goldmines of our time.
|NOMINAL||String values or what you would consider an ENUM.|
|MULTI_ENGLISH||Free-form text. The text is in English.|
|MULTI_SPANISH||Multiple NOMINAL values separated by spaces. The text is in Spanish.|
|IGNORE||The column shall be ignored by the program.|
|ID||Your internal object ID, useful when returning results.|
|META||You can store arbitrary information in a string. Currently the meta data is a reserved
The file should meet these specifications:
- The file should be in TSV (tab-separated values) format, with ‘\t’ as the separator.
- We expect a header row with the same column names that were specified.
- We read all files in the folder, so all of them should follow this format.
- TSV Quote Character = ‘"’
- TSV Line End = ‘\n’
- TSV Escape Character = ‘\’
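As an illustration, the settings above can be expressed as a dialect for Python's standard `csv` module. This is a minimal sketch, not part of Ramiel itself: the class name `RamielTsv`, the helper functions, and the column names used in the usage note are assumptions chosen for the example.

```python
import csv
import io


class RamielTsv(csv.Dialect):
    """Hypothetical dialect mirroring the TSV settings listed above."""
    delimiter = "\t"          # tab-separated values
    quotechar = '"'           # TSV Quote Character
    escapechar = "\\"         # TSV Escape Character
    doublequote = False       # rely on the escape character, not doubled quotes
    lineterminator = "\n"     # TSV Line End
    quoting = csv.QUOTE_MINIMAL
    skipinitialspace = False


def write_rows(header, rows):
    """Serialize a header row and data rows into a TSV string."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, dialect=RamielTsv)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse a TSV string into a list of dicts keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text, newline=""), dialect=RamielTsv)
    return list(reader)
```

For example, `write_rows(["id", "age"], [["94862", "54"]])` produces a header line followed by one data line, and `read_rows` turns that text back into dictionaries. The header names here are placeholders; use the column names you specified for your own data.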
This file contains information about people, such as age and occupation, among others. The table shows the first fifteen lines of the file.
|94862||54||Bachelors||Sales||Cape Gooseberry Pepino Wax jambu Water Apple Dead Man’s Fingers Indian prune Damson plum|
|80793||75||1st-4th||Machine-op-inspct||Pewa Duku Imbe Yellow Granadilla Voavanga Mamoncillo Barbados Cherry Atemoya Indian almond|
|75027||21||11th||Handlers-cleaners||Black Mulberry Mamey Yellow Mombin Vanilla Pulasan Jaboticaba Huito Maypop Wax jambu|
|55847||16||Masters||Adm-clerical||Agave Canistel Malabar plum Rough Shell Macadamia Ceylon gooseberry Strawberry Pear|
|69035||33||10th||Protective-serv||Naranjilla Genip Durian Noni Mangosteen Pupunha|
|74878||67||Assoc-voc||Machine-op-inspct||Sapodilla Acerola Wood Apple African cherry orange Guava Nance Sea Grape Huito|
|28219||68||Doctorate||Craft-repair||Jelly Plum Bilimbi Madrono CamuCamu Damson plum Nagami Kumquat Kiwifruit Pecan Tangerine|
|41853||58||Doctorate||Priv-house-serv||Key lime Avocado Tangerine Indian gooseberry Summer squash Melon pear Coconut Tangerine Betel Nut|
|40302||89||5th-6th||Adm-clerical||Huito Pepino Guanabana Chenet Youngberry Mango Pupunha|
|64184||77||Doctorate||Transport-moving||Lucuma Oil Palm White Sapote Ilama Biribi|
|63402||84||Bachelors||Armed-Forces||Sweet Granadilla Soursop Canistel Mountain Soursop Winged Bean Bignay Kei apple Grumichama Kepel fruit|
|60921||81||Prof-school||Craft-repair||Jelly Plum Soursop Maypop Carambola Barbadine Purple Mombin|
|60388||69||1st-4th||Transport-moving||Bignay Ice Cream Bean Chayote Chempedak Rose Apple Surinam Cherry Abiu Hairless rambutan Avocado Kepel fruit|
|53179||59||10th||Craft-repair||Kwai Muk Breadnut Pummelo Vanilla Agave Otaheite gooseberry Spanish lime Chupa-Chupa|
|88517||42||Some-college||Exec-managerial||Nagami Kumquat Indian jujube Watermelon Purple Guava Salak Rough Shell Macadamia Capulin Cherry Yellow Granadilla Grapes|
This file has five columns. The first column is the person’s ID. The second column is the person’s age. The third column specifies the person’s education level. The fourth column is the occupation, and the fifth column lists some fruits that the person likes.
The purpose of this tutorial is to find people with similar interests and characteristics.
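To make the kind of query concrete, here is a brute-force sketch of a k-nearest search over records shaped like the rows above. It is not how Ramiel works internally. The similarity function is a hypothetical one chosen for illustration: it averages the word overlap of the fruit lists with exact matches on education and occupation. Age is left out to keep the sketch short, and the field names are assumptions.

```python
import heapq


def tokens(text):
    """Split a space-separated field into a set of lowercase words."""
    return set(text.lower().split())


def similarity(a, b):
    """Hypothetical score in [0, 1] between two person records.

    Averages three parts: Jaccard overlap of the fruit words,
    an exact match on education, and an exact match on occupation.
    """
    fruits_a, fruits_b = tokens(a["fruits"]), tokens(b["fruits"])
    union = fruits_a | fruits_b
    jaccard = len(fruits_a & fruits_b) / len(union) if union else 0.0
    same_education = 1.0 if a["education"] == b["education"] else 0.0
    same_occupation = 1.0 if a["occupation"] == b["occupation"] else 0.0
    return (jaccard + same_education + same_occupation) / 3.0


def k_nearest(query, people, k):
    """Return the k records most similar to query, excluding query itself."""
    others = (p for p in people if p["id"] != query["id"])
    return heapq.nlargest(k, others, key=lambda p: similarity(query, p))
```

A linear scan like this compares the query against every record, which is exactly the cost a dedicated similarity search engine is meant to avoid on large datasets.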
Our cloud suggests the following columns and types, but this is only a recommendation; you can choose a different type for each column depending on your interests:
Angel Parameters Specification:
These are the parameters needed to create an angel:
|Storage Units||Specify the angel unit size reserved for creation.|
|Parallelism||Specify the number of replications for the angel that you want to create.|
|Ramiel K||Specify the number of results for the nearest neighbor search.|
|Pivots||The number of primary search points in the engine.|
|Probability||Minimum accepted probability for the results; any result with a lower probability is discarded.|
|Accepted Error||Accepted search error between the distance calculated by the engine and the real distance.|
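The effect of the Ramiel K and Probability parameters on a result list can be sketched as follows. This only illustrates the semantics described in the table; how the engine applies them internally is not documented here. The `Result` type and the `select_results` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical search result: an object ID with its distance and probability."""
    object_id: str
    distance: float
    probability: float


def select_results(results, k, min_probability):
    """Drop results below min_probability, then keep the k closest ones."""
    kept = [r for r in results if r.probability >= min_probability]
    kept.sort(key=lambda r: r.distance)  # smaller distance means closer
    return kept[:k]
```

With `k=2` and `min_probability=0.8`, a candidate with probability 0.5 is discarded even if it is the closest one, and at most two of the remaining candidates are returned.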