Part Classification with Machine Learning

Predicting Part Classification
Categorizing Parts into New Categories
Listing Categories
Resetting Categories
User Training Data
Volume Selection by Category

In a complex assembly containing tens or hundreds of volumes, it is sometimes useful to classify or identify volumes according to a predefined category. Cubit uses machine learning methods to categorize volumes into one of a predefined set of categories. Custom categories, defined by the user, can also be set up and used for classification. The classify command is the primary means of identifying and controlling part classification from the Cubit command line. In addition, the machine learning tools in the geometry power tool in Cubit's graphical user interface provide an extensive set of tools for classifying parts and applying geometric solutions.


 classify {volume <ids>} [confidence] [features [importance]]

 classify {volume <ids>} "<string>" [export_acis]

 classify train

 classify list

 classify reset ["<string>"]

 classify user path ["<string>"]

 reclassify {volume <ids>} "<string>"

Predefined Fixed Part Classification Categories

Cubit currently provides the following set of predefined categories for classification:

  1. bolt
  2. spring
  3. washer
  4. nut
  5. insert
  6. pin
  7. ball
  8. race
  9. gear
  10. thin
  11. other


Part classification uses machine learning methods to identify parts based on characteristic geometric features of the volume. These methods utilize a fixed set of training data which has been provided with your Cubit installation. This training data may be augmented by the user to refine the existing categories or add new categories. When using any of the above classify commands for the first time, a short pause may occur while the fixed training data from the Cubit installation and the user training data are loaded into the program.

Predicting Part Classification

The command classify {volume <ids>} is used to identify a volume based upon the set of categories from both fixed and user training data. When executed, various geometric features of the specified volume(s) are computed and a prediction is made as to the most likely category based upon the existing training data. Depending upon the number and complexity of the volumes to classify, this command may take a few seconds to complete. The result is a string naming the predicted category, printed to the output window, such as:

 Volume 1 (Solid) is "spring"
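When classification is driven from a script, the printed result line can be parsed to recover the volume ID and the predicted category. The following is a minimal Python sketch, assuming the output line has exactly the form shown above. The function name parse_classification and the regular expression are illustrative and are not part of Cubit.

```python
import re

# Pattern for a result line such as: Volume 1 (Solid) is "spring"
# Assumptions: the parenthesized text may vary, and the category name
# is always enclosed in double quotes.
_RESULT_LINE = re.compile(r'^\s*Volume\s+(\d+)\s+\(([^)]*)\)\s+is\s+"([^"]+)"\s*$')


def parse_classification(line):
    """Return (volume_id, category) for a result line, or None if it does not match."""
    match = _RESULT_LINE.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(3)


if __name__ == "__main__":
    print(parse_classification(' Volume 1 (Solid) is "spring"'))
```

Lines that do not match the expected form return None, so other messages in the output window can be skipped safely.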

The optional confidence argument may be used to list the confidence values for each of the existing categories, both fixed and user defined. The confidence is a numerical value from 0 to 1, computed by the machine learning method, indicating the confidence level of the classification for each category. The category with the highest confidence value is selected as the predicted category.
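The selection rule described above amounts to taking the category with the largest confidence value. A minimal Python sketch of that rule follows; the confidence numbers are invented for illustration and the function name predicted_category is hypothetical, not a Cubit API.

```python
def predicted_category(confidences):
    """Return the category name with the highest confidence value."""
    if not confidences:
        raise ValueError("no confidence values given")
    return max(confidences, key=confidences.get)


# Hypothetical confidence values for a single volume (made up for illustration).
example = {"bolt": 0.05, "spring": 0.85, "washer": 0.02, "other": 0.08}
best = predicted_category(example)
```

A low maximum confidence, or two categories with nearly equal values, may suggest that the volume is a candidate for additional user training data.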

The features and importance arguments are primarily used for diagnostics, to examine the computed geometric features and their relative significance to the prediction. The features are a list of about 50 scalar values computed for each volume and used by the machine learning methods. The importance is the relative weight given to each feature when making the prediction.


Categorizing Parts into New Categories

If the existing set of fixed categories is insufficient, users can create new categories to augment the fixed categories. In addition, if a predicted category for a given volume is incorrect or insufficient, training data for the fixed categories may also be added. To add training data to a category, use the classify {volume <ids>} "<string>" command, where <string> is either an existing category name or a new category. Use quotation marks to distinguish the category name. For example:

 classify volume 1 "spring"

When executed, the features of the given volume will be computed and written to disk as a new set of training data. User training data defined in this way is written to an application directory which is persistent so it can be used in subsequent runs of Cubit.

In most cases, the more examples of a category that are provided, the more accurate subsequent predictions will be. For example, if a single volume of a new category "widget" is added, all volumes that are identical to the initial example will most likely be predicted as "widget"; however, small variations of the volume may not be. For this reason, providing as many varying examples of a "widget" as possible is advantageous when setting up a new category. Providing identical examples of the same volume to the classify command is not detrimental, as duplicates are automatically identified and filtered out. Note that when categorizing a new volume, the output window will indicate whether new training data was added or whether the volume represents duplicate data.
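The duplicate filtering described above can be pictured with a small Python sketch. This is a conceptual illustration only: how Cubit actually detects duplicates is not documented here, and the comparison of rounded feature values, the rounding tolerance, and all names below are assumptions.

```python
def add_training_example(training_set, features, decimals=6):
    """Add a feature vector unless an equal one (after rounding) is already stored.

    Returns True if new training data was added, False if it was a duplicate.
    """
    key = tuple(round(value, decimals) for value in features)
    if key in training_set:
        return False
    training_set.add(key)
    return True


# Hypothetical feature vectors for a user category "widget" (values are made up).
widget_data = set()
first = add_training_example(widget_data, [1.0, 0.25, 3.5])
repeat = add_training_example(widget_data, [1.0, 0.25, 3.5])   # identical volume
variant = add_training_example(widget_data, [1.1, 0.25, 3.4])  # slightly different volume
```

Only the first and the varied example enlarge the training set, which mirrors why varied examples improve a category while identical ones neither help nor harm.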

The export_acis option can be used to automatically write an ACIS .sat file of the specified volume(s). If used, the file(s) will be written to a folder in the user training directory titled with the name of the category. Each ACIS file will contain a single volume and will be named using the category name and a unique incremented integer ID. While primarily used for debugging, these files can also serve as a visual representation of the current user categories.


Whenever new training data is added, the classification models must be retrained. Although retraining happens automatically any time training data is added or deleted using Cubit commands, the classify train command is useful for forcing a retrain should additional training data be added manually to the training directories on disk. In addition to rerunning the training process, it will print to the output window the number of supporting volumes used for each category when computing classification predictions.


Listing Categories

All existing classification categories may be printed to the output window using the classify list command. For user defined categories or fixed categories that have been augmented, a (U) will follow the name of the category.

In addition to the category names, the classify list command will also print the path to both the fixed training data and the user training data on disk.

Resetting Categories

To remove an existing user defined category, use the command classify reset ["<string>"], where <string> is a category name in quotations. This will remove all user training data for the given category from disk and retrain the classification models without the category. If a fixed category is specified, the user-defined training data in that category will be removed, keeping only the fixed data. If this command is used without a category, all user training data will be removed.
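The effect of the reset command on user and fixed training data can be modeled with a short Python sketch. This is a toy model of the behavior described above, not Cubit's implementation; the dictionaries, the example counts, and the function name reset_user_data are all hypothetical.

```python
def reset_user_data(user_data, category=None):
    """Remove user training data for one category, or for all categories if None.

    Fixed training data is kept separately and is never touched by a reset.
    """
    if category is None:
        user_data.clear()
    else:
        user_data.pop(category, None)
    return user_data


# Hypothetical counts of training volumes per category (made up for illustration).
fixed_data = {"spring": 40, "bolt": 120}
user_data = {"spring": 2, "widget": 3}

reset_user_data(user_data, "spring")   # like: classify reset "spring"
after_one = dict(user_data)            # the user category "widget" remains
reset_user_data(user_data)             # like: classify reset
after_all = dict(user_data)            # no user training data remains
```

In this model the fixed category "spring" still exists after the reset because its fixed data is untouched, whereas a purely user defined category such as "widget" disappears once its user data is removed.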


User Training Data

When a new category is created or training data is added to a fixed category, the resulting training data is written to a default application directory that is specific to the platform. For example, for Mac and Linux OS the directory will be located at:

 /Users/<user_name>/Library/Application Support/Cubit/ml

To display both the current user training data directory and the fixed training data directory, use the classify list command. While the fixed training data directory cannot be changed, it may be worthwhile to change the user training directory. To do so, use the command classify user path "<path string>", where <path string> is the full path, in quotes, to a writable directory on disk. Changing the user training directory may be useful to temporarily use training data from another source or to ignore all user training data without removing it.

Using the command classify user path without a path specification will set the user training data directory back to its default for the platform.


At times, it may be necessary to remove a volume from its current category classification, or to move the volume from one category to another. Use the reclassify {volume <ids>} "<string>" command to update the training data for one or more volumes, where <string> is the name of a category. If the training data for the given volume is currently present in another category, it will be removed from its current category and added to the newly specified category. If no category is specified, it will be removed from its current category without being added to another category. Fixed training data cannot be removed or modified.


Volume Selection by Category

It may be useful to identify all volumes based on their predicted classification category. To do so, use the syntax with category "<string>" in conjunction with other Cubit commands that accept IDs. A few example uses of this syntax include:

 group "mysprings" add volume with category "spring"

 draw volume with category "bolt"

 delete volume with category "insert"

In the first example, a group named mysprings is created and all volumes with the predicted category of spring are added to it. The second example draws all volumes that are classified as bolt, and the third example deletes all volumes that are classified as insert.

Note that in order to identify volumes based on their predicted category, features for all volumes in the model will be computed. For assemblies with hundreds or thousands of parts, this may be time-consuming. Also note that a similar capability is available in the geometry power tool machine learning tools, which list the predicted category for each volume in the model in separate drop-down lists and build Cubit groups from each category if requested.
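When one group per category is wanted, the group command shown above can be generated for a list of category names. The following Python sketch only builds the command strings; the function name group_commands and the group naming scheme are assumptions made for illustration. The resulting lines could be written to a journal file or passed to Cubit by whatever scripting mechanism is in use.

```python
def group_commands(categories, prefix="my_"):
    """Build one 'group ... add volume with category ...' command per category name."""
    commands = []
    for name in categories:
        # The group name is a hypothetical convention: prefix + category name.
        commands.append(f'group "{prefix}{name}" add volume with category "{name}"')
    return commands


if __name__ == "__main__":
    for command in group_commands(["bolt", "spring", "washer"]):
        print(command)
```

Because every use of the with category syntax triggers classification of the volumes in the model, generating many such commands for a large assembly may be slow.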