Milestone: Feb 3, 2025 – Apr 30, 2025 (expired)
Development of Data Import and Workflow Training UI
Plan, implement, and validate a UI in tikiwiki for data import and workflow training, using the confidence attribution workflows as the first case study to validate the implementation.
Use Case 1: Data Import and Analysis for Model Training
Actors:
- User (Data Analyst, Data Scientist, Software Engineer)
- Data Import and Processing System
Objective:
Enable users to import data for analysis through workflows, select relevant variables, and train reliable predictors.
Preconditions:
- The system must expose an API that supports PUT requests with a flag indicating it is an import.
- The import process must be asynchronous and processable in the background.
- Imported data must contain appropriate timestamps for identifying past events.
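The preconditions above can be illustrated with a minimal sketch of what an import request body might look like. The field names (`import`, `async`, `series`, `data`) and the payload shape are assumptions for illustration; only the requirements themselves (an import flag and per-event timestamps) come from this use case.

```python
import json
from datetime import datetime

def build_import_payload(series_name, points):
    """Assemble a hypothetical PUT body for an asynchronous import.

    `points` is a list of (ISO-8601 timestamp, value) pairs. The field
    names here are illustrative assumptions, not the real API.
    """
    for ts, _value in points:
        # Every data point must carry a timestamp so past events can be
        # identified once the background import completes.
        datetime.fromisoformat(ts)  # raises ValueError if malformed
    return json.dumps({
        "import": True,   # flag marking this PUT as an import
        "async": True,    # processed in the background
        "series": series_name,
        "data": [{"timestamp": ts, "value": v} for ts, v in points],
    })

payload = build_import_payload(
    "sensor_temperature",
    [("2025-02-03T00:00:00+00:00", 21.5),
     ("2025-02-04T00:00:00+00:00", 22.1)],
)
```

Validating timestamps up front keeps a malformed series from failing deep inside the asynchronous pipeline (alternative flow 4a).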
Main Flow:
1. The user initiates a data import process via the API.
2. The system processes the import asynchronously and stores the data.
3. The system calculates correlations between the imported series and other existing series in the domain using Pearson's coefficient (or another method).
4. Upon completion, the system notifies the user via email, providing a link to continue workflow configuration.
5. The user accesses the interface and views a form with select boxes to choose series for analysis.
6. The system displays the series list sorted by degree of correlation.
7. The user can mark series as synonyms and exclude them from the training set.
8. The system enforces restrictions on series selection for prediction: the number of selected series must fall within a range (e.g., min 2, max 5), and each series must have a correlation between a minimum x and a maximum y.
9. The user selects the series, network layers, and neurons, then starts predictor training.
10. The system samples the imported series, allowing selection of a specific period or random sampling.
11. The model is trained and validated.
12. The result (model error) is presented to the user.
13. The user decides whether to continue the process, applying confidence to imported and new data.
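Steps 3, 6, and 7a (correlate, rank, and drop weakly correlated series) can be sketched as follows. The threshold value 0.5 is an assumed example for "x", and the function names are illustrative.

```python
from math import sqrt

def pearson(xs, ys):
    # Pearson's correlation coefficient between two equal-length series.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_by_correlation(target, candidates, min_corr=0.5):
    """Rank candidate series by |correlation| with the imported series,
    silently dropping those below the threshold (alternative flow 7a).
    min_corr=0.5 is an assumed example value for "x"."""
    scored = {name: pearson(target, series) for name, series in candidates.items()}
    kept = {name: c for name, c in scored.items() if abs(c) >= min_corr}
    return sorted(kept.items(), key=lambda item: abs(item[1]), reverse=True)

target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "a": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated
    "b": [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
    "c": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly correlated, dropped
}
ranked = rank_by_correlation(target, candidates)
```

Ranking by absolute value keeps strongly anti-correlated series available as predictor inputs, which a plain descending sort on the raw coefficient would bury at the bottom of the list.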
Alternative Flows:
- (4a) If the import fails, the system informs the user of the error.
- (6a) If a model for this data type already exists, the user is notified of its generation time and error.
- (7a) If a series lacks sufficient correlation, the system disregards it automatically.
- (13a) If the user does not continue, the model is deleted. The user can restart from step 5.
- (13b) If the user continues, the model is saved, and an input workflow is created to apply confidence when new data of the same type (domain, unit, dev) is inserted. The user can return to step 5 to create models for other data types or replace the existing one.
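The branch at step 13 (alternative flows 13a/13b) can be sketched as a small handler. The container names (`model_store`, `workflows`) and the dict shapes are illustrative assumptions, not the real API.

```python
def finalize_training(model_store, workflows, model, data_type, user_continues):
    """Hypothetical step-13 handler.

    - If the user continues (13b), the model is saved and an input
      workflow is registered so confidence is applied when new data of
      the same type (domain, unit, dev) is inserted.
    - If not (13a), the trained model is deleted; the user may restart
      the series selection from step 5.
    """
    if user_continues:
        model_store[data_type] = model
        workflows[data_type] = {"apply_confidence": True, "model": model["id"]}
        return "saved"
    model_store.pop(data_type, None)  # discard the trained model
    return "deleted"

store, flows = {}, {}
key = ("domainA", "celsius", "dev42")  # illustrative (domain, unit, dev) key
status = finalize_training(store, flows, {"id": "m1", "error": 0.07},
                           key, user_continues=True)
```

Keying both the saved model and the input workflow on the same (domain, unit, dev) tuple is one way to guarantee that new data of that type is routed to the matching confidence model.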
Postconditions:
- The predictive model is trained and ready for use.
- Imported data has been processed and validated.
- Data confidence has been updated.
- The system can apply confidence to future data based on the trained model.
Use Case 2: Data Analysis for Model Training
Actors:
- User (Data Analyst, Data Scientist, Software Engineer)
- Data Import and Processing System
Objective:
Enable users to select data series on the platform for analysis through workflows, choose relevant series, and train predictors.
Preconditions:
- The system must expose an API that supports SEARCH requests with a flag indicating model training.
- The process must be asynchronous and processable in the background.
Main Flow:
1. The user initiates a model training process via the API, specifying the list of series to analyze.
2. The system processes the request asynchronously.
3. The system calculates correlations between the data series using Pearson's coefficient (or another method).
4. Upon completion, the system notifies the user via email, providing a link to continue workflow configuration.
5. The user accesses the interface and views a form with select boxes to choose the analysis series.
6. The system displays the series list sorted by degree of correlation.
7. The user can mark series as synonyms or exclude them from the training set.
8. The system enforces restrictions on series selection for prediction: the number of selected series must fall within a range (e.g., min 2, max 5), and each series must have a correlation between a minimum x and a maximum y.
9. The user selects the series, network layers, and neurons, then starts predictor training.
10. The system samples the selected series, allowing selection of a specific period or random sampling.
11. The model is trained and validated.
12. The result (model error) is presented to the user.
13. The user decides whether to continue, applying confidence to existing and new data on the platform.
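The sampling step (step 10 in both use cases) can be sketched as follows: either a specific time window is kept, or a fixed-size random sample is drawn. The function signature and the seeded generator are illustrative assumptions.

```python
import random

def sample_series(points, period=None, k=None, seed=0):
    """Step-10 sketch: select training samples from a (timestamp, value)
    list, either by a specific period or by random sampling.
    The signature and the fixed seed are assumptions for illustration."""
    if period is not None:
        start, end = period
        # Specific-period selection: keep points whose timestamp
        # falls inside the requested window (inclusive).
        return [(t, v) for t, v in points if start <= t <= end]
    # Random sampling: draw k points; seeding makes the draw
    # reproducible across training and validation runs.
    rng = random.Random(seed)
    return rng.sample(points, k)

points = [(t, float(t) * 1.5) for t in range(10)]
in_window = sample_series(points, period=(2, 5))
randomly = sample_series(points, k=3)
```

Seeding the random draw is a design choice: it lets a failed training run be reproduced exactly when debugging the model error reported in step 12.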
Alternative Flows:
- (5a) If a confidence model for any data already exists, the interface informs the user about the model, its input series, and error rate.
- (7a) If a series lacks sufficient correlation, the system disregards it automatically.
- (13a) If a confidence model already exists, the interface informs the user of its input series and error rate, allowing the user to choose the new model if its error is lower.
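Alternative flow 13a (replace the existing confidence model only when the new one is better) reduces to a small comparison. The dict shape (`id`, `error`, `inputs`) is an assumption for illustration.

```python
def choose_model(existing, candidate):
    """13a sketch: keep the existing confidence model unless the newly
    trained candidate has a strictly lower validation error. The dict
    shape is an illustrative assumption."""
    if existing is None or candidate["error"] < existing["error"]:
        return candidate
    return existing

old = {"id": "m1", "error": 0.12, "inputs": ["a", "b"]}
new = {"id": "m2", "error": 0.08, "inputs": ["a", "c"]}
chosen = choose_model(old, new)
```

Requiring a strictly lower error means a tie keeps the model already in production, avoiding churn in the input workflows for no accuracy gain.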
Postconditions:
- The predictive model is trained and ready for use.
- Data confidence has been updated.
- The system can apply confidence to future data based on the trained model.