Requirement | Description |
Dataset loading | The system shall be able to load a CSV dataset containing email texts and their labels. |
Preprocessing text data | The system shall convert email text into numerical features using NLP techniques. |
Training of classification models | The system shall allow training of multiple machine learning models. |
Evaluation of models | The system shall generate performance metrics after training. |
Classification of new emails | The system shall predict the category of a new, unseen email provided by the user. |
Model selection by user | The system shall allow the user to choose which classifier to run for inference. |
User interface | The system shall expose a simple interface (CLI or API) where users can train models, test predictions, and select classifiers. |
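
The requirements in the table above can be illustrated with a minimal sketch of the load, preprocess, train, evaluate, select, and predict flow. It uses only the Python standard library so that it runs as-is; an actual implementation might rely on a machine learning library instead. The column names `text` and `label`, the sample emails, the `MODELS` registry, and the two classifiers (a small Naive Bayes and a majority-class baseline) are assumptions made for illustration, not part of the specification.

```python
import csv
import io
import math
import re
from collections import Counter, defaultdict

# Hypothetical sample data; the column names "text" and "label" are assumptions.
SAMPLE_CSV = """text,label
"Win a free prize now, click here",spam
"Cheap loans, limited offer, click now",spam
"Free prize waiting, claim your offer now",spam
"Meeting moved to Monday at 10am",ham
"Please review the attached project report",ham
"""

def load_dataset(csv_file):
    """Dataset loading: read (text, label) pairs from a CSV file object."""
    rows = [(row["text"], row["label"]) for row in csv.DictReader(csv_file)]
    texts, labels = zip(*rows)
    return list(texts), list(labels)

def tokenize(text):
    """Preprocessing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

class MajorityClass:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, texts, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, text):
        return self.label

# Model selection: a registry the user can choose from by name.
MODELS = {"naive_bayes": NaiveBayes, "majority": MajorityClass}

def accuracy(model, texts, labels):
    """Evaluation: fraction of emails whose predicted label is correct."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)

texts, labels = load_dataset(io.StringIO(SAMPLE_CSV))
trained = {name: cls().fit(texts, labels) for name, cls in MODELS.items()}
for name, model in trained.items():
    # Scored on the training data only to keep the sketch short;
    # a real evaluation should use a held-out test split.
    print(name, accuracy(model, texts, labels))

chosen = trained["naive_bayes"]  # the classifier selected by the user
print(chosen.predict("Click here for a free offer"))
```

Keeping every classifier behind the same `fit`/`predict` pair means a CLI or API layer only needs the registry key to train, evaluate, or run any model.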
Requirement | Description |
Language | The application interface shall be in English. |
Performance | The system shall classify a single email in less than 1 second on a standard laptop. |
Usability | The interface shall be intuitive and easy to navigate. |
Portability | The application shall run on any system with Python 3.9+ and the required dependencies installed. |
Maintainability | The code shall follow an object-oriented and modular design to allow easy extension. |
Reliability | The system shall provide consistent classification results for the same input and model. |
Transparency | The project shall include clear documentation to ensure reproducibility. |
Scalability | The design shall allow integration with external APIs or inbox parsers in the future without major refactoring. |
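
Some of the requirements above (Performance, Portability, Reliability) lend themselves to automated checks. The following is a minimal sketch of such checks, assuming a classifier is available as a plain function from email text to label. The `keyword_classifier` stand-in, the helper names, and the number of repeated runs are hypothetical and serve only to make the example runnable.

```python
import sys
import time

MAX_LATENCY_SECONDS = 1.0  # limit taken from the Performance requirement

def keyword_classifier(text):
    """Hypothetical stand-in for a trained model: flags a few spam keywords."""
    spam_words = {"free", "prize", "winner"}
    words = set(text.lower().split())
    return "spam" if words & spam_words else "ham"

def check_python_version(minimum=(3, 9)):
    """Portability: True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum

def measure_latency(predict, email):
    """Performance: wall-clock seconds taken to classify a single email."""
    start = time.perf_counter()
    predict(email)
    return time.perf_counter() - start

def is_consistent(predict, email, runs=5):
    """Reliability: True if repeated calls return the same label."""
    results = {predict(email) for _ in range(runs)}
    return len(results) == 1

email = "You are a winner claim your free prize"
print("Python 3.9+:", check_python_version())
print("Under 1 s:", measure_latency(keyword_classifier, email) < MAX_LATENCY_SECONDS)
print("Consistent:", is_consistent(keyword_classifier, email))
```

Checks like these could run as part of a test suite so that swapping in a new classifier does not silently break the latency or consistency requirements.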