diff --git a/expo/data/dataset.py b/expo/data/dataset.py
index 9748cb8c2..28bd26d2e 100644
--- a/expo/data/dataset.py
+++ b/expo/data/dataset.py
@@ -40,6 +40,7 @@ DI_INSTRUCTION = """
 1. Save the prediction results of BOTH the dev set and test set in `dev_predictions.csv` and `test_predictions.csv` respectively in the output directory.
 - Both files should contain a single column named `target` with the predicted values.
 2. Make sure the prediction results are in the same format as the target column in the training set.
+- For instance, if the target column is categorical, the prediction results should be categorical as well.
 
 ## Output Performance
 Print the train and dev set performance in the last step.
diff --git a/metagpt/prompts/task_type.py b/metagpt/prompts/task_type.py
index 6b230fc9e..599d437c5 100644
--- a/metagpt/prompts/task_type.py
+++ b/metagpt/prompts/task_type.py
@@ -26,7 +26,7 @@ The current task is about feature engineering. when performing it, please adhere
 - Avoid creating redundant or excessively numerous features in one step.
 - Exclude ID columns from feature generation and remove them.
 - Each feature engineering operation performed on the train set must also applies to the dev/test separately at the same time.
-- **ATTENTION** Do NOT use the label column to create features or make any changes to the label column, except for cat encoding.
+- **ATTENTION** Do NOT use the label column to create features, except for cat encoding.
 - Use the data from previous task result if exist, do not mock or reload data yourself.
 - Always copy the DataFrame before processing it and use the copy to process.
 """