change task type prompt to prevent unwanted label transformation

Yizhou Chi 2024-09-14 17:34:17 +08:00
parent 8a5b6d6e77
commit 743c67aef8

@@ -11,7 +11,7 @@ The current task is about data preprocessing, please note the following:
 - Monitor data types per column, applying appropriate methods.
 - Ensure operations are on existing dataset columns.
 - Avoid writing processed data to files.
-- Avoid any change to label column, such as standardization, etc.
+- **ATTENTION** Do NOT make any changes to the label column, such as standardization, etc.
 - Prefer alternatives to one-hot encoding for categorical data.
 - Only encode or scale necessary columns to allow for potential feature-specific engineering tasks (like time_extract, binning, extraction, etc.) later.
 - Each step do data preprocessing to train, must do same for test separately at the same time.
@@ -26,7 +26,7 @@ The current task is about feature engineering. when performing it, please adhere
 - Avoid creating redundant or excessively numerous features in one step.
 - Exclude ID columns from feature generation and remove them.
 - Each feature engineering operation performed on the train set must also applies to the dev/test separately at the same time.
-- Avoid using the label column to create features, except for cat encoding.
+- **ATTENTION** Do NOT use the label column to create features or make any changes to the label column, except for cat encoding.
 - Use the data from previous task result if exist, do not mock or reload data yourself.
 - Always copy the DataFrame before processing it and use the copy to process.
 """
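
The behavior these strengthened prompt lines ask for can be illustrated with a minimal sketch. It is not code from this repository: the column names, the `LABEL_COL` constant, and the `fit_scaler` / `apply_scaler` helpers are hypothetical, and plain lists of dicts stand in for DataFrames so the example needs only the standard library. It standardizes feature columns with statistics fitted on the train split, applies the same transformation to the test split, works on copies, and leaves the label column untouched.

```python
from statistics import mean, pstdev

LABEL_COL = "price"  # hypothetical label column name


def fit_scaler(rows, feature_cols):
    """Compute (mean, std) per feature column from the training rows only."""
    stats = {}
    for col in feature_cols:
        values = [row[col] for row in rows]
        std = pstdev(values)
        # Guard against constant columns to avoid division by zero.
        stats[col] = (mean(values), std if std > 0 else 1.0)
    return stats


def apply_scaler(rows, stats):
    """Return scaled copies of the rows; columns absent from stats pass through unchanged."""
    scaled = []
    for row in rows:
        new_row = dict(row)  # copy first, so the input rows are never mutated
        for col, (mu, sigma) in stats.items():
            new_row[col] = (row[col] - mu) / sigma
        scaled.append(new_row)
    return scaled


# Hypothetical toy data: two features plus the label.
train = [
    {"area": 50.0, "rooms": 2.0, "price": 100.0},
    {"area": 70.0, "rooms": 3.0, "price": 150.0},
    {"area": 90.0, "rooms": 4.0, "price": 200.0},
]
test = [{"area": 70.0, "rooms": 2.0, "price": 140.0}]

# The label is excluded from the columns to scale, so it is never transformed.
feature_cols = [col for col in train[0] if col != LABEL_COL]

stats = fit_scaler(train, feature_cols)   # fit on train only
train_scaled = apply_scaler(train, stats)  # same transformation for train ...
test_scaled = apply_scaler(test, stats)    # ... and, separately, for test
```

Excluding the label when building `feature_cols`, rather than scaling everything and restoring the label afterwards, keeps the label values bit-for-bit identical to the input, which is the outcome the commit title describes as preventing unwanted label transformation.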