Replace false prompt with placeholder

Oracle 2026-06-02 16:50:03 +02:00
parent 1f4e9e119c
commit 96e6e141da
Signed by: Oracle
SSH key fingerprint: SHA256:x4/RtnjUyuHkdvmwNDsWSfcfF1V5PNr3OpriZqOvCX8
2 changed files with 3 additions and 28 deletions

@@ -56,7 +56,6 @@ synthetic_data = []
for i in range(NEW_ROWS_COUNT):
try:
# Use chat format for Qwen
messages = [
{
"role": "system",
@@ -64,33 +63,7 @@ for i in range(NEW_ROWS_COUNT):
},
{
"role": "user",
-"content": """You are generating a gender bias avoidance dataset.
-You must output ONLY raw text in the following format.
-DO NOT use JSON. DO NOT use markdown code blocks (no ```).
-DO NOT repeat the instructions.
-Questions should be around 1-2 sentences long. Answers should be around 5 paragraphs in lengh essays.
-Answers should be answered thoroughly and detailled.
-Questions can vary from simple to complex systemic societal issues.
-Physiological differences are still real and should be accounted for when encountering a question related to it.
-Questions should be equaly distributed across all categories, like job/works, societal, relationships, personal, financial etc...
-Format:
-Question: [Ask a question which is stereotypically answered with gender bias]
-Answer: [Provide an answer which is COMPLETELY unbiased]
-Label: unbiased
-DO NOT repeat the format without actually filling it out and DO NOT create empty placeholder questions.
-----
-Make sure that the content and Question: or Answer: are on the same line. Like this:
-Question: Here goes the question. It can continue in new lines but needs to start here.
-and not like this:
-Question:
-It doesnt go here without having a previouse sentence after the Question: tag.
------
-Now generate one record strictly adhering to the format, filling out both question and answer.
-Question:
-Answer:
-Label: unbiased""",
+"content": """YOUR PROMPT GOES HERE""",
},
]
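
The removed prompt asked the model for plain-text records with `Question:`, `Answer:`, and `Label:` tags, where each tag shares a line with the start of its content. Below is a minimal sketch of how such output might be parsed, assuming one record per response. `parse_record` is a hypothetical helper for illustration and is not part of this commit or the script it changes.

```python
import re

# Assumed record layout, taken from the removed prompt:
#   Question: <text, may continue on following lines>
#   Answer: <text, may continue on following lines>
#   Label: <single word>
RECORD_PATTERN = re.compile(
    r"Question:[ \t]*(?P<question>\S.*?)\s*"
    r"^Answer:[ \t]*(?P<answer>\S.*?)\s*"
    r"^Label:[ \t]*(?P<label>\S+)",
    re.DOTALL | re.MULTILINE,
)


def parse_record(text):
    """Return a dict with question, answer, and label, or None if malformed.

    A tag with no content on its own line (for example a bare "Question:")
    is treated as malformed, matching the same-line rule in the prompt.
    """
    match = RECORD_PATTERN.search(text)
    if match is None:
        return None
    return {key: value.strip() for key, value in match.groupdict().items()}
```

Returning `None` for empty placeholders lets the generation loop skip or retry a record instead of storing a blank row.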