higher precision

This commit is contained in:
Arjun 2026-03-26 16:17:45 +05:30
parent caf0656fe5
commit c0e1584935
2 changed files with 92 additions and 26 deletions

View file

@ -57,38 +57,101 @@ ${renderTagSystemForEmails()}
# Instructions
1. For each email file provided in the message, read its content carefully.
2. Classify the email using the taxonomy above. Think like a startup founder triaging their inbox:
2. Classify the email using the taxonomy above. Think like a **YC startup founder** triaging their inbox your time is your scarcest resource:
- **Relationship**: Who is this from? An investor, customer, team member, vendor, candidate, etc.?
- **Topic**: What is this about? Legal, finance, hiring, fundraising, security, infrastructure, etc.?
- **Email Type**: Is this a warm intro or a followup on an existing conversation?
- **Noise**: Is this a newsletter, cold outreach, promotion, automated notification, digest, receipt, or other low-signal email? If so, label it with the appropriate noise tag this will skip note creation.
- **Action**: Does this need a response (action-required), is it time-sensitive (urgent), or are you waiting on them (waiting)?
3. Be accurate and conservative only apply labels that clearly fit.
4. Use \`workspace-edit\` to prepend YAML frontmatter to the file. The oldString should be the first line of the file (the \`# Subject\` heading), and the newString should be the frontmatter followed by that same first line.
5. Always include \`processed: true\` and \`labeled_at\` with the current ISO timestamp.
6. If the email already has frontmatter (starts with \`---\`), skip it.
- **Filter (Noise)**: Is this email noise? **Apply ALL applicable filter tags.** If even one noise tag is present the email is skipped noise overrides everything. Common noise:
- Cold outreach / unsolicited service pitches / "YC exclusive" deals / freelancers offering free work
- Newsletters, industry reports, webinar invitations, product tips from vendors
- Promotions, marketing, event invitations you did not register for, startup program upsells
- Automated notifications (email verifications, recording uploads, platform policy changes, expired OTPs)
- Transactional confirmations (salary disbursements, tax payments, GST filings, TDS workings, invoice-sharing threads)
- Spam and spam moderation digests
- **Action**: Does this need a response (\`action-required\`), is it time-sensitive (\`urgent\`), or are you waiting on them (\`waiting\`)? Use \`""\` if none apply. **Do NOT use \`fyi\` as an action value** — it is not a valid action tag.
3. **Apply noise tags aggressively.** Noise tags can and should coexist with relationship and topic tags. A salary confirmation from your finance team should have BOTH \`relationship: ['team']\` AND \`filter: ['receipt']\`. The noise tag determines whether a note is created — it overrides relationship and topic signals.
4. Be accurate only apply labels that clearly fit. But when an email IS noise, always add the noise tag even when other tags are present.
5. Use \`workspace-edit\` to prepend YAML frontmatter to the file. The oldString should be the first line of the file (the \`# Subject\` heading), and the newString should be the frontmatter followed by that same first line.
6. Always include \`processed: true\` and \`labeled_at\` with the current ISO timestamp.
7. If the email already has frontmatter (starts with \`---\`), skip it.
# The Founder Signal Test
Before finalizing labels, ask: **"Would a busy YC founder want a note about this in their knowledge system?"**
**YES create a note** if the email:
- Requires a decision or response from the founder
- Updates an active business relationship (customer deal, investor conversation, partner integration)
- Contains information that will be referenced later (pricing, terms, deadlines, compliance requirements)
- Has action items for the team (e.g. standup notes, meeting notes with to-dos)
- Presents a genuine opportunity worth evaluating (accelerator, partnership, relevant hire)
- Flags a risk that needs attention (security vulnerability, legal issue, compliance blocker)
- Is from a vendor you are actively engaged with on an ongoing process (e.g. your compliance assessor following up after a call you participated in)
**NO skip it** if the email:
- Confirms a transaction that already happened with no open decision (payment received, tax filed, salary disbursed, invoice shared)
- Is a system-generated notification with no decision needed (email verification, recording uploaded, policy update, expired OTP)
- Is unsolicited outreach from someone you have never engaged with regardless of how personalized it sounds
- Is a newsletter, industry report, webinar invitation, or product tips email
- Is marketing or promotional content, including from vendors you use
- Is a spam digest or Google Groups moderation report
- Is routine operational correspondence where the transaction is complete and no follow-up remains
# Cold Outreach Detection (Critical for Precision)
Many emails disguise themselves as real relationships. Before assigning \`vendor\`, \`candidate\`, \`partner\`, or \`followup\`, apply these tests:
**It's \`cold-outreach\` (noise), NOT a real relationship, if:**
- The sender is pitching their own product or service to you (design agencies, compliance firms, lead gen tools, dev shops, etc.) even if they reference your company by name or mention a prior call YOU didn't initiate.
- The sender is pitching their own product or service design agencies, compliance firms, content/copy writers, dev shops, freelancers, trademark services, company closure/winding-down services, hiring platforms, etc. even if they reference your company by name, your YC batch, or offer something "free" or "exclusive for YC founders."
- The thread consists entirely of the same sender following up on their own unanswered messages. A real followup requires prior two-way engagement.
- A student, job-seeker, or founder cold-emails asking for your time, feedback, or mentorship without a warm intro or a specific open role they're applying to. These are NOT \`candidate\` — they are \`cold-outreach\`.
- A student, job-seeker, freelancer, or founder cold-emails asking for your time, feedback, or offering free work/trials. These are NOT \`candidate\` — they are \`cold-outreach\`.
- Someone invites you to an event you didn't sign up for, especially if the email has marketing formatting (tracking links, unsubscribe footers, HTML banners). This is \`promotion\`, not \`event\`.
**It IS a real relationship if:**
**It IS a real relationship (not noise) if:**
- You (the inbox owner) are a participant in the thread (you sent a reply, or someone on your team did).
- The sender is from a company you are already paying, or they are providing a service under contract (e.g., your law firm, your accountant, your cloud provider support).
- The sender was introduced to you by someone you know (warm intro present in the thread).
- The sender references a specific ongoing deal, contract, or project with concrete details (not generic "I noticed your company...").
- The sender references a specific ongoing engagement with concrete details e.g., they are your assigned compliance assessor for an audit you initiated, or they are following up after a call you participated in. This is NOT the same as a generic "I noticed your company uses X" pitch.
**Key heuristic:** If every message in the thread is FROM the same external person and the inbox owner never replied, it's almost certainly cold outreach regardless of how personalized it sounds. Label it \`cold-outreach\`.
**Noise array must only contain tags from the Noise category.** Do not put topic or relationship tags (like \`event\`) into the noise array. If an email is an event promotion, use \`promotion\` in noise — not \`event\`.
# Routine Operations & Finance (Often Missed as Noise)
**Spam digests are spam.** If the sender is \`noreply-spamdigest\` (Google Groups spam moderation reports), label it as \`spam\` — Google already flagged these as spam. Do not try to evaluate the held messages inside.
These emails involve real relationships (team, vendor) and real topics (finance) but are **noise** because the transaction is complete and no decision remains. They MUST get a filter tag even though they also have relationship/topic tags:
- **Salary/payroll confirmations**: "Total salary disbursement is INR X, transfer initiated" \`filter: ['receipt']\`
- **Tax payment acknowledgements**: Income tax challan confirmations, TDS workings sent for processing \`filter: ['receipt']\`
- **GST/compliance filing confirmations**: GSTR1 ARN generated, GST OTPs (expired or used) \`filter: ['receipt']\`
- **Recurring invoice sharing**: Monthly cloud/SaaS invoices shared between team and finance dept \`filter: ['receipt']\`
- **Payment transfer confirmations**: "Transfer initiated", "Payment confirmed" \`filter: ['receipt']\`
# Automated Notifications (Often Missed as Noise)
System-generated messages that require no decision:
- **Email verifications**: "Confirm your email address on Slack" \`filter: ['notification']\`
- **Meeting recordings**: "Your meeting recording is ready in Google Drive" \`filter: ['notification']\`
- **Platform policy updates**: "Billing permissions are changing starting next month" \`filter: ['notification']\`
- **Expired OTPs**: One-time passwords for completed actions \`filter: ['notification']\`
# Newsletter & Promotion Detection (Often Missed as Noise)
These are noise even from a vendor you recognize or a platform you use:
- **Industry reports**: "Report: $1.2T in combined enterprise AI value" \`filter: ['newsletter']\`
- **Webinar/workshop invitations**: "Register for our knowledge sessions", "5 Slots Left. Pitch Tomorrow." \`filter: ['promotion']\`
- **Product tips and tutorials**: "Discover more with your free account" \`filter: ['newsletter']\`
- **Startup program marketing**: "Reminder - Register for AI Architecture sessions" \`filter: ['promotion']\`
**Exception:** If a tool your team actively uses is expiring and you need to make an upgrade/cancellation decision, that is NOT noise it requires action.
# Spam Digests Are Always Spam
If the sender is \`noreply-spamdigest\` (Google Groups spam moderation reports), label it \`filter: ['spam']\`. Google already flagged these as spam. Do not evaluate the held messages inside — the digest itself is noise.
# Filter array must only contain tags from the Noise category
Do not put topic or relationship tags into the filter array. If an email is an event promotion, use \`promotion\` in filter — not \`event\`.
# Frontmatter Format
@ -101,7 +164,7 @@ labels:
- fundraising
- finance
type: intro
noise:
filter:
- []
action: action-required
processed: true
@ -113,11 +176,13 @@ labeled_at: "2026-02-28T12:00:00Z"
- Every label category must be present in the frontmatter, even if empty (use \`[]\` for empty arrays).
- \`type\` and \`action\` are single values (strings), not arrays. Use empty string \`""\` if not applicable.
- \`relationship\`, \`topics\`, and \`noise\` are arrays.
- \`relationship\`, \`topics\`, and \`filter\` are arrays.
- The \`action\` field only accepts: \`action-required\`, \`urgent\`, \`waiting\`, or \`""\`. Never use \`fyi\` as an action value.
- Use the exact label values from the taxonomy do not invent new ones.
- The \`labeled_at\` timestamp should be the current time in ISO 8601 format.
- Process all files in the batch. Do not skip any unless they already have frontmatter.
- **Noise labels are skip signals.** If an email is clearly a newsletter, cold outreach, promotion, digest, receipt, notification, or other noise label it as such. These emails will NOT create notes.
- **When in doubt between noise and a real relationship/topic, ask:** "Would a busy startup founder want a note about this in their system?" If no, it's noise.
- **Noise labels are skip signals.** If an email is clearly a newsletter, cold outreach, promotion, digest, receipt, notification, or other noise label it in the \`filter\` array. These emails will NOT create notes.
- **Noise tags coexist with other tags.** An email from your team about salary (\`relationship: ['team']\`, \`topics: ['finance']\`) that is just a payroll confirmation should ALSO have \`filter: ['receipt']\`. The noise tag overrides — it ensures the email is skipped even when relationship/topic tags are present.
- **When in doubt, ask:** "Does this email change any decision, require any follow-up, or update a relationship I need to track?" If no, it's noise add the appropriate filter tag.
`;
}

View file

@ -70,22 +70,23 @@ const DEFAULT_TAG_DEFINITIONS: TagDefinition[] = [
{ tag: 'followup', type: 'email-type', applicability: 'both', noteEffect: 'create', description: 'Following up on a previous two-way conversation (both parties have engaged). A cold sender bumping their own unanswered email is NOT a followup — it is cold-outreach.', example: 'Following up on our call last week. Have you had a chance to review the proposal?' },
// ── Noise — all skip signals in one place ─────────────────────────────
// NOTE: Noise tags override relationship/topic tags. An email can have
// relationship: team AND filter: receipt — the noise tag wins and skips note creation.
{ tag: 'spam', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Junk and unwanted email, including Google Groups spam moderation digests (from noreply-spamdigest)', example: 'Congratulations! You\'ve won $1,000,000...' },
{ tag: 'promotion', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Marketing offers, sales pitches, and product launches', example: '50% off all items this weekend only!' },
{ tag: 'cold-outreach', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Unsolicited contact from someone you don\'t know — includes sales pitches disguised as follow-ups, strangers asking for your time/feedback, and service providers you never engaged with', example: 'Hi, I noticed your company is growing fast. I\'d love to show you how we can help with...' },
{ tag: 'newsletter', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Newsletters, digests, and subscription emails', example: 'This week in AI: The latest developments in agent frameworks...' },
{ tag: 'notification', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Automated alerts and system notifications with no action needed', example: 'Your password was changed successfully. If this wasn\'t you, contact support.' },
{ tag: 'promotion', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Marketing offers, sales pitches, product launches, event invitations you did not register for, startup program upsells, vendor upgrade campaigns, and webinar/workshop invitations from companies', example: 'Register Now! Experts talk live: AI, Marketplace, Architecture & GTM Sessions Coming Up' },
{ tag: 'cold-outreach', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Unsolicited contact from someone you have no prior engagement with — includes design agencies, compliance firms, content/copy writers, dev shops, freelancers offering free work, trademark services, company closure services, hiring platforms, and anyone pitching a service with "exclusive YC deal" or referencing your YC batch. Even if they mention your company by name or offer something free.', example: 'Ramnique, $2000 worth YC Design deal every month — we work with 230+ YC founders' },
{ tag: 'newsletter', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Newsletters, industry reports, subscription emails, product tips/tutorials from vendors, and research digests — even from platforms you actively use', example: 'Report: $1.2T in combined enterprise AI value — but what\'s actually built to last?' },
{ tag: 'notification', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Automated system messages requiring no decision: email verifications, meeting recording uploads, platform policy/permission changes, billing console updates, password resets, and expired OTPs', example: 'Meeting records: your recording has been uploaded to Google Drive.' },
{ tag: 'digest', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Community digests, forum roundups, and aggregated updates', example: 'YC Bookface Weekly: 12 new posts this week...' },
{ tag: 'product-update', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Product changelogs, feature announcements, and vendor marketing', example: 'Introducing our new AI-powered dashboard...' },
{ tag: 'receipt', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Transactional receipts, invoices, and billing confirmations with no follow-up needed', example: 'Payment of $49.99 received. Thank you!' },
{ tag: 'product-update', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Product changelogs, feature announcements, and vendor marketing disguised as tips', example: 'Discover more with your Upstash free account — popular use cases inside' },
{ tag: 'receipt', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Completed transaction confirmations with no decision remaining: payment receipts, salary/payroll disbursements, tax payment acknowledgements (challans), GST/VAT filing confirmations (GSTR1 ARNs), TDS workings, recurring invoice-sharing threads, and transfer-initiated confirmations', example: 'Challan payment under section 200 for TAN BLXXXXXX4B has been successfully paid.' },
{ tag: 'social', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Social media notifications', example: 'John Smith commented on your post.' },
{ tag: 'forums', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Mailing lists and group discussions', example: 'Re: [dev-list] Question about API design' },
{ tag: 'forums', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Mailing lists, group discussions, and Google Groups moderation digests that are not spam digests', example: 'Re: [dev-list] Question about API design' },
{ tag: 'scheduling', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Calendar invites, meeting reminders, and scheduling confirmations', example: 'Reminder: Team standup in 15 minutes.' },
{ tag: 'fyi', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Informational only, no action needed', example: 'Just wanted to let you know the deal closed. Thanks for your help!' },
{ tag: 'travel', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Flights, hotels, trips, and travel logistics', example: 'Your flight to Tokyo on March 15 is confirmed. Confirmation #ABC123.' },
{ tag: 'shopping', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Purchases, orders, and returns', example: 'Your order #12345 has shipped. Track it here.' },
{ tag: 'health', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Medical, wellness, and health-related matters', example: 'Your appointment with Dr. Smith is confirmed for Monday at 2pm.' },
{ tag: 'learning', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Courses, webinars, and education marketing', example: 'Welcome to the Advanced Python course. Here\'s your access link.' },
{ tag: 'learning', type: 'noise', applicability: 'email', noteEffect: 'skip', description: 'Courses, webinars, workshops, knowledge sessions, and education marketing — even from platforms you are enrolled in', example: 'Welcome to the Advanced Python course. Here\'s your access link.' },
// ── Action — urgency signals (all create) ─────────────────────────────
{ tag: 'action-required', type: 'action', applicability: 'both', noteEffect: 'create', description: 'Needs a response or action from you', example: 'Can you send me the pricing by Friday?' },