A Judge, Not Just a Creator: The Language Model in an Additional Role - Part 1

Analysts from Mars, Product Managers from Venus (אנליסטים ממאדים, מנהלי מוצר מנוגה)

The podcast is in Hebrew, as we felt this topic deserves a deeper conversation within the Israeli tech community :-)

This podcast explores the world of Generative AI products and what it really takes to make them reliable in real life.
Reut Amir (Head of Product, Wix Customer Care) and Ariel Yaakobi (Head of Business Analytics, Wix Customer Care) share insights from hands-on experience evaluating GenAI products at scale. The wins, the mistakes, and the methods that actually worked for us.

We explain why evaluation is one of the hardest parts of the GenAI development cycle, why traditional metrics fall short, and how to measure not just whether the AI is correct, but whether it’s helpful.

The podcast is designed to help product managers, analysts, and teams take their first steps into doing GenAI evaluation themselves, with practical guidance, examples, and a real-world perspective. Enjoy :-)



January 30, 2026 • Reut Amir & Ariel Yaakobi • Season 1 • Episode 4

Duration: 27:47

It is already well known and proven that language models excel at generating text. But in the world of Generative AI products, they also serve as judges.

In this episode we dive into the world of LLM as a Judge: the approach in which language models are used to evaluate the quality of other language models. We discuss why automated evaluation is a critical part of Generative AI products, survey different types of evaluations, and, in the main part, present a practical framework.

This episode connects the world of AI with the world of product, and helps turn model quality into something you can base product decisions on, measure over time, and improve systematically.

P.S. This text was written with the help of AI, and its quality was measured by AI. We just mediated 😉