Performance Reviews

You are either writing a performance review, receiving one, or running a calibration session — and in all three cases, the process probably feels broken. The core tension is real: performance reviews attempt to condense months of work into a rating and a paragraph, which means they are structurally prone to bias, recency effects, and misaligned expectations. But they remain the primary mechanism for promotions, compensation, and career development at most tech companies. Here is how to make them work as well as they can.

The Feedback Gap

The most consistent finding across guests who discuss performance management: there is a large gap between what managers think they are communicating and what reports actually hear.

Kim Scott, author of Radical Candor — the single most referenced book on Lenny’s Podcast — describes the core problem. Most organizations default to what she calls “ruinous empathy”: “We do remember to show that we care personally, but we’re so worried about not hurting someone’s feelings or not offending them, that we fail to tell them something they’d be better off knowing in the long run.” The feedback is either too vague or too diplomatic to land.

The gap shows up most painfully during review cycles. A PM receives a rating they did not expect, and their first reaction is: “Why did no one tell me?” Often, someone did tell them — but the signal was buried in language so diplomatic it was inaudible.

| What the Manager Says | What the Manager Means | What the Report Hears |
| --- | --- | --- |
| “You could be more proactive” | “You are not operating at the expected level” | “Minor feedback, I am doing fine” |
| “Something to think about for next half” | “This needs to change for you to get promoted” | “Interesting suggestion, I will consider it” |
| “Your stakeholder management could improve” | “Multiple people have complained about your communication” | “One small area for growth” |

Scott offers a concrete fix for soliciting feedback: “If you say, ‘Do you have any feedback for me?’ You’re wasting your breath. The other person’s going to say, ‘Oh no, everything’s fine.’ The question that I like to ask is, ‘What could I do or stop doing that would make it easier to work with me?’”

If your report is surprised by their review rating, the review system has failed. The review should be a written summary of conversations you have already had, not new information.

The fix is ongoing, specific feedback in 1:1s throughout the half or quarter. By the time the review is written, the rating should be obvious to both parties.

Writing Effective Self-Reviews

Most PMs underinvest in their self-review. They write it in 30 minutes the night before it is due, list a few accomplishments, and submit. This is a missed opportunity. The self-review is your primary tool for shaping how your work is perceived during calibration.

Jules Walter, who went from first growth PM at Slack to product lead at YouTube, models a key mindset for self-reviews: “If you give me feedback, I’ll be like, ‘Hey, thank you so much. This is super helpful,’ because people are like, ‘Oh, he actually likes the feedback.’ Now, inside my heart might be melting. I’m like, ‘Oh, I thought I got better at this.’ But externally, I’m like, ‘Hey, thank you,’ and I mean it.” That openness to feedback should inform how you write your self-review. Here is a specific structure:

The STAR-I Framework for Self-Reviews

For each major accomplishment:

| Component | What to Write | Example |
| --- | --- | --- |
| Situation | The context and why this mattered | “Q3 retention for new users had declined from 42% to 35% over two quarters” |
| Task | Your specific role (not the team’s role — yours) | “I owned the diagnosis and led the cross-functional effort to reverse the trend” |
| Action | What you actually did, specifically | “Ran 30 user interviews, identified onboarding friction as the root cause, designed and shipped a new first-run experience with engineering in 6 weeks” |
| Result | The measurable outcome | “New user D30 retention improved from 35% to 44%, adding an estimated $2.1M in annualized revenue” |
| Impact scope | Who benefited and at what altitude | “This directly contributed to the company’s Q4 retention OKR and became the template for onboarding improvements across all product lines” |
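
If it helps to draft against a checklist, here is a minimal sketch of a STAR-I entry as a structured record; the class and field names are illustrative, not something from the source:

```python
from dataclasses import dataclass, fields

# Illustrative sketch: encode the five STAR-I components as required fields
# so no component gets skipped when drafting a self-review entry.

@dataclass
class StarIEntry:
    situation: str     # the context and why this mattered
    task: str          # your specific role, not the team's
    action: str        # what you actually did, specifically
    result: str        # the measurable outcome
    impact_scope: str  # who benefited and at what altitude

    def render(self) -> str:
        """Format the entry as labeled lines for pasting into a review."""
        return "\n".join(
            f"{f.name.replace('_', ' ').title()}: {getattr(self, f.name)}"
            for f in fields(self)
        )

entry = StarIEntry(
    situation="Q3 new-user retention declined from 42% to 35% over two quarters",
    task="Owned the diagnosis and led the cross-functional effort to reverse it",
    action="Ran 30 user interviews; shipped a new first-run experience in 6 weeks",
    result="D30 retention improved from 35% to 44% (~$2.1M annualized revenue)",
    impact_scope="Contributed to the Q4 retention OKR; became the team template",
)
print(entry.render())
```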

Elizabeth Stone, CTO of Netflix, models the standard: “I hold myself to a very high standard. If someone sends me something, I try to be very responsive about it. If I said I’m going to do something, I follow through on it in the timeline that I said I was going to do.” Apply that same rigor to your self-review. The most common mistake is describing activities instead of outcomes. “Shipped 14 features” tells the reader nothing about your judgment or impact. “Identified and solved the primary driver of new user churn, improving D30 retention by 9 points” tells them everything.

Two other guidelines:

  • Include one genuine area for growth. Not a humble brag (“I work too hard”). An actual skill gap you are aware of and working on. This signals self-awareness, which calibration panels value.
  • Quantify everything you can. Revenue impact, metric movement, number of customers affected, team velocity improvements. If you cannot quantify it, describe the counterfactual: “Without this work, the launch would have been delayed by an estimated 6 weeks.” A worked example of the math follows this list.
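
As that worked example, the back-of-envelope sketch below reproduces the $2.1M estimate from the STAR-I table; the cohort size and per-user revenue are hypothetical assumptions chosen to back into that figure, not numbers from the source:

```python
# Back-of-envelope: convert a retention-point improvement into annualized
# revenue. All inputs are hypothetical; substitute your own data.

new_users_per_month = 20_000   # monthly new-user cohort size (assumption)
d30_lift = 0.44 - 0.35         # 9-point D30 retention improvement
arpu_retained_annual = 97.0    # annualized revenue per retained user (assumption)

extra_retained_per_cohort = new_users_per_month * d30_lift
annualized_revenue = extra_retained_per_cohort * 12 * arpu_retained_annual

print(f"Extra retained users per cohort: {extra_retained_per_cohort:,.0f}")
print(f"Estimated annualized revenue impact: ${annualized_revenue:,.0f}")
# -> 1,800 extra retained users per cohort, ~$2.1M annualized
```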

Running Calibrations

Calibration is the meeting where managers align ratings across their organization. It is where the real decisions happen, and it is where bias is most likely to distort outcomes.

How Calibration Works

Typically, a group of managers (often a director and their PM managers) meets to review every PM’s proposed rating. Each manager presents their reports’ performance, the group discusses, and ratings are adjusted until the distribution feels right.

Common Calibration Biases

| Bias | How It Manifests | Mitigation |
| --- | --- | --- |
| Recency bias | Overweighting the last month of the review period | Require managers to bring evidence from the full period. Monthly notes help. |
| Halo effect | A PM who is well-liked or articulate gets inflated ratings | Separate “how they communicate” from “what they accomplished.” Ask: “What metrics moved?” |
| Squeaky wheel bias | Managers who advocate loudly get better ratings for their reports | Structured format where each manager gets equal time. Written pre-reads. |
| Similarity bias | Managers rate reports who remind them of themselves more highly | Require specific, evidence-based justifications for every rating above or below “meets expectations.” |
| Anchoring | The first rating discussed sets the bar for everyone after | Randomize the order. Do not start with the strongest or weakest performer. |
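
The anchoring and squeaky-wheel mitigations are easy to operationalize if the facilitator builds the agenda in advance. A trivial sketch, with placeholder names and time allotments:

```python
import random

# Sketch of a calibration agenda that mitigates anchoring (randomized order)
# and squeaky-wheel bias (equal time per report). Names are placeholders.

reports = ["PM A", "PM B", "PM C", "PM D", "PM E"]
minutes_per_report = 10

random.shuffle(reports)  # do not start with the strongest or weakest performer

for slot, name in enumerate(reports, start=1):
    print(f"Slot {slot} ({minutes_per_report} min): {name}")
```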

Stone describes how Netflix takes a different approach entirely: “We don’t have a practice that a lot of other companies do where we would think about reflecting on a rating of how things are going. We do have an annual cycle of 360 feedback where you request and receive feedback from a lot of people, but it’s not an input to some rating.” Netflix anchors on high talent density instead — as Stone explains, “We can’t really have any of the other aspects of the culture, including candor, learning, seeking excellence and improvement, freedom and responsibility if you don’t start with high talent density.”

For companies that do run traditional calibrations, a forcing function helps: every rating above or below “meets expectations” should require written documentation of specific evidence. Not “they were great” — “here are the three things they accomplished, with metrics, that demonstrate they exceeded expectations.” This makes it harder to inflate or deflate based on vibes.
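
A hypothetical sketch of that forcing function as it might appear in a calibration tool; the three-example threshold is an illustrative choice, not a rule from the source:

```python
# Hypothetical sketch: reject any rating other than "meets expectations"
# that lacks documented, specific evidence. The 3-example minimum is
# illustrative; the point is that the tool blocks vibes-based ratings.

def validate_rating(rating: str, evidence: list[str]) -> None:
    if rating != "meets expectations" and len(evidence) < 3:
        raise ValueError(
            f"Rating {rating!r} requires at least 3 documented, "
            f"metric-backed examples; got {len(evidence)}."
        )

try:
    validate_rating("exceeds expectations", ["shipped onboarding revamp"])
except ValueError as err:
    print(err)  # the calibration tool would block this rating
```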

Performance Improvement Plans

A PIP (Performance Improvement Plan) is the formal mechanism for addressing sustained underperformance. It is also one of the most misunderstood and poorly executed management tools.

When to Use a PIP

A PIP should be the last step in a documented process, not the first signal that something is wrong.

A sound escalation sequence:

  1. Verbal feedback in 1:1s. Specific, actionable, with a timeline for improvement.
  2. Written feedback. If verbal feedback does not produce change within 4-6 weeks, document the gap in writing. Share it with the report. “Here is where you are, here is where you need to be, here is the timeline.”
  3. Formal PIP. If written feedback and coaching do not produce change, the PIP formalizes the expectations, timeline (typically 30-60 days), and consequences.

How to Write a PIP That Is Actually Useful

A well-written PIP serves two purposes: it gives the employee a genuine chance to improve, and it documents the process if they do not.

| PIP Component | Good Example | Bad Example |
| --- | --- | --- |
| Performance gap | “Your feature launch missed the target metric by 40%, and the post-mortem identified incomplete user research as a root cause” | “You need to be more strategic” |
| Expected behavior | “Complete 10 customer interviews before writing the next PRD. Share synthesis with the team before solutioning.” | “Improve your product sense” |
| Timeline | “30 days, with weekly check-ins every Monday at 10 AM” | “Improve over the next quarter” |
| Success criteria | “Next feature launch hits within 10% of target metric. PRDs include user research summary with direct quotes.” | “Demonstrate improvement” |
| Support offered | “Weekly coaching sessions with your manager. Access to user research team for interview scheduling.” | “Let me know if you need help” |

The reality is that most PIPs end in the person leaving, either voluntarily or involuntarily. That does not mean the PIP is a formality — it should be a genuine attempt. But if someone has received clear feedback for months and has not improved, the PIP is unlikely to change the trajectory. The purpose is to give one final, structured chance and to ensure fairness.

The Emotional Component

PIPs are difficult for everyone involved. Kim Scott places this squarely in the radical candor framework: care personally while challenging directly. As she warns, the worst failure mode is “manipulative insincerity” — when a manager realizes they have been harsh and overcorrects by becoming vague, making the feedback useless. Be direct about the situation, be clear about what needs to change, and be genuine about wanting them to succeed.

Key Takeaways

  • If your report is surprised by their review rating, you have failed as a manager. Reviews should summarize ongoing feedback, not introduce it. Use 1:1s to close the feedback gap throughout the cycle.
  • Write self-reviews using the STAR-I framework: Situation, Task, Action, Result, Impact scope. Quantify outcomes rather than listing activities. “Improved D30 retention by 9 points” beats “shipped 14 features.”
  • Mitigate calibration bias with written pre-reads, randomized discussion order, and evidence requirements for every above- or below-expectations rating.

Related Topics

  • One-on-Ones — Where feedback should be delivered continuously, not saved for reviews
  • OKRs — Keep OKRs separate from performance evaluation to avoid sandbagging
  • Hiring PMs — The review rubric should align with what you screen for in hiring
  • IC vs Management Track — Review criteria differ significantly between tracks

Sources