
A 2018 Google PM Is Worth Less Than a Two-Year AI-Native Builder

22 April 2026 · 13 min read

TL;DR

  • The hiring signal for product leadership has inverted. Two years of AI-native building now outperforms six years of prestige FAANG work on almost every interview dimension worth measuring.
  • The old interview asked "what did you ship?" The new interview asks "what tools do you use, show me your last eval suite, walk me through a prompt you refactored this week."
  • If your screening funnel still treats ex-FAANG as a positive signal by default, you're actively selecting against the skill profile you need.

I've interviewed candidates this year who spent six years at companies most hiring managers would call dream brands. Meta. Google. Amazon. Stripe. Many of them interview poorly for the roles I've needed to fill. Not because they're bad operators. Because the skill profile their former employers developed in them was calibrated for an operating model those companies have already moved on from.

At the same time I've interviewed candidates with two years of experience at a vertical SaaS startup nobody outside their industry has heard of. Several of those candidates could do in a working session what the senior FAANG candidates needed a month and a team to approximate. The difference wasn't intelligence. It was recency.

The pedigree signal has inverted, and most hiring pipelines haven't caught up.

Why the pedigree signal inverted

The pedigree signal inverted because AI tools change faster than multi-year FAANG reps can keep pace with, and the operating model those reps were calibrated for has already been retired by AI-native companies. The brand still reflects real rigour. The rigour was calibrated for a different game.

For twenty years, the FAANG stamp on a resume meant something specific: the candidate survived a high-bar hiring process, operated in systems at scale, and was calibrated by peers who were themselves strong. That was genuine information. It still is.

What changed is what those stamps are stamping. A candidate who spent 2016–2022 at Google running user research, shipping roadmap items through a waterfall spec process, and optimising metrics dashboards was getting excellent reps at a specific operating model. That operating model doesn't exist in AI-native companies. The reps don't transfer cleanly.

Three forces explain the inversion:

  1. Time-scale compression. AI tooling turns over quarterly. A two-year AI-native builder has done roughly eight major tool-cycle rotations. A senior FAANG PM who adopted Cursor six months ago has done one. Recency compounds faster than tenure when the underlying stack changes quarterly.

  2. Operating-model drift. Big-tech PM work from 2016–2022 was calibrated for a world where you wrote a PRD, a team of engineers built it, and you measured the result six weeks later. AI-native work is calibrated for a world where you prototype in the morning, evaluate against real examples by lunch, and ship or kill by the end of the day. The meta-skill (running a fast loop) is not the meta-skill your FAANG reps developed.

  3. Absence of reps on the things that now matter. Eval design. Token-cost reasoning. Agent failure-mode debugging. Prompt refactoring. A candidate who hasn't done these in a real production context cannot credibly be screened for them. FAANG tenure didn't provide the reps because the tools didn't exist at the scale they do now.

None of this makes ex-FAANG candidates bad hires. Many are excellent. It makes the brand itself a weak proxy. Which is exactly the situation in which hiring managers over-weight it, because they're substituting a familiar heuristic for the harder work of assessing the actual skill.

What the 2018 Google PM looks like in the interview

Here are three specific patterns I've watched. None of these candidates are invented; all of them interviewed for real roles I was hiring for or advising on in the last nine months.

The first candidate was a senior PM at Google for four years, with two years at a mid-size company before that. Sharp. Articulate. Excellent at frameworks. When I asked about the last AI feature they shipped, the answer was "I worked with the applied AI team to define requirements for a RAG pipeline." When I asked what RAG configuration the team chose and why, the answer was a reasonable but generic recap of what RAG is. When I asked whether they'd run an eval on the output, the answer was "our science partners handled that."

Every individual answer was honest. The composite picture was a PM who had never closed the loop on an AI feature themselves, in a role that now requires closing that loop weekly.

The second candidate was a PM at a sub-50-person vertical SaaS company I'd never heard of. She talked me through the prompt architecture for three features in thirty minutes. She pulled up her own eval harness, showed me the seed examples, and explained why she'd moved from gpt-4 to a smaller open-weight model and taken a 2-point quality hit for a 70% cost reduction. When I asked what failure modes she'd seen in production, she had three, and a specific mitigation for each.

She was making $80K less than the Google PM was asking for. She would have been materially more productive on day one. The salary gap reflected the old pedigree economics, not the actual skill gap.

The third candidate was between the two. Ex-Atlassian, now two years at an AI-first startup. Could do the work. Understood the tool stack. His answers to AI-specific questions were sharper than the Google candidate's by a wide margin, roughly on par with the vertical SaaS candidate. The Atlassian stamp on his resume was irrelevant to his fit; what mattered was the two years after Atlassian.

This is the inversion in concrete form: the brand signal on the resume no longer correlates with interview performance. On the AI-specific dimensions, it's actively anti-correlated in the sample I've interviewed.

What the new interview looks like

The old PM interview leaned heavily on hypothetical product cases, stakeholder simulations, and behavioural questions about past wins. Those still have a place. They aren't load-bearing any more.

The interview that differentiates 2026 candidates has shifted to six specific tests. I run some version of all of them when I'm hiring.

Tool stack recency. What do you have open right now? Share your screen. Walk me through your last prompt-refactoring session. If the answer is "I have ChatGPT open sometimes," you've ended the interview. A serious candidate has Claude, Cursor, or an equivalent open during their workday and can show me what's in the tab.

Shipped-feature walkthrough. Pick the most recent AI feature you were the driver on. Walk me through the architecture. Why RAG? What chunk size? Which model? What's the fallback when the model fails? If the candidate can't answer three of these four, they weren't actually driving the feature, regardless of what their resume says.
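The kind of concrete answers this walkthrough is probing for can be sketched as a hypothetical RAG config. Every value, model name, and field below is an illustrative assumption, not a recommendation:

```python
# Hypothetical RAG config a driving PM should be able to defend line by line.
# All values and names here are illustrative assumptions.

RAG_CONFIG = {
    "chunk_size_tokens": 512,     # small enough for precise retrieval hits
    "chunk_overlap_tokens": 64,   # avoid splitting an answer across chunk boundaries
    "top_k": 5,                   # passages retrieved per query
    "primary_model": "large-model-v1",  # placeholder model name
    "fallback": {
        "trigger": "timeout_or_low_retrieval_score",
        "action": "answer_from_cache_or_decline",  # never free-generate without context
    },
}

def describe(config: dict) -> str:
    """Render the one-line summary a driving PM can give without notes."""
    return (f"{config['chunk_size_tokens']}-token chunks, top-{config['top_k']} "
            f"retrieval, fallback: {config['fallback']['action']}")

print(describe(RAG_CONFIG))
```

A candidate who actually drove the feature can explain why each of these numbers is what it is; a candidate who didn't can only recite what the fields mean.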

Eval literacy. Have you built an eval harness? Show me. If not, describe one you'd build for a specific feature I name. Specificity of the response tells you the experience level faster than any behavioural question. Candidates who've done this default to concrete examples immediately. Candidates who haven't lean on framework language.
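"Show me your eval harness" doesn't require anything elaborate. A minimal sketch of the shape I'd expect, with hypothetical seed examples and the cheapest possible scoring function (everything here is an illustrative assumption, not a real product's harness):

```python
# Minimal eval-harness sketch: seed examples, a scoring function, a pass-rate
# gate. Examples and the model stub are hypothetical.

SEED_EXAMPLES = [
    {"input": "Summarise this invoice dispute", "must_contain": ["invoice", "dispute"]},
    {"input": "Draft a refund policy reply", "must_contain": ["refund"]},
]

def score(output: str, must_contain: list[str]) -> bool:
    """Cheapest possible check: did the output mention the required terms?"""
    lowered = output.lower()
    return all(term in lowered for term in must_contain)

def run_eval(model_fn, examples=SEED_EXAMPLES, threshold=0.9) -> bool:
    """Run every seed example through the model and gate on pass rate."""
    passed = sum(score(model_fn(ex["input"]), ex["must_contain"]) for ex in examples)
    rate = passed / len(examples)
    print(f"pass rate: {rate:.0%}")
    return rate >= threshold

# A stub stands in for a real model call so the harness itself is runnable.
def stub_model(prompt: str) -> str:
    return f"Re: {prompt} — covering invoice, dispute, and refund details."

run_eval(stub_model)
```

Real harnesses graduate to semantic scoring and larger example sets, but a candidate who has built even this much will answer the follow-up questions concretely.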

Unit economics reasoning. What does your product cost per request? Walk me through the rough math. When did you last look at token spend for a feature you owned? Candidates who have never had responsibility for this can't fake it; the numbers feel wrong when they speak them.
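The rough math I want a candidate to walk through is back-of-envelope, not a spreadsheet. A sketch, with prices and token counts that are illustrative assumptions rather than current vendor rates:

```python
# Back-of-envelope per-request cost math. Prices and token counts are
# illustrative assumptions, not real vendor pricing.

def cost_per_request(input_tokens, output_tokens,
                     in_price_per_m, out_price_per_m):
    """Dollar cost of one request at given per-million-token prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# Hypothetical feature: 3,000-token prompt, 500-token completion.
big = cost_per_request(3000, 500, in_price_per_m=5.00, out_price_per_m=15.00)
small = cost_per_request(3000, 500, in_price_per_m=1.00, out_price_per_m=3.00)

print(f"large model: ${big:.4f}/request")
print(f"small model: ${small:.4f}/request")
print(f"cost reduction: {1 - small / big:.0%}")
```

A candidate who has owned this number can do the arithmetic at the whiteboard and knows which of prompt length, completion length, or model choice dominates their bill.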

Live prototyping. Spend 45 minutes in a working session. Task: prototype a feature the candidate hasn't seen before using AI tools they know. Not "code review their pull request." Actual shipping-adjacent work. Watch how they handle getting stuck. The signal is overwhelming and impossible to rehearse.

Failure-mode postmortem. Describe an AI feature that failed in production and what you did about it. Depth and specificity separate real operators from observers. Candidates who've never lived through a production AI failure can recognise one in the wild but can't describe the internals. The difference is obvious within two minutes.

These six tests take about two hours across two rounds. That's roughly the same time budget as a traditional PM loop. The difference is what they surface.

The FAANG screen problem

The most common hiring mistake I see right now is source-level filtering on prestige. "Only consider candidates from top-tier tech." "Minimum five years at a product-led company." "Bonus if they've worked with top AI teams."

Those filters made sense in 2019. In 2026, they exclude the candidates most likely to do the work well. A founder I've been advising in Brisbane filtered 140 applicants to a shortlist of eight using a pedigree screen. I asked him to re-run the screen on the rejected pool using recency of AI tool use as the primary filter. The recombined shortlist overlapped with his original list on exactly two names. The other six in his original shortlist were indistinguishable from the rest of the FAANG pool on AI-specific dimensions. The six candidates I'd added were materially better on those dimensions.

Six of the eight finalists under his original process were, in the end, the wrong picks. Not because any of them were bad. Because the screen was selecting for a prior-generation skill profile.

The adjustment isn't subtle. It's to stop using FAANG tenure as a positive signal and start using it as a neutral signal that requires independent verification on the AI-specific dimensions. The candidates who pass both filters (good big-tech reps AND recent AI work) are exceptional. The candidates who pass only the brand filter are increasingly common and increasingly over-priced.

What candidates should hear from this

If you're on the other side of this shift, the read is different. The brand on your resume is losing value relative to the shipping receipts you can produce. Two implications:

If you have the brand but not the receipts, build the receipts now. A weekend side project with AI tools, shipped to real users, with an eval harness you wrote yourself, does more for your next interview than three more quarters of incremental work at your prestige employer. The hiring manager doesn't care that it's a side project. They care that it's recent, that you can walk them through it, and that you actually did the work. This is what vibe coding looks like when it's real; it's not code golf, it's shipping.

If you have the receipts but not the brand, stop apologising. Some hiring pipelines will still filter you out; those pipelines are selecting against their own interest, and the pool of hiring managers who know this is growing quarterly. Write your resume around the AI features you shipped, the eval harnesses you built, and the production incidents you resolved. Lead with the concrete. Let brand-hungry pipelines self-select out of your funnel; they are not where the best work is happening.

The labour market will clear this eventually. Compensation will adjust. The skill-density arbitrage visible in hiring right now is large enough that the first movers (both companies and candidates) capture most of the excess return. This is the same pattern as the great talent swap happening at big tech, just expressed at the individual hire level.

The uncomfortable reframe for hiring managers

The question most hiring managers want to avoid asking is whether they, personally, could pass the new interview they'd need to run. Could you answer the eval literacy question? Do you know the token cost of the feature you greenlit last month? Have you shipped something with AI tools in the past 60 days yourself?

If the answer is no, the problem isn't your candidate pool. It's that you can't tell the difference between a good candidate and a great one on the dimensions that matter. A builder-leader can run this interview; a strategist who hasn't touched the tools cannot, regardless of how smart they are about product in general.

That's the uncomfortable part of the inversion. It doesn't just reprice the candidate market. It reprices the hiring manager as well. The FAANG brand on your candidate's resume is worth less than it used to be, and so is the one on yours.

The fix for both is the same: close the gap by doing the work, not by reading about it. The people doing this in both seats are the ones rebuilding the hiring economy around skill, not signal.

Frequently Asked Questions

Isn't this just anti-FAANG bias dressed up as insight?

No. The claim isn't that FAANG candidates are worse. It's that FAANG tenure is a weaker predictor than it used to be, because the reps developed there during the 2016–2022 window are calibrated for an operating model that AI-native companies have moved past. FAANG candidates who've done recent hands-on AI work still pass both filters and are exceptional hires. The shift is in how much weight to place on brand alone.

How do I assess eval literacy without being an expert myself?

Ask the candidate to walk you through one eval harness they've built, slowly. You don't need to know the internals to tell the difference between a real answer and a rehearsed one. Real answers have specifics (which examples, why those, what broke) and a readiness to go deeper. Rehearsed answers default to framework language and collapse under a specific follow-up question. The specificity is the signal.

What if I'm hiring for a non-builder senior role, like VP Product?

Same logic, higher stakes. A VP Product who can't reason about evals, tokens, and shipping cycles will systematically mis-hire for every role under them. The filter is weaker for responsibility breadth but stronger for hands-on credibility. The VPs who thrive in 2026 can do the six-test interview as candidates themselves, not just as interviewers.

Does this apply to senior engineering hires?

More so. The pedigree inversion is actually sharper on the engineering side because the delta between AI-native and AI-adopted engineering practice is larger than the delta on the PM side. An ex-FAANG senior engineer who last shipped production code in 2022 is almost invariably outperformed in working sessions by a three-year AI-native engineer. Same filter logic, more pronounced effect.


Related: Vibe Coding Is Real. It's Just Not What They're Selling You. and The Product Manager Is Dead. Long Live the Product Builder.


Logan Lincoln

Product executive and AI builder based in Brisbane, Australia. Nine years in regulated B2B SaaS, currently shipping production AI platforms. Written from experience building OpenChair as a solo operator.