Walker Griggs 2025-04-22 17:20:41

"Should technical candidates be allowed to use AI assistance in an interview?"

I've had the debate many times over the last two years. My initial stance has always been "of course not, I want to evaluate if they actually understand programming fundamentals." My stance here might be softening. Common responses I hear are

  • "This is the way people write code now and we should assess candidates in as close to 'real world' conditions as possible"
  • "Where is line between syntax highlighting, LSPs, and AI code-completion?"
  • "It should be obvious when a candidate doesn't understand the code they're generating"
  • "Cursor boosts your output; a productive engineer should always leverage the best tools"

My responses to those points range from "LSPs don't write the code on your behalf" and "code completion operates on syntax, not semantics" to "human-in-the-loop reduces critical reasoning." I personally find it difficult to discern meaningful signal about a candidate's level of understanding while they tap tab. What do you all think?

Andrew F 2025-04-22 18:05:11

I'm a stick in the mud, but I still say no. At least, there should be a phase where you see if they can actually write code, even if there's also a stage where they can use AI.

(The "where is the line" complaint is bordering on bad faith by the way. Autocomplete and syntax highlighting are based on glorified grep. The line is plenty clear if you don't have an interest in obscuring it.)

Kartik Agaram 2025-04-22 18:27:13

Interviews are a shitshow and largely throwing spaghetti on the wall to see what sticks. They're not completely random lotteries, but they have a significant component of chance. They are also obviously robust to mistakes; I see bad candidates hired all the time and yet the world moves on. Therefore, speaking from zero experience and reasoning purely from first principles (since this is a forum on the Internet):

It's not clear to me that decisions along this one dimension will dramatically change the quality of outcomes.

Marek Rogalski 2025-04-22 18:49:34

Define what you want to measure first. Once you do that it should be easier to say how AI affects the SNR of the measured property. IMO for measuring "coding skills" or "problem solving" AI obscures the candidate's skill. It might be interesting to use a system prompt that asks the AI to help in measuring the candidate's knowledge in specific areas, for example by deliberately injecting bugs (the candidate should be aware of this though).

Walker Griggs 2025-04-22 19:50:04

I do agree that technical interviews are often, if not always, misaligned with actual work product. At their worst, interviews are a pile of leetcode you'll never look at again. At their best, they're a toy piece of code that is intended to extract some form of signal.

I've started to narrow in on "code with AI allowed" and "debug / code review a buggy piece of code without AI." The thought being: the latter evaluates instincts, intuition, informal reasoning, etc., while the former assesses the ability to problem-solve and posit solutions. What this doesn't necessarily test is code architecture and organization, which is... probably important?

Bart Agapinan 2025-04-22 20:38:24

"Should students be able to use a calculator on a math test?"

I agree with Marek that you have to first define what it is you are looking to measure. And I feel like many interviewers have not done that work so then the interview ends up being a "feeling."

Are you looking for rote memorization of algorithms/leetcode exercises? Are you looking for someone who can level up your junior engineers because they're good at explaining things? Someone who won't argue with your hard-to-work-with VP because they don't ask tough questions, so you won't have to backfill that position again in another 6 months? Etc.

Jimmy Miller 2025-04-22 20:43:56

I've never understood how people find live coding interviews to be helpful. Are those really the conditions you need a candidate to perform well under? I am a lousy typist. I am not the fastest programmer. My method is always to start messy, explore, and refine code over time. I am bad at speaking out loud to someone I don't know while also thinking about my problem.

If you are going to insist on such an artificial, imperfect test of skill, why try and restrict it even further? Why not let them have the tools they'd actually use and see how they work?

But I really think these kinds of technical interviews are completely misguided. They aren't made to figure out the candidate's skills but to see if they are weak on one particular problem you happen to choose. It is especially bad when the interviewer knows the problem well, because they see the solution as obvious and conclude that anyone who doesn't get it isn't as smart as they are.

I try to make my interviews focus on someone's strengths. Is their background databases? Let's focus on that. Is their background frontend? Let's chat about that. Do they have a bunch of code on their GitHub? Let me do the work to understand it and talk to them about that. I want to learn about a candidate. I want to find out what they care about, what they know deeply, what they are interested in. And I want them to understand the needs and challenges we have and see if they have some ability to help. Asking someone to live code for you is never going to help you figure that out. Restricting them to an unfamiliar environment is going to make it even less likely that you get to know them.

Karl Toby Rosenberg 2025-04-22 21:41:09

autocomplete (reducing typing and memorization) isn't the same as some AI generating solutions to problems you're meant to solve using your own thought process. It's grunt work vs creativity. Setting aside that live coding is flawed anyway: if you use AI, you're turning the interview into "can you be a compiler that checks for errors?"

eh…

I agree with the above too. I'm slow and iterative, and I need to try a few things. That doesn't come out easily in an interview at all, and it might look like what I'm doing is bad if the measurement is just raw correctness and speed.

Ivan Reese 2025-04-23 03:25:45

Personally, I think the premise is flawed. I don't believe that the person you interact with in the interview bears even a passing resemblance to the person they'll be after working with you for a month. If they have a modicum of interest in programming, and seem enthusiastic, then just give them the benefit of the doubt and do some work together. You'll each learn a lot.

I guess another way of putting this: I don't believe in fixed vs growth mindset in general, but I strongly believe in something similar existing when it comes to working with people.

Walker Griggs 2025-04-23 17:22:31

I agree that technical interviews are a weak approximation of holistic ability and generally contain mixed signals. What's worse is when those misleading signals are interpreted or factored into the assessment -- typing, code style, etc. I do worry it's easy to throw the baby out with the bathwater, though. Without some indication of work product, you're stuck evaluating a candidate on 'vibes'. My favorite way to evaluate a candidate is to look at existing work product. Have they contributed to an OSS project, and are those contributions something you'd be excited to receive? If so, done and dusted. That's not always an option, though, when someone has contributed to closed-source projects for the bulk of their career.

I think Marek's response is the most accurate: "define what you want to measure first"

I don't believe that the person you interact with in the interview bears even a passing resemblance to the person they'll be after working with you for a month.

Absolutely! But I don't think these narrow technical interviews are meant to assess the person holistically. If you're looking for a broad signal using a narrow test, you're likely doing the candidate a disservice and not setting them up for success. But if you're focusing on narrow goals -- does the candidate have a grasp of the core fundamentals required to operate on relevant systems -- I think there is still meaningful signal to extract. Question is: how? Related, I think the responses "I'm a slow typist" and "it's difficult to talk through intent with new people" are orthogonal to the issue. A qualified interviewer isn't assessing typing speed and should factor in nerves.

If you are going to insist on such an artificial, imperfect test of skill, why try and restrict it even further?

I keep coming back to AP CompSci and similar academic courses. Most of those are still pencil + paper, which feels like an extreme restriction. I have to assume the goal is to extract a candidate's core understanding of underlying principles without assistance. The counterargument: the content you're assessing there is targeted and narrow, in contrast to a job interview.

To answer the question directly: in my opinion, you might restrict the test further if including those tools would further muddle or hide the signals you intend to measure. Which brings it back to: "define what you want to measure first"

"Should students be able to use a calculator on a math test?"

This absolutely feels like a faulty parallelism. You still need to understand what you're typing into a calculator to get an accurate response.

Scott 2025-04-23 17:34:20

I was actually just having this conversation with someone last night, and I think the "standard" interview questions of the past have kind of been used incorrectly as "here's a problem, get it right you pass, get it wrong you fail" when that doesn't matter that much - what you're really trying to assess is something closer to what Ivan is getting at: "do they have an enthusiastic interest in programming?" and "have they kept up with what the latest is with people who have enthusiastic interests in programming?", which has changed a lot over time.

At one time it was working with pointers, later it was working with hashes/hash maps, and these days it's a lot of functional/collection-based concepts, now that in a lot of languages hashes are a day-1 concept...

For me, the answer to your question is that how comfortable a candidate is at using AI in the interview is the primary thing I'm looking at. What techniques have they come up with or invented, how well do they know the different tools they're choosing and so on

Karl Toby Rosenberg 2025-04-23 17:44:26

I’ve always struggled with interviews, even on easyish questions. I can’t really talk with someone while coding or get the right answer straight away. When I actually code for myself, I’ll sort of vibe-out (not vibe code) and try molding what I’m doing.

Just… super unnatural to think out loud, because if I'm explaining out loud, I'm not thinking. I'm in some zombie presenter state. Parallel processing issue?

I wish I could pass some of these interviews.

shameless plug: looking for collaborations on creative cool things. Feelings of computing indeed

Bart Agapinan 2025-04-23 17:48:55

"Should students be able to use a calculator on a math test?"

This absolutely feels like a faulty parallelism. You still need to understand what you're typing into a calculator to get an accurate response.

Well, yes, of course, which is exactly my point. If you want to test someone's ability to hand calculate, you have to remove the calculator. But if you want to test if someone knows how to approach answering the question, regardless of their ability to calculate, you probably allow a calculator (I feel like I might have been a civil engineer except for my tendency for handwritten calculations in college to have an off-by-one or switched-sign error, something that a computer would not have done - I was using the correct formula... but probably I'm just bitter about my own past mistakes)

I actually think using an LLM can potentially tell you more about how a candidate thinks, since it may generate a solution that needs to be refactored or tweaked because it doesn't handle all of the edge cases you'd want, which is something you wouldn't see in a standard technical interview. But it's also something you have less control over as an interviewer, since you don't know what the LLM will produce.

In any case, I think this question will be less and less relevant in the future as LLMs and tools improve and eventually there will just be an expectation that everyone uses them.

Walker Griggs 2025-04-23 17:50:32

For me, the answer to your question is that how comfortable a candidate is at using AI in the interview is the primary thing I'm looking at.

Interesting, so you like to double down and evaluate the candidate's use of AI as a tool in the modern toolbelt? That's aligned with the feedback I've heard of "a productive engineer should always leverage the best tools".

At one time it was working with pointers, later it was working with hashes/hash maps, and these days it's a lot of functional/collection-based concepts, now that in a lot of languages hashes are a day-1 concept...

I often think about this too! My framing is generally in terms of "abstraction layers". I like to ask myself "does this candidate understand one or two layers below the surface" or "could they explain how N is meant to work." Sorta like Gary Bernhardt's "Destroy All Software" project. As we add new abstraction layers though, the relative "fundamentals" follow as well

Karl Toby Rosenberg 2025-04-23 17:51:41

The calculator analogy doesn’t work and I’ve seen it so many times.

I think using an LLM is harder than just solving it on your own because now you have to debug the output and check for errors in something that you might not even fully understand. I’d want to evaluate someone based on how they design a solution to a problem end to end, even if there are fumbles.

I wouldn’t expect everyone to use these since they can easily introduce technical debt.

If anything, you could introduce using a model/any automation tool as part of an interview to see how someone adapts to learning a new tool. See how someone pokes at something.

But in the end I want to know how someone is when they’re thinking through something with me, as if the tools and supports weren’t there. (I wouldn’t consider an llm a great support for many problems anyway.) An ideal interview is collaborative towards solving a problem and designing something.

Walker Griggs 2025-04-23 17:55:57

"Should students be able to use a calculator on a math test?"

This absolutely feels like a faulty parallelism. You still need to understand what you're typing into a calculator to get an accurate response.

Well, yes, of course, which is exactly my point. If you want to test someone's ability to hand calculate, you have to remove the calculator. But if you want to test if someone knows how to approach answering the question, regardless of their ability to calculate, you probably allow a calculator

I see what you're getting at! Sorry, I think I misunderstood your initial response. This framing aligns with, I think, what I just posted about "relative levels of abstraction." The "fundamentals" are relative to the current tools of the trade.

Do you think it's important that a candidate understands an increasing relative depth as the tools become further removed from the core technology? For example, I could argue it's important for an eng to understand the trade-offs between "passing by value vs reference" regardless of how advanced the tools become.
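
A minimal Go sketch of the "pass by value vs reference" trade-off mentioned above (the Config struct and function names are hypothetical, for illustration only):

    package main

    import "fmt"

    type Config struct {
        Retries int
    }

    // byValue receives a copy of the struct: no aliasing to reason about,
    // but mutations are lost and large structs get copied.
    func byValue(c Config) {
        c.Retries = 5
    }

    // byReference receives a pointer: the caller sees the mutation and no
    // copy is made, at the cost of shared, aliased state.
    func byReference(c *Config) {
        c.Retries = 5
    }

    func main() {
        cfg := Config{Retries: 1}
        byValue(cfg)
        fmt.Println(cfg.Retries) // still 1
        byReference(&cfg)
        fmt.Println(cfg.Retries) // now 5
    }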

Walker Griggs 2025-04-23 17:58:55

But in the end I want to know how someone is when they’re thinking through something with me

➕ I've never dinged a candidate for not completing the given problem if they've demonstrated a solid understanding of the problem, articulated their approach clearly, laid out a framework for success, etc. I suppose this is true and possible regardless of whether the candidate uses AI or not.

At the end of the day, it's a conversation. "Does it compile and pass tests" is a stinky approach.

Karl Toby Rosenberg 2025-04-23 18:00:11

Yeah, agreed, although I think using AI skips the point of explaining your thought process. Granted, I've struggled with multiprocessing (thinking and explaining at once) during such interviews.

Scott 2025-04-23 18:00:19

Interesting, so you like to double down and evaluate the candidate's use of AI as a tool in the modern toolbelt?

Yeah exactly, LLMs and LLM-based coding require a much different mindset and are so new that best practices are still being discovered. I suspect it's going to be a long time before it feels like we really know what we're doing again, so someone who's curious and experimenting and testing out ideas (or keeping up with the latest ones and incorporating them into their process) would be my primary criterion right now...

As we add new abstraction layers though, the relative "fundamentals" follow as well

Yeah, it's an exciting time to see this new layer form 🙂

Bart Agapinan 2025-04-23 18:41:01

I think using an LLM is harder than just solving it on your own because now you have to debug the output and check for errors in something that you might not even fully understand. I’d want to evaluate someone based on how they design a solution to a problem end to end, even if there are fumbles.

Yeah, but here you're also describing "legacy code", which I'd assume your current code base has plenty of.

But also, I think LLMs are probably a better fit for well known domains where there's a lot of training data, and I imagine the audience of this slack may skew toward more novel domains where the LLMs just give nonsense recommendations

Karl Toby Rosenberg 2025-04-23 18:43:33

Yes, nonsense recommendations in novel domains. Also, yes, legacy code is probably comparable, but I don't know if I'd want to evaluate someone trying to look at the equivalent of legacy code rather than coming up with something together. An LLM in the middle seems a bit weird. "Hold on, interviewer, I have to inspect this code." That leaves a bit of an awkward gap versus just showing the full thought process. I'm of course a bit guilty of disliking the common process for other reasons.

Kartik Agaram 2025-04-23 19:41:36

One thing I didn't think of when this thread was created, and now it's too late to do anything about it, but just for reference: I think this thread belongs in #present-company. Not because it's about AI, but because it's about interviews.

With that assumption I have a couple of things to say:

  • I've always punched above my weight class in coding interviews. Which is to say, employers who hired me on the basis of a coding interview usually overestimated my capabilities. I think this one skill has contributed more to my personal bottom line than anything else. So I'm a long-time beneficiary of dysfunctional interviewing practices.
  • I'm currently preparing for an interview at a place that requires a coding interview at the level of "leetcode medium or hard." Leetcode hard is really hard. There is no way in hell I can do a leetcode hard problem in 60 minutes. In fact, if I get a leetcode hard problem I suspect the interviewer is mostly going to see me hemming and hawing for 60 minutes. If I wasn't stuck at a desk in front of a window containing a camera feed I'd be lying down, or pacing the room, or tossing a ball against a wall. At least in person interviews let me scribble on a whiteboard.
  • I've had to deal with leetcode hard interviews before, but it was usually a surprise in the past, and those employers weeded themselves out of my consideration. I've never tried to train for this level. And it has given me greater empathy for others who haven't had my good luck, those who are penalized rather than advantaged by the current interview process. Like the people who need to toss a ball against a wall for leetcode medium problems. The problems themselves are fun, but they seem terrible in the context of an interview.
  • It's all very well to say companies aren't doing interviews right. But what matters is the practice, not the ideal. Companies hire engineers, not "qualified interviewers." As an industry we have no rigorous rubric for interviewers, for their knowledge, their discernment, their curiosity about a candidate, their motivation to tease apart fine detail. For the most part, doing interviews well is not a priority for companies, and it's not clear that it should be. It doesn't obviously increase their fitness to their environment. It's not clear to me that adding LLMs changes that equation. Though it's still early days and there are going to be tons of uses for LLMs that my puny brain can't even conceive of right now.

Don Abrams 2025-04-27 21:13:31

I assume in an interview that you want to evaluate how the interviewee would likely perform in a given role.

If they can use AI in their role, then yes. If not, no.

To evaluate basic coding skills, I tend to use Soloway's Rainfall Problem and make sure they can walk through how their solution works, preferably with the correct vocabulary. I really need to check if ChatGPT can do that yet.

For architecture/system structure, I usually don't have people write code. I'm usually looking for tradeoff selection. I don't see AI helping here.

For a debugging/analysis check, I usually do two things: A) have them look over a PR and provide feedback, which would be interesting to see them use AI on. B) Have them tackle a previous big bug in a system they don't know with another engineer, basically just making sure they can work with someone binary-search style. AI would be too specific here and fail.
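
For reference, Soloway's Rainfall Problem mentioned above asks for the average of a sequence of rainfall readings, stopping at a sentinel value and ignoring negative readings. A minimal Go sketch, assuming the classic 99999 sentinel:

    package main

    import "fmt"

    // averageRainfall reads values until the 99999 sentinel, skips negative
    // readings, and returns the average of the rest. ok is false when there
    // were no valid readings to average.
    func averageRainfall(readings []int) (avg float64, ok bool) {
        sum, count := 0, 0
        for _, r := range readings {
            if r == 99999 {
                break
            }
            if r < 0 {
                continue
            }
            sum += r
            count++
        }
        if count == 0 {
            return 0, false
        }
        return float64(sum) / float64(count), true
    }

    func main() {
        fmt.Println(averageRainfall([]int{3, -1, 4, 99999, 7})) // 3.5 true
    }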