
Refactoring Legacy Code with AI: From Spaghetti to Clean in 30 Minutes

Turn 500 lines of spaghetti code into clean, tested modules in 30 minutes. The AI-assisted refactoring workflow that actually works.

Reading time: 6 minutes | Category: Workflow & Productivity | Published: January 11, 2026

The Legacy Code Problem

You inherit:

  • 500-line functions
  • Zero tests
  • No documentation
  • Mysterious bugs
  • Fear of changing anything

Traditional refactoring: 3-5 days of careful work, high risk

AI-assisted refactoring: 30-60 minutes, tested and verified

The 5-Step AI Refactoring Workflow

Step 1: Understand (5 min)

Don’t start refactoring blind. Ask AI:

Analyze this function and explain:
1. What it does (high level)
2. Main responsibilities (should be 1, probably 5+)
3. Hidden dependencies
4. Potential bugs
[paste legacy code]

AI reveals:

This function does 7 things:
1. Validates user input
2. Calls database
3. Transforms data
4. Sends email
5. Logs events
6. Updates cache
7. Returns response
⚠️ Violates Single Responsibility Principle
⚠️ No error handling on DB call (line 47)
⚠️ Email sending blocks response (performance issue)

Now you know what you’re dealing with.

Step 2: Generate Tests (10 min)

Before touching code, lock in current behavior:

Generate comprehensive tests for this function that verify:
- All current behaviors (even bad ones)
- Edge cases
- Error scenarios
[paste legacy code]
Framework: Jest

AI generates:

  • 15-20 test cases
  • Covers existing behavior
  • Baseline for refactoring

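Run tests: All should pass, even the ones that pin down bad behavior. If a generated test fails against the untouched code, the test is wrong about what the code does. Fix the test, not the code.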
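A characterization test records what the code does today, not what it should do. The sketch below shows the idea without a test framework so it runs on its own. `legacyTotal`, its negative-quantity quirk, and `findRegressions` are hypothetical stand-ins, not code from a real project. In Jest, the same table of cases would typically feed `it.each`.

```typescript
// Hypothetical legacy function. Note the quirk: a negative quantity
// yields a negative total instead of an error.
function legacyTotal(price: number, quantity: number): number {
  let total = price * quantity;
  if (quantity >= 10) {
    total = total * 0.9; // bulk discount
  }
  return Math.round(total * 100) / 100;
}

interface CharacterizationCase {
  label: string;
  args: [number, number];
  expected: number; // observed output of the CURRENT code, not the ideal one
}

const cases: CharacterizationCase[] = [
  { label: "single item", args: [19.99, 1], expected: 19.99 },
  { label: "bulk discount at 10", args: [5, 10], expected: 45 },
  { label: "legacy behavior: negative quantity", args: [5, -2], expected: -10 },
];

// Returns the labels of cases whose output no longer matches the baseline.
function findRegressions(
  fn: (price: number, quantity: number) => number,
  baseline: CharacterizationCase[],
): string[] {
  return baseline
    .filter((c) => fn(...c.args) !== c.expected)
    .map((c) => c.label);
}

console.log(findRegressions(legacyTotal, cases)); // prints []
```

An empty list means the refactored function can be swapped in for `legacyTotal` and checked against the same baseline.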

Step 3: Extract Functions (8 min)

Ask AI to break it down:

Refactor this into smaller functions following Single Responsibility:
1. Extract validation → validateUserInput()
2. Extract DB logic → fetchUserData()
3. Extract email → sendWelcomeEmail()
4. Extract logging → logUserAction()
5. Keep main function as orchestrator
Keep same external behavior.
[paste code]

AI returns:

// Before: 500 lines, 1 function
function processUser(data) { ... }

// After: 5 focused functions
function validateUserInput(data) { ... }
function fetchUserData(userId) { ... }
function sendWelcomeEmail(user) { ... }
function logUserAction(action, user) { ... }

function processUser(data) {
  const validated = validateUserInput(data);
  const user = fetchUserData(validated.id);
  logUserAction('process', user);
  sendWelcomeEmail(user);
  return user;
}

Run tests again: Should still pass

Step 4: Add Error Handling (5 min)

Add proper error handling to these functions:
- validateUserInput: throw ValidationError
- fetchUserData: handle DB failures
- sendWelcomeEmail: handle email failures (don't block)
[paste refactored code]

AI adds:

  • Try-catch blocks
  • Specific error types
  • Non-blocking email (async)

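Run tests: Existing tests should still pass. Add or update tests for the new error paths, since this step deliberately changes how failures are handled.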
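One possible shape for the result of this step is sketched below. `ValidationError`, `DatabaseError`, `UserStore`, and `Mailer` are hypothetical names introduced for illustration. The idea is that validation throws a specific error type, database failures are wrapped with context, and a failed email is reported without failing the request.

```typescript
// Hypothetical error types; the names are illustrative.
class ValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ValidationError";
  }
}

class DatabaseError extends Error {
  constructor(message: string, readonly originalError: unknown) {
    super(message);
    this.name = "DatabaseError";
  }
}

interface User { id: string; email: string; }
// Assumed interfaces standing in for the real database and email clients.
interface UserStore { findById(id: string): Promise<User>; }
interface Mailer { sendWelcome(user: User): Promise<void>; }

function validateUserInput(data: unknown): { id: string } {
  if (typeof data !== "object" || data === null) {
    throw new ValidationError("body must be an object");
  }
  const id = (data as { id?: unknown }).id;
  if (typeof id !== "string" || id.length === 0) {
    throw new ValidationError("id is required");
  }
  return { id };
}

async function fetchUserData(store: UserStore, id: string): Promise<User> {
  try {
    return await store.findById(id);
  } catch (err) {
    // Wrap the low-level failure so callers can tell it apart from bad input.
    throw new DatabaseError(`lookup failed for user ${id}`, err);
  }
}

// Starts the send but does not await it; failures go to onError.
function sendWelcomeEmail(
  mailer: Mailer,
  user: User,
  onError: (err: unknown) => void,
): void {
  mailer.sendWelcome(user).catch(onError);
}
```

Passing the store and mailer in as arguments is a design choice made here to keep the sketch testable; the legacy code may reach them through globals instead.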

Step 5: Optimize & Document (2 min)

Final pass:
1. Add JSDoc comments to each function
2. Identify performance improvements
3. Suggest typing (TypeScript)
[paste code]

Done: Clean, documented, tested code
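For a sense of what the final pass produces, here is a small documented and typed function. `UserInput` and `normalizeUserInput` are hypothetical examples, not part of the workflow's output for any particular codebase.

```typescript
/** Shape of the data the function needs, inferred from how callers use it. */
interface UserInput {
  email: string;
  name: string;
}

/**
 * Normalizes user input before it is stored.
 *
 * @param input - Raw values from the request body.
 * @returns A copy with the email trimmed and lowercased and the name trimmed.
 * @throws {Error} If the name is empty after trimming.
 */
function normalizeUserInput(input: UserInput): UserInput {
  const name = input.name.trim();
  if (name.length === 0) {
    throw new Error("name must not be empty");
  }
  return { email: input.email.trim().toLowerCase(), name };
}
```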

Real Example: API Endpoint Refactor

Before (Legacy Code)

// 280 lines, no tests, no types
app.post('/api/users', async (req, res) => {
  // Validation (40 lines)
  if (!req.body.email) { ... }
  if (!req.body.name) { ... }
  // ... 38 more lines

  // Database (60 lines)
  const connection = await db.connect();
  const user = await connection.query(...);
  // ... 58 more lines

  // Email sending (80 lines)
  const transporter = nodemailer.createTransport(...);
  await transporter.sendMail(...);
  // ... 78 more lines

  // Logging (50 lines)
  const logger = new Logger();
  logger.info(...);
  // ... 48 more lines

  // Response (50 lines)
  res.json({ ... });
});

Problems:

  • Impossible to test
  • Impossible to reuse
  • Impossible to understand
  • One bug breaks everything

After (AI-Refactored)

30 minutes later:

// users/validation.ts
import { z } from 'zod';

const userSchema = z.object({
  email: z.string().email(),
  name: z.string().min(2),
});

export type ValidatedUser = z.infer<typeof userSchema>;

export function validateUserInput(data: unknown): ValidatedUser {
  // Throws ZodError when the input does not match the schema
  return userSchema.parse(data);
}

// users/repository.ts
export async function createUser(user: ValidatedUser): Promise<User> {
  return db.users.create({ data: user });
}

// users/notifications.ts
export async function sendWelcomeEmail(user: User): Promise<void> {
  // Only enqueues the job; a worker sends the email later
  await emailQueue.add({ user, template: 'welcome' });
}

// users/logger.ts
export function logUserCreation(user: User): void {
  logger.info('User created', { userId: user.id });
}

// routes/users.ts (orchestrator)
import { ZodError } from 'zod';

app.post('/api/users', async (req, res) => {
  try {
    const validated = validateUserInput(req.body);
    const user = await createUser(validated);

    // Not awaited, so the response is not delayed; a failed enqueue is logged
    sendWelcomeEmail(user).catch((err) => logger.error('Welcome email not queued', { err }));
    logUserCreation(user);

    res.json({ user });
  } catch (error) {
    if (error instanceof ZodError) {
      return res.status(400).json({ error: error.issues });
    }
    res.status(500).json({ error: 'Internal error' });
  }
});

With tests:

describe('validateUserInput', () => {
  it('accepts valid user data', () => { ... });
  it('rejects invalid email', () => { ... });
  // ... 15 more tests
});

describe('createUser', () => {
  it('creates user in database', () => { ... });
  it('handles duplicate email', () => { ... });
  // ... 10 more tests
});

Results:

  • ✅ 280 lines → 120 lines (with tests)
  • ✅ 1 file → 5 focused modules
  • ✅ 0 tests → 25 tests
  • ✅ No types → Full TypeScript
  • ✅ 30 minutes to refactor

Prompt Library for Refactoring

Understanding Legacy Code

Analyze this code:
1. What does it do?
2. How many responsibilities?
3. Hidden dependencies?
4. Potential bugs?
[code]

Generating Safety Net Tests

Generate tests that verify current behavior:
- All code paths
- Edge cases
- Current bugs (mark as "legacy behavior")
Framework: [Jest/Mocha/etc]
[code]

Extracting Functions

Extract these responsibilities into separate functions:
1. [responsibility 1] → [functionName1]
2. [responsibility 2] → [functionName2]
Keep external behavior identical.
[code]

Adding Error Handling

Add production-ready error handling:
- Specific error types
- Graceful degradation
- Logging
- User-friendly messages
[code]

TypeScript Migration

Convert to TypeScript:
- Infer types from usage
- Add interfaces for data shapes
- Type all function signatures
[JavaScript code]

Time Comparison

Task | Manual | AI-Assisted | Saved
Understand code | 30 min | 5 min | 25 min
Write tests | 60 min | 10 min | 50 min
Extract functions | 45 min | 8 min | 37 min
Add error handling | 30 min | 5 min | 25 min
Add types/docs | 25 min | 2 min | 23 min
Total | 190 min | 30 min | 160 min

84% time savings
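The percentage follows directly from the totals: 160 minutes saved out of 190 manual minutes. As a quick check, using the per-task minutes from the table above (the `summarize` helper is just for illustration):

```typescript
// [task, manual minutes, AI-assisted minutes], copied from the table
const tasks: Array<[string, number, number]> = [
  ["Understand code", 30, 5],
  ["Write tests", 60, 10],
  ["Extract functions", 45, 8],
  ["Add error handling", 30, 5],
  ["Add types/docs", 25, 2],
];

function summarize(rows: Array<[string, number, number]>) {
  const manual = rows.reduce((sum, [, m]) => sum + m, 0);
  const assisted = rows.reduce((sum, [, , a]) => sum + a, 0);
  const saved = manual - assisted;
  // Share of the manual time that is no longer needed, as a whole percent
  const percentSaved = Math.round((saved / manual) * 100);
  return { manual, assisted, saved, percentSaved };
}

console.log(summarize(tasks));
// { manual: 190, assisted: 30, saved: 160, percentSaved: 84 }
```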

Common Pitfalls

❌ Refactoring Without Tests: Always generate tests first. They’re your safety net.

✅ Tests → Refactor → Verify

❌ Changing Behavior During Refactor: Refactoring means the same behavior with a better structure.

✅ Keep behavior identical (improve it in a separate PR)

❌ Trusting AI Blindly: AI can miss edge cases or introduce bugs.

✅ Review, test, verify every change
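One way to check "same behavior, better structure" mechanically is to run the old and new implementations side by side on the same inputs. A minimal sketch, where `slugifyLegacy`, `slugifyRefactored`, and `findDifferences` are hypothetical examples rather than code from this article:

```typescript
// Hypothetical legacy function. Quirk: input ending in punctuation
// leaves a trailing "-" in the result.
function slugifyLegacy(title: string): string {
  let out = "";
  for (let i = 0; i < title.length; i++) {
    const ch = title.charAt(i).toLowerCase();
    const isAlnum = (ch >= "a" && ch <= "z") || (ch >= "0" && ch <= "9");
    if (isAlnum) {
      out += ch;
    } else if (out.length > 0 && out.charAt(out.length - 1) !== "-") {
      out += "-";
    }
  }
  return out;
}

// Refactored version that keeps the quirk on purpose.
function slugifyRefactored(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-/, "");
}

// Returns the inputs on which the two versions disagree.
// Uses !==, so it only suits functions that return primitives.
function findDifferences<T, R>(
  oldFn: (input: T) => R,
  newFn: (input: T) => R,
  inputs: T[],
): T[] {
  return inputs.filter((input) => oldFn(input) !== newFn(input));
}

const sampleTitles = ["Hello, World!", "  Leading spaces", "already-a-slug", ""];
console.log(findDifferences(slugifyLegacy, slugifyRefactored, sampleTitles)); // prints []
```

Agreement on a handful of samples is evidence, not proof. The wider the input list (ideally taken from real traffic or logs), the more the check is worth.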

Success Checklist

After refactoring, verify:

  • □ All original tests still pass
  • □ New tests cover extracted functions
  • □ Each function has one clear purpose
  • □ Error handling is comprehensive
  • □ Code is documented
  • □ No performance regressions

Advanced: Refactoring Strategy

For very large legacy code (1000+ lines):

Week 1: Add Tests

  • Don’t change code
  • Just add characterization tests
  • Lock in current behavior

Week 2: Extract Pure Functions

  • Start with functions that don’t have side effects
  • Easy to test, low risk
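To see why pure functions are the low-risk starting point, the sketch below pulls a calculation out of a function that also writes output. All names are hypothetical.

```typescript
interface LineItem {
  price: number;
  quantity: number;
}

// Before: the calculation and the side effect are tangled together.
function printInvoiceLegacy(items: LineItem[], log: (line: string) => void): void {
  let total = 0;
  for (const item of items) {
    total += item.price * item.quantity;
  }
  log(`Total: ${total.toFixed(2)}`);
}

// After: the pure part takes plain inputs and returns a plain output,
// so it can be tested without capturing any logging.
function invoiceTotal(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

function printInvoice(items: LineItem[], log: (line: string) => void): void {
  log(`Total: ${invoiceTotal(items).toFixed(2)}`);
}
```

The extracted `invoiceTotal` has no dependencies to mock, which is what keeps this week cheap.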

Week 3: Extract Stateful Logic

  • Database, API calls, etc.
  • Higher risk, test thoroughly
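A common way to lower the risk of stateful code is to put the database behind an interface and test against an in-memory fake. The sketch below assumes that approach; `UserRepository`, `InMemoryUserRepository`, `DuplicateEmailError`, and `registerUser` are hypothetical names.

```typescript
interface User {
  id: number;
  email: string;
}

// The only database operations the business logic is allowed to use.
interface UserRepository {
  findByEmail(email: string): Promise<User | undefined>;
  insert(email: string): Promise<User>;
}

class DuplicateEmailError extends Error {
  constructor(email: string) {
    super(`email already registered: ${email}`);
    this.name = "DuplicateEmailError";
  }
}

// Logic extracted from the legacy function; it never touches a real connection.
async function registerUser(repo: UserRepository, email: string): Promise<User> {
  const existing = await repo.findByEmail(email);
  if (existing) {
    throw new DuplicateEmailError(email);
  }
  return repo.insert(email);
}

// Test double that keeps users in an array instead of a database.
class InMemoryUserRepository implements UserRepository {
  private users: User[] = [];

  async findByEmail(email: string): Promise<User | undefined> {
    return this.users.find((u) => u.email === email);
  }

  async insert(email: string): Promise<User> {
    const user: User = { id: this.users.length + 1, email };
    this.users.push(user);
    return user;
  }
}
```

The production implementation of `UserRepository` would wrap the real database client. A fake like this does not replace integration tests against the real database, which is why this week still calls for thorough testing.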

Week 4: Clean Up Main Function

  • Now it’s just orchestration
  • Easy to understand

Next Steps

Today:

  1. Find one 200+ line function
  2. Run it through Step 1 (Understanding)
  3. Generate tests

This Week:

  1. Complete full 5-step refactor
  2. Measure time vs manual approach
  3. Share results with team


Start now: Find the scariest function in your codebase. Paste it into AI with “Analyze this code: 1. What does it do? 2. How many responsibilities?” You’ll be surprised what you learn.