When using Next.js with next-auth (auth.js) and a backend API (e.g., via tRPC), you may encounter a race condition during JWT token refresh. This typically happens when:
- Middleware checks the user's session and refreshes the access token if expired.
- Immediately after, the client (e.g., tRPC) makes a request using the (now stale) access token, triggering another refresh attempt.
- Both refresh attempts use the same old refresh token, but only the first one succeeds. The second fails with a 401 error because the refresh token has already been invalidated.
This results in failed API requests and a poor user experience. A typical sequence looks like this:
- User navigates to a protected page.
- Middleware checks the session, sees the access token is expired, and refreshes it.
- Navigation proceeds.
- Client-side data fetching (e.g., tRPC) immediately requests data, but still uses the old (now invalid) access token.
- The backend receives a second refresh request with the same old refresh token, which is now invalid, and returns a 401 error.
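
To make the failure concrete, here is a minimal simulation of the sequence above. It is only a sketch: the in-memory `validRefreshTokens` set, the `rt-`/`at-` token names, and `simulateRace` are all hypothetical stand-ins for a real backend with rotating (single-use) refresh tokens.

```javascript
// Hypothetical in-memory stand-in for the backend's refresh-token store.
const validRefreshTokens = new Set(['rt-1']);
let counter = 1;

// Rotating refresh: each refresh token may be used exactly once.
async function refreshTokens(oldRefreshToken) {
  if (!validRefreshTokens.has(oldRefreshToken)) {
    const err = new Error('Invalid refresh token');
    err.status = 401;
    throw err;
  }
  validRefreshTokens.delete(oldRefreshToken); // invalidate the old token
  counter += 1;
  const tokens = { accessToken: `at-${counter}`, refreshToken: `rt-${counter}` };
  validRefreshTokens.add(tokens.refreshToken);
  return tokens;
}

// The middleware and the client both try to refresh with the same old token.
async function simulateRace() {
  return Promise.allSettled([
    refreshTokens('rt-1'), // middleware
    refreshTokens('rt-1'), // client (e.g., tRPC)
  ]);
}
```

Calling `simulateRace()` settles with one fulfilled result (the new token pair) and one rejection carrying `status: 401`, which is the error the user ends up seeing.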
To prevent this, implement a short-lived cache (e.g., in Redis) that stores the result of a successful token refresh, keyed by the old refresh token. If a second refresh request comes in with the same token within a short window (e.g., 1 second), return the cached result instead of failing.

    // Sketch: assumes an ioredis-style client in `redis` and a
    // generateAndStoreTokens() helper defined elsewhere in the backend.
    async function refreshTokens(oldAccessToken, oldRefreshToken) {
      const raceKey = `raceConditionHelper:${oldRefreshToken}`;
      const cached = await redis.get(raceKey);
      if (cached) {
        // Return the cached result to defeat the race condition
        return JSON.parse(cached);
      }
      // Validate the refresh token as usual
      const valid = await redis.get(`refreshToken:${oldRefreshToken}`);
      if (!valid) throw new Error('Invalid refresh token');
      // Invalidate old tokens, generate new ones
      const newTokens = await generateAndStoreTokens();
      // Redis stores strings, so serialize the result; 'EX', 1 sets a
      // 1-second TTL (ioredis-style arguments, adjust for your client)
      await redis.set(raceKey, JSON.stringify(newTokens), 'EX', 1);
      return newTokens;
    }

- First refresh request: Proceeds as normal, generates new tokens, and caches the result for 1 second.
- Second (racing) refresh request: Within 1 second (in my case even 500ms worked just fine), finds the cached result and returns it, avoiding a 401 error.
- Short TTL: The cache should expire quickly (e.g., 1 second) to avoid security issues.
- No sensitive data: Only cache what you would normally return from the refresh endpoint.
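
The behavior described above can be tried without a Redis instance. The following is a rough, self-contained sketch that swaps Redis for an in-memory `Map` with manual expiry; `raceCache`, `RACE_TTL_MS`, `getCached`, and the injectable `now` parameter are assumptions made for illustration, not part of my actual backend.

```javascript
// Hypothetical in-memory stand-ins; a real deployment would use Redis.
const validRefreshTokens = new Set(['rt-1']);
const raceCache = new Map(); // old refresh token -> { tokens, expiresAt }
const RACE_TTL_MS = 1000;
let counter = 1;

// Return the cached refresh result if it has not expired yet.
function getCached(oldRefreshToken, now) {
  const entry = raceCache.get(oldRefreshToken);
  if (!entry) return null;
  if (entry.expiresAt <= now) {
    raceCache.delete(oldRefreshToken);
    return null;
  }
  return entry.tokens;
}

// `now` is injectable so the TTL can be exercised without real waiting.
async function refreshTokens(oldRefreshToken, now = Date.now()) {
  const cached = getCached(oldRefreshToken, now);
  if (cached) return cached; // racing request: reuse the first result

  if (!validRefreshTokens.has(oldRefreshToken)) {
    throw new Error('Invalid refresh token');
  }
  validRefreshTokens.delete(oldRefreshToken); // rotate the token
  counter += 1;
  const tokens = { accessToken: `at-${counter}`, refreshToken: `rt-${counter}` };
  validRefreshTokens.add(tokens.refreshToken);

  // Remember the result under the OLD refresh token for a short window.
  raceCache.set(oldRefreshToken, { tokens, expiresAt: now + RACE_TTL_MS });
  return tokens;
}
```

With this sketch, `refreshTokens('rt-1', 0)` followed by `refreshTokens('rt-1', 500)` returns the same token pair twice, while a third call at `now = 1500` is rejected because the cache entry has expired and the old token is no longer valid.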
Some may worry that allowing multiple refresh attempts with the same refresh token (within a short window) could introduce security risks. However, this approach remains secure for several reasons:
- HTTPS Only: All token refresh requests are transmitted over HTTPS, preventing interception or tampering by third parties.
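- Secure Storage: next-auth stores the refresh token inside a JWE (JSON Web Encryption) session token, which is set as a Secure, HTTP-only cookie. This means:
  - The token is inaccessible to client-side JavaScript, which mitigates theft via XSS.
  - The cookie's SameSite attribute (Lax by default in next-auth) limits exposure to CSRF. Note that HTTP-only on its own does not protect against CSRF.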
- Short-lived Cache: The race condition helper cache is set with a very short TTL (e.g., 1 second), minimizing the window for any potential misuse.
- Token Confidentiality: Only the legitimate user’s browser has access to the refresh token. Without access to this token (meaning direct access to the user's machine), an attacker cannot successfully trigger the refresh endpoint, even if they know about the race condition helper.
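
For readers unfamiliar with the cookie attributes mentioned above, here is a small illustrative sketch of what such a `Set-Cookie` header looks like. next-auth builds its session cookie itself, so this is not its internal code; `buildSessionCookie`, the cookie name, and the `maxAgeSeconds` default are hypothetical.

```javascript
// Build a Set-Cookie header value with the attributes discussed above.
function buildSessionCookie(name, value, maxAgeSeconds = 3600) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    'Path=/',
    `Max-Age=${maxAgeSeconds}`,
    'HttpOnly',     // not readable from client-side JavaScript
    'Secure',       // only sent over HTTPS
    'SameSite=Lax', // not sent on most cross-site subrequests
  ].join('; ');
}
```

A call such as `buildSessionCookie('session-token', encryptedJwe)` would produce a header whose value the browser stores but never exposes to page scripts.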
If anyone does NOT agree with my approach, I am happy to hear feedback (as this is the system I am using for a big project).