Case study

RSS Scraper — frontend

A Next.js web app for managing website scraping jobs and RSS feed generation. Authenticated users view configured sites, trigger scrapes, preview feed items, and copy generated RSS URLs. The Next.js layer acts as a Backend-for-Frontend (BFF): the browser calls same-origin API routes, and Next.js forwards each request to an Express API with a JWT read from an HTTP-only cookie, never from localStorage.

Built for PEI Obituaries — Prince Edward Island news and obituary sources — with an architecture that supports any sites configured in the backend.

Next.js 16 · React 19 · TypeScript · Tailwind CSS 4 · Radix UI · Lucide

Architecture

System architecture

BFF keeps JWT off the client and out of localStorage

Same-origin only

Browser: React dashboard
credentials: include · /sites · /login UI · No direct API URL

Browser ↔ Next.js (httpOnly cookie)
→ POST /api/auth/login
← Set-Cookie: auth-token

Next.js: BFF + middleware
auth-token httpOnly cookie · /api/* route handlers · Bearer JWT on proxy

Next.js ↔ Express API (Authorization: Bearer)
→ GET /api/sites · POST scrape
← { sites } · { rssUrl }

Express API: Scrape + RSS
Site config & jobs · Scrape orchestration · RSS / XML generation

Session: auth-token · 7d · httpOnly

BFF surface: /api/auth/* · /api/sites/*

Output: RSS feeds · XML on disk

Features

JWT authentication

Email/password login; session stored in an httpOnly auth-token cookie (7 days). Client state via AuthContext and /api/auth/verify.
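As a rough illustration of the cookie described above, the sketch below serializes a Set-Cookie value for auth-token with HttpOnly and a seven-day Max-Age. Only the cookie name, the httpOnly flag, and the 7-day lifetime come from this write-up; the function name serializeAuthCookie and the Path, SameSite, and Secure attributes are assumptions chosen for the example.

```typescript
// Seven days in seconds, matching the session lifetime stated above.
const SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60;

interface AuthCookieOptions {
  secure: boolean; // assumption: enabled when served over HTTPS
  sameSite: "Strict" | "Lax" | "None"; // assumption: not stated in the write-up
}

// Hypothetical helper: builds the Set-Cookie header value for the session.
function serializeAuthCookie(
  token: string,
  options: AuthCookieOptions = { secure: true, sameSite: "Lax" },
): string {
  const parts = [
    `auth-token=${encodeURIComponent(token)}`,
    "Path=/", // assumption: cookie is sent for every route
    `Max-Age=${SEVEN_DAYS_SECONDS}`,
    "HttpOnly", // keeps the JWT unreadable from client-side JavaScript
    `SameSite=${options.sameSite}`,
  ];
  if (options.secure) {
    parts.push("Secure");
  }
  return parts.join("; ");
}
```

Because the cookie is HttpOnly, the dashboard cannot read the token itself, which is presumably why client state relies on a call to /api/auth/verify.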

Protected dashboard

Middleware gates every page except /login. Unauthenticated users are redirected to /login before any site data loads.
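The gating rule can be sketched as a small, framework-independent decision function. This is an assumption-laden outline rather than the project's middleware: gatePage and GateDecision are hypothetical names, the check only tests for the presence of the cookie (token validity would be confirmed elsewhere, for example through /api/auth/verify), and letting /api/* through so that route handlers do their own checks is an assumption.

```typescript
// Outcome of the gate: continue to the page, or redirect to sign-in.
type GateDecision =
  | { action: "next" }
  | { action: "redirect"; location: string };

// The only public page named in the write-up.
const PUBLIC_PAGES = new Set<string>(["/login"]);

// Hypothetical helper: decides what to do with a page request.
function gatePage(pathname: string, authToken: string | undefined): GateDecision {
  if (PUBLIC_PAGES.has(pathname)) {
    return { action: "next" };
  }
  // Assumption: API route handlers enforce auth themselves.
  if (pathname.startsWith("/api/")) {
    return { action: "next" };
  }
  // No session cookie: send the visitor to /login before any data loads.
  if (!authToken) {
    return { action: "redirect", location: "/login" };
  }
  return { action: "next" };
}
```

In a Next.js middleware, a function like this would typically be called with the request pathname and the auth-token cookie value, and its result mapped to a redirect or a pass-through response.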

Site management

List backend-configured scraping targets with status, metadata, and last-scraped timestamps.

On-demand scraping

Trigger per-site jobs from the UI and receive updated RSS feed URLs when the backend finishes.
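A minimal sketch of the client side of this flow, assuming the scrape route and the { rssUrl } response shape shown in the architecture summary. The helper names scrapeCall and parseScrapeResult are hypothetical, and the real response may carry more fields than this example reads.

```typescript
// Describes the same-origin request the dashboard would send.
interface ScrapeCall {
  url: string;
  init: { method: "POST"; credentials: "include" };
}

// Hypothetical helper: builds the request for POST /api/sites/[id]/scrape.
function scrapeCall(siteId: string): ScrapeCall {
  return {
    url: `/api/sites/${encodeURIComponent(siteId)}/scrape`,
    // credentials: include sends the httpOnly cookie with the request.
    init: { method: "POST", credentials: "include" },
  };
}

// Hypothetical helper: extracts the feed URL from the parsed JSON body.
function parseScrapeResult(body: unknown): string {
  if (typeof body === "object" && body !== null && "rssUrl" in body) {
    const value = (body as { rssUrl: unknown }).rssUrl;
    if (typeof value === "string") {
      return value;
    }
  }
  throw new Error("Scrape response did not include an rssUrl string");
}
```

The dashboard would pass url and init to fetch, parse the JSON body, and hand it to parseScrapeResult before showing the copyable feed URL.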

Feed preview

Expandable detail panels load recent items — title, link, date, description — without leaving the dashboard.

Privacy by default

robots.txt disallows crawlers; global X-Robots-Tag: noindex, nofollow; middleware blocks common bot user-agents.
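The bot check and the indexing header could look roughly like the sketch below. The user-agent pattern is an illustrative guess, not the project's actual list, and botGate is a hypothetical name; the 403 status and the X-Robots-Tag value are the ones stated in this write-up.

```typescript
// Hypothetical pattern list; the real middleware's list is not shown here.
const BOT_PATTERN = /bot|crawler|spider|scrapy|curl|wget/i;

// Header applied to every response so pages are never indexed.
const ROBOTS_HEADERS: Record<string, string> = {
  "X-Robots-Tag": "noindex, nofollow",
};

function isBlockedUserAgent(userAgent: string | null): boolean {
  // Assumption: a missing User-Agent is allowed through.
  if (!userAgent) {
    return false;
  }
  return BOT_PATTERN.test(userAgent);
}

// Hypothetical helper: returns the status and headers the middleware would use.
function botGate(userAgent: string | null): { status: number; headers: Record<string, string> } {
  return {
    status: isBlockedUserAgent(userAgent) ? 403 : 200,
    headers: ROBOTS_HEADERS,
  };
}
```

User-agent matching only deters well-behaved or unsophisticated crawlers, so it complements the authentication gate rather than replacing it.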

Pages & BFF routes

App Router pages for auth and operations; API route handlers proxy to the backend with Authorization: Bearer <token> read from the cookie.

Route	Access	Description
/	Protected	Landing dashboard: PEI Obituaries branding, links to Sites
/login	Public	Email/password sign-in
/sites	Protected	Site grid: scrape, RSS URL, feed item preview
/change-password	Protected	Update password with confirmation

POST /api/auth/login · logout · change-password
GET /api/auth/verify
GET /api/sites · POST /api/sites/[id]/scrape
GET /api/sites/[id]/feed-items
POST /api/scrape — quick RSS scrape for ad-hoc URLs
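The proxy step in these route handlers can be sketched as two pure helpers: one reads auth-token from the incoming Cookie header, the other builds the upstream URL and Authorization header. The names readCookie and buildProxyRequest are hypothetical, and the backend path layout is an assumption since only the BFF routes are listed here.

```typescript
// Reads a single cookie value from a raw Cookie header.
function readCookie(cookieHeader: string | null, name: string): string | undefined {
  if (!cookieHeader) {
    return undefined;
  }
  for (const pair of cookieHeader.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) {
      continue;
    }
    if (pair.slice(0, index).trim() === name) {
      return decodeURIComponent(pair.slice(index + 1).trim());
    }
  }
  return undefined;
}

interface ProxyRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: returns null when there is no session, so the
// caller can answer 401 without contacting the backend.
function buildProxyRequest(
  cookieHeader: string | null,
  backendUrl: string, // server-only value, e.g. read from BACKEND_URL
  path: string, // assumption: backend path layout is not documented here
): ProxyRequest | null {
  const token = readCookie(cookieHeader, "auth-token");
  if (!token) {
    return null;
  }
  return {
    url: `${backendUrl.replace(/\/+$/, "")}${path}`,
    headers: { Authorization: `Bearer ${token}`, Accept: "application/json" },
  };
}
```

Keeping the token lookup on the server is what lets the browser stay unaware of both the JWT and the backend address.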

Sample RSS output

Feeds are generated by the backend after a successful scrape; the dashboard surfaces URLs and in-app previews.

feed.xml

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>PEI Obituaries — Example source</title>
    <link>https://example.com/obituaries</link>
    <description>Recent items from a configured scrape target</description>
    <item>
      <title>Community remembers local educator</title>
      <link>https://example.com/obituaries/educator-may-2025</link>
      <pubDate>Tue, 27 May 2025 10:00:00 GMT</pubDate>
      <description>Summary text from the scraped listing page…</description>
    </item>
    <item>
      <title>Island family announces memorial service</title>
      <link>https://example.com/obituaries/memorial-may-2025</link>
      <pubDate>Mon, 26 May 2025 14:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>
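For readers curious how output of this shape might be produced, here is a minimal TypeScript sketch that renders a channel and its items as RSS 2.0, escaping text and omitting description when an item has none (as in the second item above). It is an illustration only: the backend's actual generator is not shown in this case study, and FeedItem, FeedChannel, and renderFeed are hypothetical names.

```typescript
interface FeedItem {
  title: string;
  link: string;
  pubDate: Date;
  description?: string; // optional, as in the sample feed
}

interface FeedChannel {
  title: string;
  link: string;
  description: string;
  items: FeedItem[];
}

// Escapes the five characters that are special in XML text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function renderItem(item: FeedItem): string {
  const lines = [
    "    <item>",
    `      <title>${escapeXml(item.title)}</title>`,
    `      <link>${escapeXml(item.link)}</link>`,
    // toUTCString yields the "Tue, 27 May 2025 10:00:00 GMT" form used above.
    `      <pubDate>${item.pubDate.toUTCString()}</pubDate>`,
  ];
  if (item.description !== undefined) {
    lines.push(`      <description>${escapeXml(item.description)}</description>`);
  }
  lines.push("    </item>");
  return lines.join("\n");
}

function renderFeed(channel: FeedChannel): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "  <channel>",
    `    <title>${escapeXml(channel.title)}</title>`,
    `    <link>${escapeXml(channel.link)}</link>`,
    `    <description>${escapeXml(channel.description)}</description>`,
    ...channel.items.map(renderItem),
    "  </channel>",
    "</rss>",
  ].join("\n");
}
```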

Security & ops

JWT in httpOnly cookies — not exposed to client-side storage
Middleware auth gate + bot user-agent blocking (403)
BACKEND_URL server-only; browser never talks to Express directly
Docker multi-stage image for production (Node 20, port 3000)
Password change flow for authenticated users
Quick scrape BFF endpoint for ad-hoc URL tooling

This is a private production project — no public repository. Happy to walk through the BFF pattern, auth flow, and dashboard UX on a call.

