Introduction

Collecting job listings in one place saves job seekers a lot of manual browsing. In this article, I’ll walk you through how I built a job scraping application that extracts listings from Indeed.com using React.js, FastAPI, and SeleniumBase.

What You’ll Learn:

Building a RESTful API with FastAPI
Web scraping with SeleniumBase CDP mode
Creating a React frontend with real-time updates
Bypassing anti-bot detection systems
Implementing multi-format data export (CSV, Excel, JSON)

The Problem I Solved

Job hunting is time-consuming. Manually browsing through hundreds of listings, keeping track of companies, and organizing applications is tedious. I wanted to create a tool that:

✅ Automates job listing collection from Indeed.com

✅ Provides real-time analytics on job postings

✅ Exports data in multiple formats for easy analysis

✅ Bypasses anti-scraping mechanisms reliably

Technology Stack & Architecture

Backend: Python FastAPI

I chose FastAPI for several reasons:

Lightning-fast performance with async support
Automatic API documentation (Swagger UI)
Type safety with Pydantic models
Modern Python features (async/await)

Frontend: React.js

React provides:

Component-based architecture for maintainability
State management for real-time updates
Responsive UI for all devices
Rich ecosystem of libraries

Scraping Engine: SeleniumBase

SeleniumBase with CDP mode offers:

Anti-bot bypass capabilities
Reliable browser automation
Chrome DevTools Protocol for direct control
Undetected mode for stealth scraping

Architecture Diagram

User Interface (React)
        ↓
API Layer (FastAPI)
        ↓
Scraping Engine (SeleniumBase)
        ↓
Indeed.com
        ↓
Data Processing
        ↓
Export (CSV/Excel/JSON)

Key Features Implemented

1. Smart Job Scraping

The scraper intelligently navigates Indeed’s search results, handling pagination and extracting:

Job titles
Company names
Locations (including remote positions)
Direct job URLs
Posting dates
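To make the pagination step concrete, here is a minimal sketch of how the search URLs could be built for each results page. The names build_search_url and build_search_urls are hypothetical, and the `start` offset of 10 results per page is an assumption about Indeed's query string, not something taken from the project code.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.indeed.com/jobs"
RESULTS_PER_PAGE = 10  # assumption: the `start` offset advances by 10 per page


def build_search_url(job_title: str, location: str, page: int) -> str:
    """Return the search URL for a zero-based results page."""
    params = {"q": job_title, "l": location}
    if page > 0:
        # The first page needs no offset; later pages skip earlier results
        params["start"] = page * RESULTS_PER_PAGE
    return f"{BASE_URL}?{urlencode(params)}"


def build_search_urls(job_title: str, location: str, pages: int) -> list[str]:
    """Return one URL per requested results page."""
    return [build_search_url(job_title, location, page) for page in range(pages)]
```

Building the URLs up front keeps the scraping loop simple: it only has to open each URL and extract the cards.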

2. Anti-Bot Protection Bypass

Using SeleniumBase’s CDP mode, the scraper:

Runs in undetected mode
Mimics human behavior
Handles dynamic content loading
Manages rate limiting

Code Snippet:

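from seleniumbase import SB

# Example search URL (q = job title, l = location)
search_url = "https://www.indeed.com/jobs?q=software+engineer&l=Remote"

# uc=True starts Chrome in undetected mode
with SB(uc=True, headless=True) as sb:
    sb.open(search_url)
    # Wait for the job cards to load before collecting them
    sb.wait_for_element(".job_seen_beacon")
    jobs = sb.find_elements(".job_seen_beacon")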
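For the rate limiting and human-like pacing mentioned in the list above, one simple option is a randomized pause between page loads. This is a sketch under my own assumptions: polite_pause is a hypothetical helper, and the 2 to 5 second defaults are illustrative values, not tuned numbers from the project.

```python
import random
import time
from typing import Callable


def polite_pause(
    min_seconds: float = 2.0,
    max_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random duration between the two bounds and return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    # A random delay avoids the fixed request rhythm of a naive loop
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Calling polite_pause() once per results page is enough to slow the scraper down. The sleep parameter exists only so the function can be tested without actually waiting.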

3. Real-Time Analytics

The application provides instant insights:

Total jobs found
Number of unique companies
Unique locations
Top hiring companies
Share of valid job links (100% in my test runs)
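The counts above can be computed in a few lines once the jobs are in memory. The sketch below assumes each job is a dict with the same keys as the JobListing model (title, company, location, url); summarize_jobs is a hypothetical name.

```python
from collections import Counter


def summarize_jobs(jobs: list[dict], top_n: int = 5) -> dict:
    """Compute summary statistics for a list of job dicts."""
    # Count postings per company, skipping entries with no company name
    companies = Counter(job["company"] for job in jobs if job.get("company"))
    locations = {job["location"] for job in jobs if job.get("location")}
    return {
        "total_jobs": len(jobs),
        "unique_companies": len(companies),
        "unique_locations": len(locations),
        "top_companies": companies.most_common(top_n),
    }
```

Because the summary is derived from the scraped list itself, it can be returned in the same API response as the jobs and rendered immediately.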

4. Multi-Format Export

Users can download data in three formats:

CSV: Excel-compatible, perfect for spreadsheets
Excel: Native .xlsx format with preserved formatting
JSON: Developer-friendly, ideal for API integration
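As an illustration of the export step, the CSV and JSON formats can be produced with the Python standard library alone. This is a sketch, not the project's actual export code: jobs_to_csv and jobs_to_json are hypothetical names, and the column list mirrors the JobListing fields. The .xlsx export would need a third-party library such as openpyxl, so it is left out here.

```python
import csv
import io
import json

FIELDS = ["title", "company", "location", "url"]  # mirrors the JobListing fields


def jobs_to_csv(jobs: list[dict]) -> str:
    """Serialize jobs to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in FIELDS
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(jobs)
    return buffer.getvalue()


def jobs_to_json(jobs: list[dict]) -> str:
    """Serialize jobs to indented JSON, keeping non-ASCII characters readable."""
    return json.dumps(jobs, indent=2, ensure_ascii=False)
```

Returning strings rather than writing files keeps both functions easy to plug into an API response or a download button.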

Technical Implementation

Backend API Structure

main.py – FastAPI application with CORS configuration:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from models import ScrapeRequest  # request schema from models.py (import path assumed)

app = FastAPI(title="Indeed Job Scraper API")

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/scrape")
async def scrape_jobs(request: ScrapeRequest):
    # Scraping logic here
    pass
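One detail worth noting for the endpoint body: browser automation calls are blocking, so running them directly inside an async def handler would stall the event loop while a scrape is in progress. A minimal sketch of one way around this, using asyncio.to_thread from the standard library (Python 3.9+), is shown below. The scrape_indeed function here is a stand-in that sleeps instead of driving a browser, and run_scrape is a hypothetical helper name.

```python
import asyncio
import time


def scrape_indeed(job_title: str, location: str, pages: int) -> list[dict]:
    """Stand-in for the blocking scraper; sleeps instead of driving a browser."""
    time.sleep(0.1)
    return [
        {
            "title": job_title,
            "company": "Example Co",  # placeholder data for the sketch
            "location": location,
            "url": f"https://example.com/job/{page}",
        }
        for page in range(pages)
    ]


async def run_scrape(job_title: str, location: str, pages: int) -> dict:
    """Run the blocking scraper in a worker thread so the event loop stays free."""
    jobs = await asyncio.to_thread(scrape_indeed, job_title, location, pages)
    return {"count": len(jobs), "jobs": jobs}
```

Inside the endpoint this would be a single awaited call to run_scrape with the fields of the request. Declaring the endpoint with plain def is another option, since FastAPI runs non-async endpoints in a thread pool.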

scraper.py – Core scraping logic:

def scrape_indeed(job_title: str, location: str, pages: int):
    jobs = []
    
    for page in range(pages):
        # Navigate to search results
        # Extract job data
        # Validate links
        pass
    
    return jobs
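The "validate links" step in the loop above could look something like the following. This is a sketch of a purely structural check (scheme and host), with no HTTP request involved; the function names and the allowed host are my own assumptions for illustration.

```python
from urllib.parse import urlparse


def is_valid_job_url(url: str, allowed_host: str = "indeed.com") -> bool:
    """Return True if url is an absolute http(s) URL on the expected site."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    on_site = host == allowed_host or host.endswith("." + allowed_host)
    return parsed.scheme in ("http", "https") and on_site


def keep_valid_jobs(jobs: list[dict]) -> list[dict]:
    """Drop jobs with malformed URLs and de-duplicate by URL, preserving order."""
    seen: set[str] = set()
    valid = []
    for job in jobs:
        url = job.get("url", "")
        if is_valid_job_url(url) and url not in seen:
            seen.add(url)
            valid.append(job)
    return valid
```

De-duplicating by URL also helps the frontend, because the results table uses the job URL as the React key for each row.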

models.py – Pydantic data validation:

from pydantic import BaseModel

class ScrapeRequest(BaseModel):
    job_title: str
    location: str
    pages: int

class JobListing(BaseModel):
    title: str
    company: str
    location: str
    url: str

Frontend Component Structure

JobScraper.js – Search form:

import { useState } from 'react';
import api from './api'; // API client module (path assumed)

const JobScraper = () => {
  const [jobTitle, setJobTitle] = useState('');
  const [location, setLocation] = useState('');
  const [pages, setPages] = useState(1);
  
  const handleSubmit = async (e) => {
    e.preventDefault();
    const response = await api.scrapeJobs({
      job_title: jobTitle,
      location,
      pages
    });
    // Handle response
  };
  
  return (
    <form onSubmit={handleSubmit}>
      {/* Form fields */}
    </form>
  );
};

JobTable.js – Results display with filtering:

import { useState } from 'react';

const JobTable = ({ jobs }) => {
  const [filter, setFilter] = useState('');

  // Case-insensitive match on title or company
  const query = filter.toLowerCase();
  const filteredJobs = jobs.filter(job =>
    job.title.toLowerCase().includes(query) ||
    job.company.toLowerCase().includes(query)
  );

  return (
    <div>
      <input
        placeholder="Filter jobs..."
        value={filter}
        onChange={(e) => setFilter(e.target.value)}
      />
      <table>
        {/* Rows must sit inside <tbody> to be valid table markup */}
        <tbody>
          {filteredJobs.map(job => (
            <tr key={job.url}>
              <td>{job.title}</td>
              <td>{job.company}</td>
              <td>{job.location}</td>
            </tr>
          ))}
        </tbody>
      </table>
    </div>
  );
};

Challenges & Solutions

Challenge 1: Anti-Bot Detection

Problem: Indeed implements sophisticated bot detection

Solution: Used SeleniumBase CDP mode with undetected Chrome driver

Challenge 2: Dynamic Content Loading

Problem: Job listings load asynchronously

Solution: Implemented smart waiting strategies with explicit waits
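To show what an explicit wait does under the hood, here is a small generic polling helper. It is only a sketch of the idea: wait_until is a hypothetical function, and in the scraper itself the waiting is done through the browser automation library's own wait methods.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], T], timeout: float = 10.0, interval: float = 0.25) -> T:
    """Poll condition() until it returns a truthy value or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Short sleep between checks instead of one long fixed delay
        time.sleep(interval)
```

Compared with a fixed sleep, this returns as soon as the content is ready and fails loudly when it never appears.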

Challenge 3: Data Consistency

Problem: Inconsistent HTML structure across job listings

Solution: Created robust parsing with fallback mechanisms
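A fallback mechanism of this kind can be as simple as trying several selectors in order and using the first one that yields text. The sketch below is deliberately library-agnostic: find_text is any callable that returns the text for a selector or None, and the selector strings in the usage note are hypothetical.

```python
from typing import Callable, Optional, Sequence


def extract_with_fallback(
    find_text: Callable[[str], Optional[str]],
    selectors: Sequence[str],
    default: str = "N/A",
) -> str:
    """Return the first non-empty text found by trying selectors in order."""
    for selector in selectors:
        text = find_text(selector)
        # Treat missing and whitespace-only results the same way
        if text and text.strip():
            return text.strip()
    return default
```

For example, extract_with_fallback(lookup, ["span.company", "div.company-name"]) returns the company name from whichever selector matches first, and "N/A" when neither does, so one odd listing does not break the whole run.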

Challenge 4: Performance

Problem: Scraping multiple pages was slow

Solution: Optimized selectors and implemented efficient data extraction

Results & Performance

Scraping Speed: ~15-20 jobs per page in 5-7 seconds

Link Validation: 100% of extracted job links passed validation in my test runs

Reliability: CDP mode gave consistent results across those runs

Scalability: Can handle 1-10 pages per search

Analytics Provided:

Real-time job count
Company distribution
Location insights
Top hiring companies

Screenshots

Main Interface

Results Dashboard

Export Options

Installation & Usage

Quick Start

# Clone the repository and enter it
git clone https://github.com/seotanvirbd/indeed_fastapi_reactjs_scraper_app.git
cd indeed_fastapi_reactjs_scraper_app

# Backend setup
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python run.py

# Frontend setup (new terminal, from the repository root)
cd frontend
npm install
npm start

Usage Example

Enter job title: “Software Engineer”
Set location: “Remote”
Choose pages: 3
Click “START SCRAPING”
View results with analytics
Export in preferred format

Lessons Learned

Technical Insights

FastAPI handles I/O-bound work well – async endpoints keep the server responsive while requests wait on I/O
SeleniumBase CDP mode is powerful – Reliable bot detection bypass
React’s component model scales well – Easy to maintain and extend
Type safety matters – Pydantic caught many bugs during development

Development Practices

Start with MVP – Built core scraping first, added features incrementally
Test thoroughly – Multiple test runs on different searches
Handle errors gracefully – Robust error handling prevents crashes
Document as you go – Made future development easier

Ethical Considerations

Important: This tool is for educational and personal use only. When scraping:

✅ Respect robots.txt
✅ Implement rate limiting
✅ Review Terms of Service
✅ Use data responsibly
❌ Don’t overwhelm servers
❌ Don’t violate privacy

Conclusion

Building this full-stack job scraping application taught me valuable lessons about:

Modern API development with FastAPI
Advanced web scraping techniques
React state management
Bypassing anti-bot systems ethically
Creating user-friendly interfaces

The project demonstrates proficiency in:

Backend Development (Python, FastAPI, async programming)
Frontend Development (React, JavaScript, responsive design)
Web Scraping (SeleniumBase, CDP mode, anti-detection)
API Design (RESTful principles, documentation)
Data Processing (CSV, Excel, JSON export)

Resources & Links

GitHub Repository: https://github.com/seotanvirbd/indeed_fastapi_reactjs_scraper_app

Get In Touch

Found this helpful? Have questions or suggestions?

GitHub: @seotanvirbd
LinkedIn: https://www.linkedin.com/in/seotanvirbd/
Email: tanvirafra1@gmail.com

⭐ Star the repository if you find it useful!

Tags: #Python #FastAPI #React #WebScraping #SeleniumBase #FullStack #API #JobSearch #Automation #WebDevelopment

This article is part of my portfolio showcasing full-stack development skills. Check out my other projects on GitHub.

Building a Full-Stack Job Scraping Application: React, FastAPI & SeleniumBase
