Building an Automated Lead Generation Tool with Python: A Web Scraping Journey
In today’s digital age, finding business leads often means manually searching through websites and social media pages — a tedious and time-consuming process. Let me show you how I built Lead Hunter, a Python-based tool that automates this process by extracting valuable business information from websites and their associated Facebook pages.
Idea behind the product
Most sites that hold business information, such as Google Maps and directories like Yelp, don't list the contact info that is crucial for a lead-generation business. As a result, lead-generation teams have to spend a lot of manual effort and time collecting contact info from these websites.
The good thing is that Google Maps and other business directories do list the business website. So I designed a tool that takes the business website (domain) as input and scrapes its contact info, along with some other useful details.
The Technical Stack
Lead Hunter is built with Python, leveraging powerful libraries:
- Selenium for handling dynamic web content and some automation, which is required on Facebook
- BeautifulSoup for HTML parsing
- Apify platform for scalable web scraping and deployment
- Pandas for data organization
How It Works
The tool follows a straightforward workflow:
- Takes a list of business website URLs (domains) as input
- For each website:
  - Extracts business details (name, email, phone, address) from the homepage
  - Finds the associated Facebook page URL, if one is present in the source code of the homepage
  - Scrapes the same contact information from Facebook, plus some additional info not available on the website (like the business type)
  - Combines and validates the data based on the format of the scraped fields
- Exports structured results
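The article does not show how the format-based validation works. As a rough illustration for one field, here is a minimal sketch that pulls email-shaped strings out of page text and drops obvious false positives such as image file names. The function name `extract_emails`, the regular expression, and the suffix list are all assumptions made for this example, not code from Lead Hunter.

```python
import re

# Deliberately simple email pattern (an assumption, not a full RFC 5322 matcher)
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

# File names like "logo@2x.png" look like emails but are image assets
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


def extract_emails(text):
    """Return unique, lowercased email candidates in order of appearance."""
    found = []
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        if email.endswith(ASSET_SUFFIXES):
            continue  # skip asset file names that merely contain "@"
        if email not in found:
            found.append(email)
    return found
```

A format check like this only filters on shape; it cannot tell whether a mailbox actually exists.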
Here’s the core function that orchestrates this process:
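from apify import Actor

def process_url(url, driver):
    # fetch_url, get_fb_link and extract_info_from_fb are helpers defined elsewhere in the project
    Actor.log.info(f"Processing {url}")
    content = fetch_url(url)  # download the homepage HTML
    info = {'url': url}
    fb_link = get_fb_link(content)  # look for a Facebook page link in the homepage source
    if fb_link:
        # merge the fields scraped from the Facebook page into the result
        info_from_fb = extract_info_from_fb(fb_link, driver)
        info.update(info_from_fb)
    return info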
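The body of a helper like `get_fb_link` is not shown here, so the following is only a sketch of how such a lookup might work. It scans the anchors in the homepage HTML and returns the first link that points at a Facebook page rather than a share widget. To stay dependency-free it uses the standard library's `html.parser` instead of BeautifulSoup; the name `find_facebook_link` and the list of excluded path prefixes are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


# Paths used by share buttons and plugins, not by business pages (assumed list)
_NON_PAGE_PREFIXES = ("/sharer", "/share.php", "/plugins/", "/dialog/")


def find_facebook_link(html):
    """Return the first href that looks like a Facebook page, or None."""
    collector = _LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        try:
            parsed = urlparse(href)
        except ValueError:
            continue  # malformed URL, ignore it
        host = (parsed.hostname or "").lower()
        if host != "facebook.com" and not host.endswith(".facebook.com"):
            continue
        path = parsed.path.lower()
        if path.strip("/") and not path.startswith(_NON_PAGE_PREFIXES):
            return href
    return None
```

Skipping share and plugin URLs matters because many homepages embed a "share on Facebook" button even when the business has no page of its own.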
Sample Output
The tool generates a structured output with comprehensive business information. You can get the output in either CSV or JSON format from Apify. Here’s what the extracted data looks like:

Key Features
- Smart Data Extraction: The tool doesn’t just scrape — it intelligently combines data from both website and Facebook sources.
- Address Validation: Using AI, it validates and standardizes business addresses to ensure accuracy.
- Multiple Sources: The contact info is scraped from both the website and its associated Facebook page. This gives a more complete picture of the business than a single source would, and it lets the user choose between the Facebook value and the website value for a particular business. (Note: info from Facebook tends to be of higher quality.)
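One way to picture the multiple-sources feature is a merge that keeps both values for every field and also fills a preferred value, taking Facebook first because it tends to be more reliable. This is a sketch under assumptions: the function name `merge_sources`, the field names, and the `_website` / `_facebook` column suffixes are invented for the example and may not match the tool's real output schema.

```python
# Assumed field names; the real tool may use different ones
FIELDS = ("name", "email", "phone", "address")


def merge_sources(website_info, facebook_info, fields=FIELDS):
    """Keep both sources side by side and pick a preferred value per field."""
    merged = {}
    for field in fields:
        web_value = website_info.get(field) or None
        fb_value = facebook_info.get(field) or None
        merged[f"{field}_website"] = web_value
        merged[f"{field}_facebook"] = fb_value
        # Prefer Facebook, fall back to the website value
        merged[field] = fb_value or web_value
    return merged
```

Keeping the per-source columns means the user can still override the preferred value when the Facebook page is out of date.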
Challenges and Solutions
The main challenges were scraping addresses and phone numbers from websites without knowing the structure of each site in advance. Here is how each was addressed:
For Address
1. Dual Source Verification: We extract addresses from both the business website and Facebook page, as Facebook tends to have more standardized address formats.
2. AI-Powered Validation:
- Process multiple address candidates from the business website
- Use AI to identify valid address patterns among the matched ones
- Select the most likely legitimate business address
- Standardize the format for consistency
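The AI step itself is not described in detail, so the sketch below swaps in something much simpler to show the overall shape of the pipeline: a regular expression collects US-style address candidates from page text, whitespace is normalized, and the most frequently repeated candidate wins. The regex, the frequency vote, and the names `find_address_candidates` and `pick_address` are all assumptions for illustration; they are a stand-in for the AI validation, not a description of it.

```python
import re
from collections import Counter

# Rough US-style pattern: number, street name, street suffix, city, state, ZIP
US_ADDRESS_RE = re.compile(
    r"\b\d{1,6}\s+[A-Za-z0-9.\s]{2,40}?"
    r"\b(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Drive|Dr|Lane|Ln|Way|Court|Ct)\b\.?"
    r"[,\s]+[A-Za-z.\s]{2,30}?[,\s]+[A-Z]{2}\s+\d{5}(?:-\d{4})?\b"
)


def find_address_candidates(text):
    """Return every address-like match with whitespace collapsed."""
    return [" ".join(m.group(0).split()) for m in US_ADDRESS_RE.finditer(text)]


def pick_address(candidates):
    """Pick the candidate that appears most often (simple stand-in for the AI step)."""
    if not candidates:
        return None
    return Counter(candidates).most_common(1)[0][0]
```

The frequency vote relies on the observation that a business address often appears more than once on a page, for example in both the body and the footer; a model-based validator would replace `pick_address` here.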
For Phone Number
- The phonenumbers library is used to find phone numbers in the source code of the website. Once the library identifies a candidate on the page, its number-parsing functionality further validates it.
- Dual source verification: Numbers scraped from the website still include a significant number of false positives, whereas numbers scraped from Facebook have almost none.
Next Steps
- Adding navigation to Contact/About pages
- Improving text extraction quality
- Adding more data validation rules
- Adding support for more countries (currently the USA, Canada, and the UK)
Want to Try It?
Check out the tool here: lead hunter
This tool demonstrates how Python’s ecosystem can transform manual business processes into efficient automated workflows. Whether you’re in sales, marketing, or business development, having automated lead generation can give you a significant competitive advantage.