๐ค AI Web Scraping Tool with Fact-Checking
Powered by DeepSeek V3 & DeepSeek R1 via OpenRouter
Extract and analyze web content using advanced AI, then fact-check the results for accuracy and reliability.
๐ How to Use the Fact-Checking Feature:
- First: Enter your API key, URL, and analysis query
- Second: Click "๐ Analyze Website" to get initial results
- Third: Click "๐ Fact-Check Results" to verify accuracy with DeepSeek R1
๐ฏ What the Fact-Checker Does:
Accuracy Verification
- Compares every claim in the analysis against the original source
- Identifies factual errors and misrepresentations
- Verifies numerical data and statistics
Completeness Assessment
- Checks if important information was missed
- Evaluates coverage of all relevant aspects
- Identifies gaps in the analysis
Context Verification
- Ensures information isn't taken out of context
- Verifies proper interpretation of source material
- Checks for misleading presentations
Quality Scoring
- Provides accuracy scores (1-10 scale)
- Lists verified vs. unverified claims
- Offers specific recommendations for improvement
๐งช Best Practices for Fact-Checking:
Ideal Test Cases:
URL: https://en.wikipedia.org/wiki/List_of_countries_by_population
Query: Create a table showing the top 10 most populous countries with their exact population figures
Perfect for fact-checking numerical accuracy
URL: https://www.who.int/news-room/fact-sheets
Query: Extract key health statistics and create a summary of global health metrics
Great for verifying official statistics
URL: https://finance.yahoo.com/quote/AAPL
Query: Extract Apple's current stock price, market cap, and financial metrics
Excellent for checking real-time financial data accuracy
๐ฏ Example Analysis Queries for Fact-Checking:
Data-Heavy Content
- "Extract all numerical data and organize it in a table format"
- "Create a comparison table of different countries' GDP figures"
- "List the top 10 items with their exact values from the source"
Statistical Information
- "Summarize key statistics with specific numbers and percentages"
- "Extract survey results and present the exact figures"
- "Create a timeline with specific dates and events"
Complex Analysis
- "Compare different viewpoints and cite specific quotes"
- "Extract cause-and-effect relationships mentioned in the article"
- "Summarize research findings with methodology details"
๐ What Gets Fact-Checked:
โ Verified Items:
- Exact quotes and citations
- Numerical data and statistics
- Dates, names, and factual claims
- Table data accuracy
- Mathematical calculations
โ ๏ธ Flagged Issues:
- Misquoted information
- Incorrect numbers or percentages
- Missing context or nuance
- Overgeneralized statements
- Unsupported conclusions
๐จ Red Flags the Fact-Checker Catches:
- Hallucinated Data: Information not present in the source
- Misattributed Quotes: Quotes assigned to wrong sources
- Mathematical Errors: Incorrect calculations or summaries
- Context Loss: Information presented without proper context
- Incomplete Extraction: Missing important details from tables
๐ก Tips for Better Fact-Checking:
- Use Specific Queries: More specific requests = better fact-checking
- Test with Known Data: Start with sites where you know the content
- Check Complex Tables: Tables are great for testing accuracy
- Verify Names & Dates: These are common error points
- Cross-Reference: Compare with multiple sources when possible
๐ฌ Advanced Fact-Checking Tests:
Financial Data Test
URL: https://finance.yahoo.com/quote/MSFT
Query: Create a detailed financial summary table with exact figures for Microsoft stock
Expected: Fact-checker should verify all numbers match the source exactly
Statistical Data Test
URL: https://www.census.gov/quickfacts/fact/table/US
Query: Extract US population demographics with specific percentages
Expected: Fact-checker should confirm all demographic percentages are accurate
Historical Data Test
URL: https://en.wikipedia.org/wiki/List_of_Presidents_of_the_United_States
Query: Create a table of the last 10 US presidents with their exact terms of office
Expected: Fact-checker should verify all dates and names are correct
๐งช Test Scenarios
1. News & Media Sites
URL: https://www.bbc.com/news
Query: Extract the top 5 news headlines with their summaries and create a table with columns: Headline, Category, Summary
URL: https://edition.cnn.com
Query: Find all breaking news items and organize them by topic/region in a structured format
2. Financial Data Sites
URL: https://finance.yahoo.com/quote/AAPL
Query: Extract Apple stock information including current price, daily change, market cap, and any financial metrics into a summary table
URL: https://www.marketwatch.com/investing/stock/tsla
Query: Create a table with Tesla's key financial metrics: price, change, volume, market cap, P/E ratio
3. E-commerce & Product Pages
URL: https://www.amazon.com/dp/B08N5WRWNW
Query: Extract product details including name, price, ratings, key features, and specifications in a structured format
URL: https://www.ebay.com/itm/123456789
Query: Extract item details, price, seller information, and shipping details into a comparison-ready table
4. Educational & Reference Sites
URL: https://en.wikipedia.org/wiki/Artificial_intelligence
Query: Extract the main definition, history timeline, and applications of AI. Create separate sections for each topic.
URL: https://en.wikipedia.org/wiki/List_of_countries_by_population
Query: Extract the population data table and create a new table showing top 10 most populous countries with their population and growth rate
5. Government & Official Statistics
URL: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports
Query: Extract the latest COVID-19 statistics and create a summary table with key global figures
URL: https://www.census.gov/quickfacts
Query: Extract key demographic statistics for the United States and organize them into categories: Population, Economy, Geography
6. Technology & Business News
URL: https://techcrunch.com
Query: Find the latest startup funding news and create a table with: Company Name, Funding Amount, Investors, Industry
URL: https://www.reuters.com/technology
Query: Extract top technology news and summarize each story in 2-3 sentences with key points
7. Scientific & Research Sites
URL: https://www.nature.com/articles
Query: Extract recent scientific article titles, authors, and abstracts. Create a summary table organized by research field
URL: https://pubmed.ncbi.nlm.nih.gov/trending
Query: Find trending medical research topics and create a list with brief descriptions of each study's findings
8. Sports & Entertainment
URL: https://www.espn.com/nba/standings
Query: Extract NBA team standings and create a table with: Team, Wins, Losses, Win Percentage, Conference Position
URL: https://www.imdb.com/chart/top
Query: Extract the top 10 movies from IMDb's top 250 list with ratings, year, and brief description
9. Weather & Environmental Data
URL: https://weather.com/weather/today
Query: Extract current weather conditions and forecast data. Create a summary with temperature, conditions, and weekly outlook
10. Real Estate & Property
URL: https://www.zillow.com/homes/for_sale
Query: Extract property listings with prices, locations, square footage, and key features into a comparison table
๐ฏ Quick Test Samples (Copy & Paste Ready)
Simple Test:
URL: https://httpbin.org/html
Query: Extract all text content and identify the page structure
Table Extraction Test:
URL: https://www.w3schools.com/html/html_tables.asp
Query: Find all HTML tables on this page and convert them to a structured format with proper headers
Complex Analysis Test:
URL: https://www.sec.gov/edgar/browse/?CIK=320193
Query: Extract Apple Inc.'s recent SEC filings and create a table with: Filing Date, Document Type, Description
International Site Test:
URL: https://www.bbc.co.uk/weather
Query: Extract UK weather information and create a regional breakdown of current conditions
๐ฏ Interpreting Fact-Check Results:
Accuracy Scores:
- 9-10: Highly accurate, minimal issues
- 7-8: Generally accurate with minor corrections needed
- 5-6: Moderate accuracy, several issues to address
- 3-4: Low accuracy, significant problems found
- 1-2: Poor accuracy, major fact-checking failures
Verification Status:
- โ VERIFIED: Claim matches source exactly
- โ ๏ธ PARTIALLY VERIFIED: Claim is mostly correct but lacks nuance
- โ CANNOT VERIFY: Claim not supported by source material
- ๐จ CONTRADICTED: Claim directly contradicts source
Remember: The fact-checker is designed to be thorough and critical. Even high-quality analyses may receive suggestions for improvement!